Data Sync & Flows
Flows orchestrate data movement from Connectors into your databases. They handle scheduling, chunking, error recovery, and progress tracking.
How Flows Work
Connector → Fetch chunk → Upsert to destination → Save cursor → Next chunk
Each flow run:
- Reads the last saved cursor for the entity
- Fetches the next chunk of records from the connector
- Upserts records into the destination database
- Saves the new cursor position
- Repeats until no more records
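The loop above can be sketched roughly as follows. This is a minimal in-memory illustration, not Mako's actual implementation: the `Connector` interface, `MemoryDestination` class, `arrayConnector` helper, and the offset-style cursor are all assumptions made for the example.

```typescript
// Hypothetical shapes for illustration only; Mako's real connector and
// destination interfaces are not described in this document.
type Row = { id: string; [field: string]: unknown };
type Chunk = { records: Row[]; nextCursor: string };

interface Connector {
  fetchChunk(entity: string, cursor: string | null): Chunk;
}

class MemoryDestination {
  rows = new Map<string, Row>();
  cursors = new Map<string, string>();

  // Upsert keyed on id: writing the same record twice leaves one row.
  upsert(records: Row[]): void {
    for (const record of records) this.rows.set(record.id, record);
  }
}

// Array-backed connector whose cursor is simply the next offset (assumed).
function arrayConnector(data: Row[], chunkSize: number): Connector {
  return {
    fetchChunk(_entity, cursor) {
      const start = cursor === null ? 0 : Number(cursor);
      const records = data.slice(start, start + chunkSize);
      return { records, nextCursor: String(start + records.length) };
    },
  };
}

// Runs one flow for one entity and returns the number of chunks written.
function runFlow(connector: Connector, dest: MemoryDestination, entity: string): number {
  let cursor = dest.cursors.get(entity) ?? null; // read the last saved cursor
  let chunks = 0;
  while (true) {
    const chunk = connector.fetchChunk(entity, cursor); // fetch the next chunk
    if (chunk.records.length === 0) break; // stop when no records remain
    dest.upsert(chunk.records); // upsert into the destination
    cursor = chunk.nextCursor;
    dest.cursors.set(entity, cursor); // save the new cursor position
    chunks++;
  }
  return chunks;
}
```

Running `runFlow` a second time against the same destination starts from the saved cursor, so it fetches nothing new unless the source has grown.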
Change Data Capture (CDC) & Streaming
In addition to scheduled batch syncing, Mako supports experimental Change Data Capture (CDC) for near real-time updates.
- Streaming Sync: continuous event consumption via webhooks or log streams
- Backfills: historical backfills run within the 1Gi Cloud Run memory limit, handling bulk flushes by cycling DuckDB instances
- BigQuery Staging: streams events into region-aligned BigQuery staging tables, which are preserved during recovery
Schema Evolution (BigQuery)
When a connector’s expected column types drift from the live BigQuery table (for example, a column created as STRING in a legacy run that should now be TIMESTAMP), Mako auto-corrects the drift before merging CDC events. This prevents merge failures from type mismatches.
For each drifted column, Mako runs a safe four-step swap:
- ADD COLUMN a temporary column with the expected type
- UPDATE the temp column with SAFE_CAST of the existing values
- RENAME the original column to a _bak_* backup and the temp into its place (atomic)
- DROP the backup column
Drift detection and correction are best-effort: if any step fails for a column, the merge falls back to a SAFE_CAST guard using the existing live type so the sync still completes.
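As a rough illustration of the four-step swap, the sketch below builds one BigQuery statement per step for a single drifted column. The function name, the `_tmp_` prefix, and the exact `_bak_<column>` naming are assumptions; so is combining both renames in one `ALTER TABLE` statement, which may not match how Mako issues them.

```typescript
type DriftedColumn = { table: string; column: string; expectedType: string };

// Wraps an identifier in backticks for BigQuery.
const quote = (name: string): string => "`" + name + "`";

// Returns the ADD COLUMN, UPDATE, RENAME, and DROP statements in order.
function buildTypeSwap({ table, column, expectedType }: DriftedColumn): string[] {
  const temp = `_tmp_${column}`; // hypothetical temp column name
  const backup = `_bak_${column}`; // assumed expansion of the _bak_* pattern
  const t = quote(table);
  return [
    `ALTER TABLE ${t} ADD COLUMN ${quote(temp)} ${expectedType}`,
    // BigQuery requires a WHERE clause on UPDATE, hence WHERE TRUE.
    `UPDATE ${t} SET ${quote(temp)} = SAFE_CAST(${quote(column)} AS ${expectedType}) WHERE TRUE`,
    `ALTER TABLE ${t} RENAME COLUMN ${quote(column)} TO ${quote(backup)}, ` +
      `RENAME COLUMN ${quote(temp)} TO ${quote(column)}`,
    `ALTER TABLE ${t} DROP COLUMN ${quote(backup)}`,
  ];
}
```

For example, `buildTypeSwap({ table: "ds.customers", column: "created", expectedType: "TIMESTAMP" })` yields four statements that a caller could run in sequence, stopping and falling back to the SAFE_CAST guard if any of them fails.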
The console surfaces drift in the Backfill Panel with an auto-correction notice per affected entity. Under the hood, this calls the sync-cdc/schema-health endpoint (see API Reference), which compares each live column’s data_type from INFORMATION_SCHEMA.COLUMNS against the connector schema.
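The comparison itself might look something like the sketch below. It assumes the live columns arrive as `column_name` / `data_type` pairs (the names used by `INFORMATION_SCHEMA.COLUMNS`) and that the connector schema is a plain column-to-type map; the `findDrift` function and its return shape are hypothetical, not the endpoint's real response format.

```typescript
type LiveColumn = { column_name: string; data_type: string };
type Drift = { column: string; liveType: string; expectedType: string };

// Reports every live column whose type differs from the connector's expectation.
function findDrift(live: LiveColumn[], expected: Record<string, string>): Drift[] {
  const drift: Drift[] = [];
  for (const { column_name, data_type } of live) {
    const expectedType = expected[column_name];
    if (expectedType === undefined) continue; // column not in the connector schema
    if (data_type.toUpperCase() !== expectedType.toUpperCase()) {
      drift.push({ column: column_name, liveType: data_type, expectedType });
    }
  }
  return drift;
}
```

Columns that exist only in the live table are skipped here; whether Mako treats those as drift is not stated in this document.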
Destination Row Counts
The Backfill Panel shows destination row totals next to CDC progress. Counts are fetched lazily when the panel opens and when you click the refresh icon beside Destination rows; they do not poll continuously.
For BigQuery and PostgreSQL destinations, Mako batches all entity counts into a single metadata query and caches the result briefly. Missing destination tables are shown as 0 rows.
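The batching and caching behaviour described above can be approximated with a small time-based cache. This is only a sketch: `createRowCountCache`, the `fetchAll` callback standing in for the single metadata query, and the TTL value are assumptions, since the document does not say how long results are cached.

```typescript
type Counts = Record<string, number>;

// fetchAll stands in for one batched metadata query returning counts for
// every table that exists; now is injectable so the cache can be tested.
function createRowCountCache(
  fetchAll: () => Counts,
  ttlMs: number,
  now: () => number = Date.now,
) {
  let entry: { at: number; counts: Counts } | null = null;

  return function getCounts(entities: string[], forceRefresh = false): Counts {
    let current = entry;
    if (forceRefresh || current === null || now() - current.at >= ttlMs) {
      current = { at: now(), counts: fetchAll() }; // one query for all entities
      entry = current;
    }
    const result: Counts = {};
    // A table missing from the metadata result is reported as 0 rows.
    for (const name of entities) result[name] = current.counts[name] ?? 0;
    return result;
  };
}
```

Opening the panel would call `getCounts(entities)`, while the refresh icon would pass `forceRefresh = true` to bypass the cache.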
Job Queue
Flows run on Inngest, a job queue that handles:
- Scheduled execution (cron-based)
- Automatic retries on failure
- Concurrency limits per workspace
- Progress tracking and logging
The Inngest dev server runs locally at http://localhost:8288 during development.
You can trigger syncs from the command line:
# Run a specific sync
pnpm run sync --connector stripe --entity customers
# Run all syncs for a workspace
pnpm run sync --workspace <workspace-id>
Error Handling
- Syncs are idempotent: re-running won’t create duplicates (upsert-based writes)
- Cursor is saved after each successful chunk, so failures resume from the last checkpoint
- Failed syncs are retried automatically by Inngest with exponential backoff
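To show how these three properties interact, here is a small simulation of a sync that fails once mid-run and is retried. Everything in it is illustrative: `syncChunks`, `runWithRetries`, and the doubling delay schedule in `backoffDelayMs` are assumptions, and Inngest's actual retry timing is not described in this document.

```typescript
type SyncState = { cursor: number; rows: Map<string, string> };

// Delay before retry n (1-based): baseMs * 2^(n-1), capped at maxMs.
function backoffDelayMs(retry: number, baseMs = 1000, maxMs = 60000): number {
  return Math.min(maxMs, baseMs * 2 ** (retry - 1));
}

// Processes chunks from state.cursor onward. A chunk index listed in
// failing throws once, then succeeds on the next attempt.
function syncChunks(chunks: string[][], state: SyncState, failing: Set<number>): void {
  while (state.cursor < chunks.length) {
    const index = state.cursor;
    if (failing.has(index)) {
      failing.delete(index);
      throw new Error(`chunk ${index} failed`);
    }
    for (const id of chunks[index] ?? []) state.rows.set(id, id); // upsert by id
    state.cursor = index + 1; // checkpoint only after the chunk succeeds
  }
}

// Retries task up to maxRetries times and returns the delays a real
// runner would have waited; this sketch does not actually sleep.
function runWithRetries(task: () => void, maxRetries: number): number[] {
  const delays: number[] = [];
  for (let retry = 0; ; retry++) {
    try {
      task();
      return delays;
    } catch (err) {
      if (retry >= maxRetries) throw err;
      delays.push(backoffDelayMs(retry + 1));
    }
  }
}
```

Because the cursor only advances after a chunk is written, the retry picks up at the failed chunk rather than at the start, and the upsert keyed on id keeps any re-written rows from duplicating.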