Data Sync & Flows
Flows orchestrate data movement from Connectors into your databases. They handle scheduling, chunking, error recovery, and progress tracking.
How Flows Work
Connector → Fetch chunk → Upsert to destination → Save cursor → Next chunk

Each flow run (see the sketch after this list):
- Reads the last saved cursor for the entity
- Fetches the next chunk of records from the connector
- Upserts records into the destination database
- Saves the new cursor position
- Repeats until no more records
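A minimal sketch of this loop in TypeScript, using hypothetical `Connector`, `Destination`, and `CursorStore` interfaces rather than Mako's actual internals:

```ts
// Hypothetical interfaces -- illustrative only, not Mako's real API.
interface Chunk {
  records: Record<string, unknown>[];
  nextCursor: string | null; // null when there are no more records
}
interface Connector {
  fetchChunk(entity: string, cursor: string | null): Promise<Chunk>;
}
interface Destination {
  upsert(entity: string, records: Record<string, unknown>[]): Promise<void>;
}
interface CursorStore {
  get(entity: string): Promise<string | null>;
  set(entity: string, cursor: string): Promise<void>;
}

async function runFlow(
  entity: string,
  connector: Connector,
  destination: Destination,
  cursors: CursorStore
): Promise<void> {
  // Read the last saved cursor for the entity.
  let cursor = await cursors.get(entity);
  while (true) {
    // Fetch the next chunk of records from the connector.
    const { records, nextCursor } = await connector.fetchChunk(entity, cursor);
    if (records.length > 0) {
      // Upsert records into the destination database.
      await destination.upsert(entity, records);
    }
    if (nextCursor === null) break; // no more records
    // Save the new cursor position only after the chunk committed.
    await cursors.set(entity, nextCursor);
    cursor = nextCursor;
  }
}
```

Saving the cursor only after a chunk has been upserted is what makes resumption safe: a crash mid-chunk re-fetches and re-upserts that chunk, which is harmless because writes are idempotent (see Error Handling below).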
Change Data Capture (CDC) & Streaming
In addition to scheduled batch syncing, Mako supports experimental Change Data Capture (CDC) for near real-time updates.
- Streaming Sync — continuous event consumption via webhooks or log streams
- Backfills — historical backfills run within 1Gi Cloud Run memory limits, handling bulk flushes safely by cycling DuckDB instances (sketched after this list)
- BigQuery Staging — streams events into region-aligned BigQuery staging tables (safely preserved during recovery)
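One plausible reading of the DuckDB-cycling point, as a sketch only: tear down and recreate the in-memory DuckDB instance between bulk flushes, so buffers from one flush are released before the next and the process stays under the memory ceiling. The `duckdb` npm package usage is real; `flushToStaging` and the per-file loop are assumptions.

```ts
import duckdb from "duckdb";

// Hypothetical flush target -- stands in for the BigQuery staging write.
async function flushToStaging(rows: unknown[]): Promise<void> {
  // ... stream rows into the region-aligned staging table ...
}

// Process a backfill one source file at a time, cycling the DuckDB
// instance between flushes so memory stays bounded under ~1Gi.
async function backfill(parquetFiles: string[]): Promise<void> {
  for (const file of parquetFiles) {
    const db = new duckdb.Database(":memory:"); // fresh instance per chunk
    const rows = await new Promise<unknown[]>((resolve, reject) =>
      db.all(`SELECT * FROM read_parquet('${file}')`, (err, res) =>
        err ? reject(err) : resolve(res)
      )
    );
    await flushToStaging(rows);
    // Closing releases the instance's buffers before the next chunk.
    await new Promise<void>((resolve, reject) =>
      db.close((err) => (err ? reject(err) : resolve()))
    );
  }
}
```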
Schema Evolution (BigQuery)
When a connector's expected column types drift from the live BigQuery table (for example, a column created as `STRING` in a legacy run that should now be `TIMESTAMP`), Mako auto-corrects the drift before merging CDC events. This prevents merge failures caused by type mismatches.
For each drifted column, Mako runs a safe four-step swap (sketched after this list):
1. `ADD COLUMN` a temporary column with the expected type
2. `UPDATE` the temp column with `SAFE_CAST` of the existing values
3. `RENAME` the original column to a `_bak_*` backup and the temp column into its place (atomic)
4. `DROP` the backup column
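In BigQuery DDL terms the swap looks roughly like this (hypothetical table and column names; the statements Mako actually generates may differ):

```ts
import { BigQuery } from "@google-cloud/bigquery";

const bq = new BigQuery();

// Hypothetical drifted column: created as STRING, should be TIMESTAMP.
const table = "`analytics.orders`";
const col = "created_at";

async function swapColumnType(): Promise<void> {
  // 1. ADD COLUMN: temporary column with the expected type.
  await bq.query(`ALTER TABLE ${table} ADD COLUMN ${col}_tmp TIMESTAMP`);

  // 2. UPDATE: fill it with SAFE_CAST of the existing values
  //    (uncastable values become NULL instead of failing the query).
  await bq.query(
    `UPDATE ${table} SET ${col}_tmp = SAFE_CAST(${col} AS TIMESTAMP) WHERE TRUE`
  );

  // 3. RENAME: one statement moves the original aside as a _bak_*
  //    backup and renames the temp column into its place.
  await bq.query(
    `ALTER TABLE ${table}
       RENAME COLUMN ${col} TO _bak_${col},
       RENAME COLUMN ${col}_tmp TO ${col}`
  );

  // 4. DROP: remove the backup column once the swap has succeeded.
  await bq.query(`ALTER TABLE ${table} DROP COLUMN _bak_${col}`);
}
```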
Drift detection and correction are best-effort: if any step fails for a column, the merge falls back to a `SAFE_CAST` guard using the existing live type, so the sync still completes.
The console surfaces drift in the Backfill Panel with an auto-correction notice per affected entity. Under the hood this calls the `sync-cdc/schema-health` endpoint (see API Reference), which compares each live column's `data_type` from `INFORMATION_SCHEMA.COLUMNS` against the connector schema.
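The comparison can be pictured like this: a sketch that queries `INFORMATION_SCHEMA.COLUMNS` and diffs it against a hypothetical `expectedSchema` map (the real endpoint's shape is documented in the API Reference):

```ts
import { BigQuery } from "@google-cloud/bigquery";

const bq = new BigQuery();

// Hypothetical connector schema: column name -> expected BigQuery type.
const expectedSchema: Record<string, string> = {
  id: "STRING",
  created_at: "TIMESTAMP",
  amount: "NUMERIC",
};

// Returns the live columns whose data_type has drifted from the
// connector's expected type.
async function findDrift(dataset: string, table: string) {
  const [rows] = await bq.query({
    query: `
      SELECT column_name, data_type
      FROM \`${dataset}\`.INFORMATION_SCHEMA.COLUMNS
      WHERE table_name = @table`,
    params: { table },
  });
  return (rows as { column_name: string; data_type: string }[]).filter(
    (r) =>
      r.column_name in expectedSchema &&
      expectedSchema[r.column_name] !== r.data_type
  );
}
```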
Job Queue
Flows run on Inngest, a job queue that handles:
- Scheduled execution (cron-based)
- Automatic retries on failure
- Concurrency limits per workspace
- Progress tracking and logging
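Those four guarantees correspond to standard Inngest function options. A minimal sketch, with a hypothetical `sync/run` event name and app id (not Mako's actual function definitions):

```ts
import { Inngest } from "inngest";

const inngest = new Inngest({ id: "mako" }); // app id is illustrative

export const syncFlow = inngest.createFunction(
  {
    id: "sync-flow",
    retries: 4, // automatic retries, with backoff between attempts
    concurrency: {
      limit: 5,
      key: "event.data.workspaceId", // concurrency limit per workspace
    },
  },
  // An event trigger is shown here; a { cron: "..." } trigger gives
  // the scheduled (cron-based) execution described above.
  { event: "sync/run" },
  async ({ event, step }) => {
    // Each step.run is checkpointed, logged, and retried independently.
    await step.run("sync-entity", async () => {
      // ... fetch chunk, upsert, save cursor (see How Flows Work) ...
    });
  }
);
```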
The Inngest dev server runs locally at http://localhost:8288 during development.
You can trigger syncs from the command line:
```sh
# Run a specific sync
pnpm run sync --connector stripe --entity customers

# Run all syncs for a workspace
pnpm run sync --workspace <workspace-id>
```

Error Handling
Section titled “Error Handling”- Syncs are idempotent — re-running won’t create duplicates (upsert-based writes)
- The cursor is saved after each successful chunk, so a failed run resumes from the last checkpoint
- Failed syncs are retried automatically by Inngest with exponential backoff
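For the idempotency point, upsert-based writes typically reduce to a merge keyed on the record's primary key. A sketch against BigQuery, with hypothetical table and column names and simplified parameter handling:

```ts
import { BigQuery } from "@google-cloud/bigquery";

const bq = new BigQuery();

// MERGE updates rows that already exist and inserts the rest, so
// replaying the same chunk after a failure cannot create duplicates.
async function upsertChunk(rows: { id: string; email: string }[]): Promise<void> {
  await bq.query({
    query: `
      MERGE \`analytics.customers\` AS t
      USING UNNEST(@rows) AS s
      ON t.id = s.id
      WHEN MATCHED THEN UPDATE SET email = s.email
      WHEN NOT MATCHED THEN INSERT (id, email) VALUES (s.id, s.email)`,
    params: { rows },
  });
}
```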