Postgres on Resumable full refresh
- Adapt to the RFR (resumable full refresh) CDK interface
- Create a state manager for RFR (final state handling)
An empty table saved a `streamState: null`, causing the next sync to fail.
Actually, even a table with a small number of records will first emit a null stream state. Not sure if that's also the case with mssql and mysql.
With xmin: the final state is not saved, so the next full refresh sync will read the last 10,000-record chunk over and over.
> An empty table saved a `streamState: null`, causing the next sync to fail.
> Actually, even a table with a small number of records will first emit a null stream state. Not sure if that's also the case with mssql and mysql.
That's because in Postgres the streamState stays null until we reach the first checkpoint. Not sure why that would cause a failure?
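To make the null case concrete, here is a minimal sketch of how a reader might treat a missing stream state as "start from the beginning" rather than as an error. The class, the method name, and the `(0,0)` starting ctid are all assumptions for illustration; none of them come from this PR or from the Airbyte CDK.

```java
// Hypothetical helper, not part of the Airbyte CDK.
final class NullStreamStateSketch {

    // Assumed lower bound for a ctid scan; chosen for illustration only.
    static final String START_OF_TABLE = "(0,0)";

    /**
     * Returns the ctid to resume from. A null or empty saved value means no
     * checkpoint was reached yet (empty or small table), so the scan starts over.
     */
    static String resumeCtid(String savedCtid) {
        if (savedCtid == null || savedCtid.isEmpty()) {
            return START_OF_TABLE;
        }
        return savedCtid;
    }
}
```

With this shape, `resumeCtid(null)` falls back to the start of the table, while a saved value such as `"(12,5)"` is returned unchanged.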
/publish-java-cdk
:clock2: https://github.com/airbytehq/airbyte/actions/runs/9023493316 :white_check_mark: Successfully published Java CDK version=0.34.2!
Hi @xiaohansong, I tried Postgres with CDC. Some streams had a ctid state and some had an empty cursor field, but after the sync failed and I started it again, it ran a full refresh from the beginning. Is this normal behavior?
@Hashcode-Ankit "Resumable full refresh" only applies across attempts within the same sync job. If the sync job has a 2nd attempt, it picks up from the previous checkpoint of a full refresh stream. If the user kicks off a new sync job, it starts the full refresh from the beginning, regardless of the previous sync's result.
If you do not want to start from the beginning, consider using incremental refresh instead!
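The rule above can be sketched roughly as follows. This is only an illustration of the described semantics, with hypothetical names (`Checkpoint`, `startingPosition`, the job id field); it is not the CDK's actual interface.

```java
import java.util.Optional;

// Hypothetical types and names; a sketch of the described behavior only.
final class ResumableFullRefreshSketch {

    /** Assumed checkpoint shape: which job wrote it and the last position emitted. */
    static final class Checkpoint {
        final long jobId;
        final String position;

        Checkpoint(long jobId, String position) {
            this.jobId = jobId;
            this.position = position;
        }
    }

    /**
     * Returns the position to resume from, or empty to start from the beginning.
     * A checkpoint is honored only when an earlier attempt of the same job wrote it.
     */
    static Optional<String> startingPosition(long currentJobId, Checkpoint saved) {
        if (saved == null || saved.position == null) {
            return Optional.empty(); // no checkpoint reached yet
        }
        if (saved.jobId != currentJobId) {
            return Optional.empty(); // new sync job: full refresh restarts
        }
        return Optional.of(saved.position); // later attempt of the same job: resume
    }
}
```

Under these assumptions, a second attempt of job 7 with a checkpoint written by job 7 resumes, while job 8 ignores that checkpoint and starts over.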
Hi @xiaohansong, I think what @Hashcode-Ankit means is that he's trying CDC with Postgres and this is the first sync. During the sync, some streams were fully loaded but their cursor fields are missing, the currently running stream has a CTID state, and I think the sync failed at that point.
When he ran the next sync with the same state, it restarted the full load for every stream, which it shouldn't have.