[airbyte-cdk] 🐛 Allow usage of GlobalStateCursor for RFR substreams that support incremental_dependency
https://github.com/airbytehq/oncall/issues/6464
What
For the affected connection, the sync was failing due to a heartbeat timeout. After investigating, my hypothesis is that three factors combined to stop the sync from making progress: a very large number of parent records, very long gaps between parent records that have children, and the growing size of the state slowing down the sync.
How
We can unblock certain types of RFR substreams that have a high volume of parent records if the parent is incremental. An incremental parent whose records are updated when its children change can use a global state cursor instead of per-parent partition success tracking. That significantly reduces the size of the state message, which would otherwise keep growing with every parent partition, and allows the sync to progress before the heartbeat times out.
For this custom API endpoint we can't control the large gaps between parents that have children, since that depends on the API.
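To illustrate why the state size matters, here is a rough sketch comparing a per-partition state with a single global cursor. The state shapes, field names (`parent_id`, `updated_at`) and helper functions are hypothetical and chosen for illustration only; they are not the CDK's actual serialized state format.

```python
import json


def per_partition_state(parent_ids, cursor_value):
    """Hypothetical per-partition state: one entry for every parent record."""
    return {
        "states": [
            {
                "partition": {"parent_id": parent_id},
                "cursor": {"updated_at": cursor_value},
            }
            for parent_id in parent_ids
        ]
    }


def global_state(cursor_value):
    """Hypothetical global state: a single cursor shared by all partitions."""
    return {"state": {"updated_at": cursor_value}}


def serialized_size(state):
    """Number of characters in the JSON-serialized state."""
    return len(json.dumps(state))


cursor = "2024-01-01T00:00:00Z"  # illustrative cursor value

for parent_count in (10, 1_000, 100_000):
    per_partition = serialized_size(per_partition_state(range(parent_count), cursor))
    single = serialized_size(global_state(cursor))
    print(f"{parent_count} parents: per-partition={per_partition} global={single}")
```

In this sketch the per-partition state grows linearly with the number of parents, while the global state stays the same size no matter how many parents are read.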
Review guide
- model_to_component_factory.py
- substream_partition_router.py
User Impact
Should be none. incremental_dependency doesn't have any usage in our repos, last I checked.
Can this PR be safely reverted and rolled back?
- [x] YES 💚
- [ ] NO ❌