feat(taps): Queue parent contexts and sync child streams only when the queue is full

Open edgarrmondragon opened this issue 7 months ago • 6 comments

Summary by Sourcery

Introduce batching of child stream synchronization by queueing parent contexts and flushing them when the queue reaches a configurable maximum or at the end of a parent sync.

New Features:

  • Enable delayed syncing of child streams by queuing parent contexts instead of immediate synchronization.

Enhancements:

  • Add a configurable QUEUE_MAX_SIZE attribute (default 1000) to control batch size of queued contexts.
  • Implement an internal _child_context_queue and a _flush_child_context_queue method to process queued contexts in bulk.
  • Modify _sync_children and _sync_records to enqueue contexts and trigger flushes when the queue is full or after parent sync ends.

Tests:

  • Adjust existing parent-child tests to override QUEUE_MAX_SIZE and verify batched flushing behavior.
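The queue-and-flush behavior summarized above can be illustrated with a small, self-contained sketch. This is not the code from singer_sdk/streams/core.py: `SketchStream`, `sync_batch`, `synced_batches`, and the `parent_id` context key are hypothetical names invented for illustration. Only `QUEUE_MAX_SIZE`, `_child_context_queue`, `_sync_children`, `_flush_child_context_queue`, and `_sync_records` come from the summary, and their signatures here are assumptions.

```python
from __future__ import annotations

from typing import Any

Context = dict[str, Any]


class SketchStream:
    """Toy stand-in for a stream (hypothetical, not singer_sdk.Stream)."""

    QUEUE_MAX_SIZE = 1000  # default value named in the summary

    def __init__(self, name: str) -> None:
        self.name = name
        self.child_streams: list[SketchStream] = []
        self._child_context_queue: list[Context] = []
        # Hypothetical bookkeeping so the batches can be inspected afterwards.
        self.synced_batches: list[list[Context]] = []

    def sync_batch(self, contexts: list[Context]) -> None:
        """Hypothetical hook: record the contexts this child was asked to sync."""
        self.synced_batches.append(list(contexts))

    def _sync_children(self, child_context: Context) -> None:
        """Enqueue the context instead of syncing children immediately."""
        self._child_context_queue.append(child_context)
        if len(self._child_context_queue) >= self.QUEUE_MAX_SIZE:
            self._flush_child_context_queue()

    def _flush_child_context_queue(self) -> None:
        """Hand all queued contexts to every child stream, then clear the queue."""
        if not self._child_context_queue:
            return  # assumption: flushing an empty queue is a no-op
        for child in self.child_streams:
            child.sync_batch(self._child_context_queue)
        self._child_context_queue.clear()

    def _sync_records(self, parent_records: list[Context]) -> None:
        """Walk the parent records, queue one context each, flush at the end."""
        for record in parent_records:
            self._sync_children({"parent_id": record["id"]})
        self._flush_child_context_queue()  # final flush after the parent sync


class SmallQueueParent(SketchStream):
    QUEUE_MAX_SIZE = 2  # overridden, in the spirit of the adjusted tests


parent = SmallQueueParent("parent")
child = SketchStream("child")
parent.child_streams.append(child)
parent._sync_records([{"id": i} for i in range(5)])

batch_sizes = [len(batch) for batch in child.synced_batches]
print(batch_sizes)  # [2, 2, 1]
```

Overriding the class attribute in a subclass keeps the batch size small enough to observe both the "queue is full" flush and the final flush with only a handful of records.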

edgarrmondragon avatar May 23 '25 18:05 edgarrmondragon

Reviewer's Guide

Implements batched child-stream synchronization by queuing parent contexts in a list and flushing them in bulk when the maximum queue size is reached or after the parent stream finishes syncing.

Sequence diagram for batched child stream synchronization

sequenceDiagram
    participant ParentStream
    participant ChildContextQueue
    participant ChildStream
    loop For each parent context
        ParentStream->>ChildContextQueue: Add context to queue
        alt Queue size >= QUEUE_MAX_SIZE
            ParentStream->>ChildContextQueue: Flush queue
            ChildContextQueue->>ChildStream: Sync all queued contexts
            ChildContextQueue->>ChildContextQueue: Clear queue
        end
    end
    ParentStream->>ChildContextQueue: Flush any remaining contexts (after parent sync)
    ChildContextQueue->>ChildStream: Sync all remaining contexts
    ChildContextQueue->>ChildContextQueue: Clear queue
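As a rough aid for reading the diagram, the flush sizes it implies can be worked out arithmetically. The helper below is hypothetical (`flush_batch_sizes` is not part of the SDK) and assumes that the final flush is skipped when no contexts remain in the queue.

```python
from __future__ import annotations


def flush_batch_sizes(total_contexts: int, queue_max_size: int) -> list[int]:
    """Return the size of each flush for a given number of parent contexts.

    Full batches are flushed whenever the queue reaches queue_max_size;
    any remainder is flushed once after the parent sync ends.
    """
    if queue_max_size < 1:
        raise ValueError("queue_max_size must be at least 1")
    full_batches, remainder = divmod(total_contexts, queue_max_size)
    sizes = [queue_max_size] * full_batches
    if remainder:
        sizes.append(remainder)
    return sizes


print(flush_batch_sizes(2500, 1000))  # [1000, 1000, 500]
print(flush_batch_sizes(2000, 1000))  # [1000, 1000]
```

With the default of 1000, a parent stream producing 2500 contexts would trigger two full flushes inside the loop and one final flush of 500 after the loop.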

Class diagram for updated Stream batching logic

classDiagram
    class Stream {
        +int QUEUE_MAX_SIZE
        +list _child_context_queue
        +_sync_children(child_context)
        +_flush_child_context_queue()
    }
    Stream --> "*" Stream : child_streams

File-Level Changes

Change: Introduce queue size limit and context queue for batching child syncs
Details:
  • Define a new QUEUE_MAX_SIZE class attribute
  • Initialize a list-based _child_context_queue in Stream.__init__
Files: singer_sdk/streams/core.py

Change: Enqueue child contexts instead of syncing immediately
Details:
  • Replace direct child_sync loop in _sync_children
  • Append child_context to queue and check against QUEUE_MAX_SIZE
Files: singer_sdk/streams/core.py

Change: Add a flush method to process queued contexts
Details:
  • Implement _flush_child_context_queue to log and sync batches
  • Clear the queue after processing
Files: singer_sdk/streams/core.py

Change: Ensure final flush after parent sync
Details:
  • Invoke _flush_child_context_queue at the end of _sync_records
Files: singer_sdk/streams/core.py

Change: Adjust tests to reflect batched syncing
Details:
  • Override QUEUE_MAX_SIZE in test Parent class
  • Regenerate snapshots for batched behavior
Files: tests/core/test_parent_child.py, tests/core/snapshots/test_parent_child/*

sourcery-ai[bot] avatar May 23 '25 18:05 sourcery-ai[bot]

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 93.82%. Comparing base (e6c0123) to head (6c86fe7).
⚠️ Report is 1 commit behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #3058      +/-   ##
==========================================
+ Coverage   93.80%   93.82%   +0.01%     
==========================================
  Files          69       69              
  Lines        5778     5794      +16     
  Branches      718      721       +3     
==========================================
+ Hits         5420     5436      +16     
  Misses        254      254              
  Partials      104      104              
Flag                  Coverage Δ
core                  81.73% <100.00%> (+0.10%) ⬆️
end-to-end            76.28% <36.84%> (-0.10%) ⬇️
optional-components   43.38% <15.78%> (-0.07%) ⬇️

codecov[bot] avatar May 23 '25 18:05 codecov[bot]

CodSpeed Performance Report

Merging #3058 will not alter performance

Comparing queue-child-streams (6c86fe7) with main (d8afc48)[^unexpected-base]

[^unexpected-base]: No successful run was found on main (e6c0123) during the generation of this report, so d8afc48 was used instead as the comparison base. There might be some changes unrelated to this pull request in this report.

Summary

✅ 8 untouched

codspeed-hq[bot] avatar May 23 '25 18:05 codspeed-hq[bot]

@sourcery-ai review

edgarrmondragon avatar Jun 10 '25 21:06 edgarrmondragon

@sourcery-ai review

edgarrmondragon avatar Jul 01 '25 00:07 edgarrmondragon

Documentation build overview

📚 Meltano SDK | 🛠️ Build #30357529 | 📁 Comparing 6c86fe775b9d584645f93e809bf26cf4cab82e65 against latest (e6c0123a7fbe031ffcf6ee318091bf5cf16a6509)


🔍 Preview build

Show files changed (2 files in total): 📝 2 modified | ➕ 0 added | ➖ 0 deleted
File Status
genindex.html 📝 modified
classes/singer_sdk.Stream.html 📝 modified