beam icon indicating copy to clipboard operation
beam copied to clipboard

Refactor commit logic out of StreamingDataflowWorker

Open m-trieu opened this issue 1 year ago • 3 comments

StreamingDataflowWorker is a rather large class

For maintainability it would be helpful in the long run to encapsulate different pieces of the worker harness logic into their own class.


Thank you for your contribution! Follow this checklist to help us incorporate your contribution quickly and easily:

  • [ ] Mention the appropriate issue in your description (for example: addresses #123), if applicable. This will automatically add a link to the pull request in the issue. If you would like the issue to automatically close on merging the pull request, comment fixes #<ISSUE NUMBER> instead.
  • [ ] Update CHANGES.md with noteworthy changes.
  • [ ] If this contribution is large, please file an Apache Individual Contributor License Agreement.

See the Contributor Guide for more tips on how to make review process smoother.

To check the build health, please visit https://github.com/apache/beam/blob/master/.test-infra/BUILD_STATUS.md

GitHub Actions Tests Status (on master branch)

Build python source distribution and wheels Python tests Java tests Go tests

See CI.md for more information about GitHub Actions CI or the workflows README to see a list of phrases to trigger workflows.

m-trieu avatar Feb 14 '24 08:02 m-trieu

Checks are failing. Will not request review until checks are succeeding. If you'd like to override that behavior, comment assign set of reviewers

github-actions[bot] avatar Feb 14 '24 09:02 github-actions[bot]

R: @scwhittle still working on the tests but curious about initial thoughts. also opted to offload the thread management to executors instead of using raw threads. This may allow us to play around with the thread pool implementation to possibly increase performance (i.e work stealing thread pool).

Thanks!

m-trieu avatar Feb 15 '24 01:02 m-trieu

Stopping reviewer notifications for this pull request: review requested by someone other than the bot, ceding control

github-actions[bot] avatar Feb 15 '24 01:02 github-actions[bot]

@scwhittle ready for a look! thanks

m-trieu avatar Feb 21 '24 22:02 m-trieu

fyi today is release cut. What is the status of this PR? I see the comments are all replied? CC: @scwhittle @m-trieu

Abacn avatar Mar 06 '24 17:03 Abacn

There are still some open comments. This is part of work to support direct path which won't be ready for the cut. So I don't think this should hold up the cut.

scwhittle avatar Mar 06 '24 17:03 scwhittle

ready for another look @scwhittle thanks!

m-trieu avatar Mar 07 '24 18:03 m-trieu

presubmit failure is due to flaky test (working on a PR to fix) and not related.

m-trieu avatar Mar 08 '24 02:03 m-trieu

test flakiness attempted to be fixed here #30572

m-trieu avatar Mar 08 '24 08:03 m-trieu

@scwhittle any more comments here? Thank you!

m-trieu avatar Mar 12 '24 08:03 m-trieu

added drain (fail work and clear queues) when we close the SEWorkCommitter

will need this for direct path. in non direct path mode we would never call stop (except when we call stop on the entire harness in StreamingDataflowWorker) so it will not affect that code path. @scwhittle

m-trieu avatar Mar 12 '24 19:03 m-trieu

Run Java PreCommit

scwhittle avatar Mar 13 '24 14:03 scwhittle

test failures should be fixed in https://github.com/apache/beam/pull/30572 since it removes arbitrary waiting from the tests (replaced with countdownlatch as event triggers)

m-trieu avatar Mar 14 '24 06:03 m-trieu

The other PR is merged, woudl like to verify this passes now

scwhittle avatar Mar 14 '24 09:03 scwhittle

Run Java PreCommit

scwhittle avatar Mar 14 '24 09:03 scwhittle

wooh! looks good

m-trieu avatar Mar 14 '24 19:03 m-trieu