beam icon indicating copy to clipboard operation
beam copied to clipboard

Initial Commit for AvroPayloadSerializer

Open talatuyarer opened this issue 3 years ago • 18 comments

Please add a meaningful description for your change here


Thank you for your contribution! Follow this checklist to help us incorporate your contribution quickly and easily:

  • [ ] Choose reviewer(s) and mention them in a comment (R: @username).
  • [ ] Mention the appropriate issue in your description (for example: addresses #123), if applicable. This will automatically add a link to the pull request in the issue. If you would like the issue to automatically close on merging the pull request, comment fixes #<ISSUE NUMBER> instead.
  • [ ] Update CHANGES.md with noteworthy changes.
  • [ ] If this contribution is large, please file an Apache Individual Contributor License Agreement.

See the Contributor Guide for more tips on how to make review process smoother.

To check the build health, please visit https://github.com/apache/beam/blob/master/.test-infra/BUILD_STATUS.md

GitHub Actions Tests Status (on master branch)

Build python source distribution and wheels Python tests Java tests

See CI.md for more information about GitHub Actions CI.

talatuyarer avatar Jul 19 '22 22:07 talatuyarer

R: @TheNeuralBit Could you review this ? This is initial commit. I will continue develop based on your feedbacks. Thanks

talatuyarer avatar Jul 19 '22 22:07 talatuyarer

Stopping reviewer notifications for this pull request: review requested by someone other than the bot, ceding control

github-actions[bot] avatar Jul 19 '22 22:07 github-actions[bot]

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Project coverage is 74.34%. Comparing base (d54841c) to head (bfa5999). Report is 6632 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff             @@
##           master   #22348      +/-   ##
==========================================
+ Coverage   74.24%   74.34%   +0.10%     
==========================================
  Files         702      703       +1     
  Lines       92999    93627     +628     
==========================================
+ Hits        69045    69610     +565     
- Misses      22687    22750      +63     
  Partials     1267     1267              
Flag Coverage Δ
python 83.67% <ø> (+0.06%) :arrow_up:

Flags with carried forward coverage won't be shown. Click here to find out more.

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

codecov[bot] avatar Jul 19 '22 22:07 codecov[bot]

I'd be interested to see the performance of this vs FastReaderBuilder.createDatumReader and wrapping the resulting object in a Row.

steveniemitz avatar Jul 20 '22 12:07 steveniemitz

I'd be interested to see the performance of this vs FastReaderBuilder.createDatumReader and wrapping the resulting object in a Row.

That documentation is sparse, is there more somewhere?

TheNeuralBit avatar Jul 20 '22 23:07 TheNeuralBit

I'd be interested to see the performance of this vs FastReaderBuilder.createDatumReader and wrapping the resulting object in a Row.

Thank you for reviewing my pr @steveniemitz Let me write a JMH test and share result. I will reply your other comments after JMH test result sounds good ?

talatuyarer avatar Jul 21 '22 00:07 talatuyarer

That documentation is sparse, is there more somewhere?

I'm not sure actually, I'd stumbled across it a year or two ago because I was doing something very similar to what's being done here. https://issues.apache.org/jira/browse/AVRO-2247 is the most detail I can find on the work being done to improve the performance.

I realize now after writing that comment though that this wasn't in the java avro library until 1.10, and beam is still on 1.8 (or maybe 1.9?), so it might not be particularly relevant anyways.

steveniemitz avatar Jul 21 '22 01:07 steveniemitz

That documentation is sparse, is there more somewhere?

I'm not sure actually, I'd stumbled across it a year or two ago because I was doing something very similar to what's being done here. https://issues.apache.org/jira/browse/AVRO-2247 is the most detail I can find on the work being done to improve the performance.

I realize now after writing that comment though that this wasn't in the java avro library until 1.10, and beam is still on 1.8 (or maybe 1.9?), so it might not be particularly relevant anyways.

Ah ok. If we run some benchmarks now and it looks good we could file an issue to consider switching to it when/if we upgrade Avro. For reference #19969 is the only avro upgrade issue I can find.

TheNeuralBit avatar Jul 21 '22 16:07 TheNeuralBit

This pull request has been marked as stale due to 60 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull request requires a review, please simply write any comment. If closed, you can revive the PR at any time and @mention a reviewer or discuss it on the [email protected] list. Thank you for your contributions.

github-actions[bot] avatar Jan 05 '23 12:01 github-actions[bot]

This pull request has been marked as stale due to 60 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull request requires a review, please simply write any comment. If closed, you can revive the PR at any time and @mention a reviewer or discuss it on the [email protected] list. Thank you for your contributions.

github-actions[bot] avatar Mar 07 '23 12:03 github-actions[bot]

This pull request has been marked as stale due to 60 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull request requires a review, please simply write any comment. If closed, you can revive the PR at any time and @mention a reviewer or discuss it on the [email protected] list. Thank you for your contributions.

github-actions[bot] avatar Jun 08 '23 12:06 github-actions[bot]

This pull request has been marked as stale due to 60 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull request requires a review, please simply write any comment. If closed, you can revive the PR at any time and @mention a reviewer or discuss it on the [email protected] list. Thank you for your contributions.

github-actions[bot] avatar Aug 09 '23 12:08 github-actions[bot]

This pull request has been marked as stale due to 60 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull request requires a review, please simply write any comment. If closed, you can revive the PR at any time and @mention a reviewer or discuss it on the [email protected] list. Thank you for your contributions.

github-actions[bot] avatar Oct 10 '23 12:10 github-actions[bot]

Good day, @talatuyarer. Thank you again for making this contribution! Would you like to still work on this PR or should I close it? We appreciate what you do for the Beam community!

damondouglas avatar Nov 16 '23 17:11 damondouglas

Hi @damondouglas Let me work on it. I want to merge this MR to BEAM. It makes Beam's AVRO serialization more than 3x faster.

talatuyarer avatar Nov 16 '23 18:11 talatuyarer

This pull request has been marked as stale due to 60 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull request requires a review, please simply write any comment. If closed, you can revive the PR at any time and @mention a reviewer or discuss it on the [email protected] list. Thank you for your contributions.

github-actions[bot] avatar Jan 16 '24 12:01 github-actions[bot]

This pull request has been marked as stale due to 60 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull request requires a review, please simply write any comment. If closed, you can revive the PR at any time and @mention a reviewer or discuss it on the [email protected] list. Thank you for your contributions.

github-actions[bot] avatar Mar 18 '24 12:03 github-actions[bot]

This pull request has been marked as stale due to 60 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull request requires a review, please simply write any comment. If closed, you can revive the PR at any time and @mention a reviewer or discuss it on the [email protected] list. Thank you for your contributions.

github-actions[bot] avatar May 18 '24 12:05 github-actions[bot]

This pull request has been marked as stale due to 60 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull request requires a review, please simply write any comment. If closed, you can revive the PR at any time and @mention a reviewer or discuss it on the [email protected] list. Thank you for your contributions.

github-actions[bot] avatar Jul 18 '24 12:07 github-actions[bot]

This pull request has been marked as stale due to 60 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull request requires a review, please simply write any comment. If closed, you can revive the PR at any time and @mention a reviewer or discuss it on the [email protected] list. Thank you for your contributions.

github-actions[bot] avatar Sep 17 '24 12:09 github-actions[bot]

This pull request has been closed due to lack of activity. If you think that is incorrect, or the pull request requires review, you can revive the PR at any time.

github-actions[bot] avatar Sep 24 '24 12:09 github-actions[bot]