[Do not Merge] [WIP] Combine.perKeyWithBucketing: Combiner for reducing key cardinality
Combine.perKeyWithBucketing(childCombiner, numBuckets) applies the child
combiner to the PCollection using numBuckets intermediate keys.
This is a POC; I'm sending it now to share it and get early feedback.
TODO: Add tests, pick better names.
Example usage
input.apply(Combine.perKeyWithBucketing(yourCombineFn, desiredNumKeys))
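To make the idea easier to discuss, here is a rough plain-Java sketch of one possible reading of "bucketing": each original key is hashed to one of numBuckets intermediate keys, the child combiner runs per original key inside each bucket, and the buckets are flattened back into a per-key result. This is only an illustration and not the code in this PR. It uses no Beam APIs, and the names BucketedCombineSketch, bucketOf and combinePerKeyWithBucketing (as a static method) are hypothetical, as is the use of a BinaryOperator in place of a CombineFn.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BinaryOperator;

// Illustrative only: simulates bucketed combining on an in-memory list.
// Assumption: "bucketing" means hashing original keys onto numBuckets intermediate keys.
class BucketedCombineSketch {

  // Hypothetical helper: maps an original key to one of numBuckets intermediate keys.
  static <K> int bucketOf(K key, int numBuckets) {
    return Math.floorMod(key.hashCode(), numBuckets);
  }

  static <K, V> Map<K, V> combinePerKeyWithBucketing(
      List<Map.Entry<K, V>> input, BinaryOperator<V> childCombiner, int numBuckets) {
    if (numBuckets <= 0) {
      throw new IllegalArgumentException("numBuckets must be positive");
    }

    // Stage 1: group by intermediate key. Each bucket keeps one partial result
    // per original key, built with the child combiner.
    List<Map<K, V>> buckets = new ArrayList<>();
    for (int i = 0; i < numBuckets; i++) {
      buckets.add(new HashMap<>());
    }
    for (Map.Entry<K, V> element : input) {
      Map<K, V> bucket = buckets.get(bucketOf(element.getKey(), numBuckets));
      bucket.merge(element.getKey(), element.getValue(), childCombiner);
    }

    // Stage 2: flatten the buckets. A key always lands in the same bucket,
    // so no two buckets share an original key.
    Map<K, V> result = new HashMap<>();
    for (Map<K, V> bucket : buckets) {
      result.putAll(bucket);
    }
    return result;
  }

  public static void main(String[] args) {
    List<Map.Entry<String, Integer>> input =
        List.of(Map.entry("a", 1), Map.entry("b", 2), Map.entry("a", 3), Map.entry("c", 4));
    // Sum per key, using 2 intermediate keys.
    System.out.println(combinePerKeyWithBucketing(input, Integer::sum, 2));
  }
}
```

In this sketch the final per-key values do not depend on numBuckets; only the number of intermediate groups changes, which is the knob the proposed transform is meant to expose.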
Codecov Report
All modified and coverable lines are covered by tests :white_check_mark:
Project coverage is 58.93%. Comparing base (1e27978) to head (768981a). Report is 386 commits behind head on master.
Additional details and impacted files
@@ Coverage Diff @@
## master #32831 +/- ##
============================================
+ Coverage 57.41% 58.93% +1.52%
- Complexity 1475 3102 +1627
============================================
Files 968 1131 +163
Lines 154224 174643 +20419
Branches 1076 3330 +2254
============================================
+ Hits 88546 102931 +14385
- Misses 63477 68373 +4896
- Partials 2201 3339 +1138
| Flag | Coverage Δ | |
|---|---|---|
| java | 69.91% <ø> (+1.33%) | :arrow_up: |
Closing this in favor of https://github.com/apache/beam/pull/33318