[Real-time Table Replication X clusters][1/n] Creating new tables with designated consuming segments
This is the first part of a series of changes aimed at enabling real-time table replication between two Pinot clusters.
The core purpose of this PR is to introduce the functionality needed to create a new real-time table when performing a table copy operation, with consuming-segment watermarks inducted from the source table's ZK metadata and a designated broker / server tenant.
The key changes introduced by the 3 commits are:
- Copy table with designated tenant and watermarks: Modifies the table copy mechanism to accept a designated tenant and per-partition watermarks.
- Create first consuming segments per partition: Introduces logic to create the initial consuming segment for each partition based on the provided watermarks.
- Induct the watermark of consuming status: Updates the consuming status and watermarks during the table copy process. This reuses the existing PinotHelixResourceManager API, which fetches the current consuming status per partition: it reads the table's IdealState and derives the last segment sequence number per partition by parsing the segment names.
  - If the segment status is `IN_PROGRESS`, we consider that segment the CONSUMING segment, so we copy its startOffset and sequence number as the start consuming position of that partition for the new table.
  - If the segment status is `DONE`, the consuming segment is the one after it, so we use the endOffset and sequence number + 1 to initialize the consuming segment for the new table.
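The per-partition watermark induction described above can be sketched as follows. This is an illustrative sketch only, under assumed types: `SegmentInfo` and `Watermark` are hypothetical stand-ins for this example, not actual Pinot classes.

```java
import java.util.HashMap;
import java.util.Map;

public class WatermarkInduction {
  enum Status { IN_PROGRESS, DONE }

  // Hypothetical summary of the latest segment per partition, as parsed
  // from the source table's IdealState and segment names.
  record SegmentInfo(int sequenceNumber, long startOffset, long endOffset, Status status) {}

  // Start consuming position for the new table's first consuming segment.
  record Watermark(int sequenceNumber, long startOffset) {}

  // Derive the consuming-segment watermark for each partition from the
  // latest segment in the source table.
  static Map<Integer, Watermark> induct(Map<Integer, SegmentInfo> latestPerPartition) {
    Map<Integer, Watermark> watermarks = new HashMap<>();
    for (Map.Entry<Integer, SegmentInfo> e : latestPerPartition.entrySet()) {
      SegmentInfo s = e.getValue();
      if (s.status() == Status.IN_PROGRESS) {
        // Latest segment is still consuming: reuse its sequence number
        // and startOffset as the start position for the new table.
        watermarks.put(e.getKey(), new Watermark(s.sequenceNumber(), s.startOffset()));
      } else {
        // Latest segment is sealed (DONE): the consuming segment is the
        // next one, so start from endOffset with sequence number + 1.
        watermarks.put(e.getKey(), new Watermark(s.sequenceNumber() + 1, s.endOffset()));
      }
    }
    return watermarks;
  }

  public static void main(String[] args) {
    Map<Integer, SegmentInfo> latest = Map.of(
        0, new SegmentInfo(5, 1000L, 1500L, Status.IN_PROGRESS),
        1, new SegmentInfo(3, 800L, 900L, Status.DONE));
    Map<Integer, Watermark> wm = induct(latest);
    System.out.println(wm.get(0).sequenceNumber() + " " + wm.get(0).startOffset());
    System.out.println(wm.get(1).sequenceNumber() + " " + wm.get(1).startOffset());
  }
}
```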
The following features are not supported at the moment:
- Upsert / offline table replication.
- Shallow copy of deep-store segments.
- Auto-pausing the table if the consuming segment's start offset has expired in the Kafka broker.
- Auto re-uploading segments if there is no deep-store URL in the segment metadata.
- Pauseless tables.
Test: We ran a manual test in Uber's Pinot cluster and confirmed that, when the latest segments in the source table are consuming segments, the new consuming segments in the target cluster copy the sequence number (rather than 0) and startOffset per partition, and that the consuming status is Active.
Codecov Report
:x: Patch coverage is 37.28814% with 111 lines in your changes missing coverage. Please review.
:white_check_mark: Project coverage is 63.23%. Comparing base (82b7d2a) to head (cb65334).
:warning: Report is 22 commits behind head on master.
Additional details and impacted files
```
@@ Coverage Diff @@
##            master   #17235    +/-  ##
============================================
- Coverage     63.25%   63.23%   -0.03%
- Complexity     1475     1477       +2
============================================
  Files          3162     3167       +5
  Lines        188668   189074     +406
  Branches      28869    28918      +49
============================================
+ Hits         119348   119566     +218
- Misses        60074    60235     +161
- Partials       9246     9273      +27
```
| Flag | Coverage Δ | |
|---|---|---|
| custom-integration1 | 100.00% <ø> (ø) | |
| integration | 100.00% <ø> (ø) | |
| integration1 | 100.00% <ø> (ø) | |
| integration2 | 0.00% <ø> (ø) | |
| java-11 | 63.20% <37.28%> (-0.02%) | :arrow_down: |
| java-21 | 63.21% <37.28%> (-0.02%) | :arrow_down: |
| temurin | 63.23% <37.28%> (-0.03%) | :arrow_down: |
| unittests | 63.23% <37.28%> (-0.03%) | :arrow_down: |
| unittests1 | 55.57% <100.00%> (-0.03%) | :arrow_down: |
| unittests2 | 34.03% <36.72%> (+0.03%) | :arrow_up: |
Flags with carried forward coverage won't be shown.
:umbrella: View full report in Codecov by Sentry.
@Jackie-Jiang @chenboat Could you take a look? The manual integration test is already done. The purpose of this PR is to copy the schema and table config from the source cluster to the target cluster, replacing the server / broker tenant, and then copy the consuming segments to kick off message consumption on the new table. The backfill of segments prior to the consuming segments will be implemented in a follow-up PR.
The unit test Set-1 fails on Java 11 but succeeds on Java 21. The error is a "selectionCombineOperator not found" issue, which is unrelated to table copy. I only did a rebase with no conflicts to resolve; previous tests were all successful.
Walked through the design doc with @abhishekbafna on 12/23 and aligned on the direction of the solution.
@abhishekbafna I added dryRun as a boolean flag in the payload. The reply contains the schema, the modified table config, and the watermarkResult as well. The write operation (adding the schema and table config) is moved to the end.
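For illustration, the request and dry-run response might look like the following sketch. Apart from `dryRun`, which this PR adds, every field name here is a hypothetical placeholder, not the actual API shape:

```json
{
  "request": {
    "dryRun": true,
    "sourceCluster": "source-cluster-placeholder",
    "targetTenant": { "broker": "BrokerTenantA", "server": "ServerTenantA" }
  },
  "response": {
    "schema": { "schemaName": "myTable" },
    "tableConfig": { "tableName": "myTable_REALTIME" },
    "watermarkResult": {
      "0": { "sequenceNumber": 5, "startOffset": 1000 },
      "1": { "sequenceNumber": 4, "startOffset": 900 }
    }
  }
}
```

With `dryRun=true`, no schema or table config is written; the caller can inspect the modified config and watermarks before committing.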
MSE against 1.3 failed, but it succeeded against 1.4 and master.
Thanks for contributing. LGTM, I'd suggest waiting for another stamp from @chenboat before shipping it!