pinot icon indicating copy to clipboard operation
pinot copied to clipboard

Ensure upsert deletion consistency when enabled with compaction flow

Open tibrewalpratik17 opened this issue 1 year ago • 1 comments

Problem statement

In #12037, we added support for removing deleted keys and the validDocId for the deleted records after a TTL window. This ensured that deleted records would eventually be picked up by the compaction flow. In #13342, we directly enabled compaction to take care of deleted rows. [Reverted]

In both scenarios, we could end up in a situation where an older non-deleted record for a particular key is not compacted, but the deleted record is. During a server restart, when all segments are loaded, Pinot would incorrectly mark the record as non-deleted and start returning it as a valid primary key, leading to an inconsistent state in the table.

Example:

Consider a primary key PK1 with records in segments S0 and S1. In S1, the record is marked as deleted. If S1 gets compacted but S0 doesn't due to threshold reasons in the upsert-compaction flow, during a server restart, the upsert-metadata-manager map would incorrectly point PK1 to S0, even though it should be considered deleted for the end-user.

Why not use enablePreload? While enablePreload might prevent loading previous invalid records, it is not foolproof. Valid-doc-id snapshots are not always available due to scenarios like fetching segments from deepstore or upsert-compaction refreshing the segment. If a validDocID snapshot is missing and enablePreload is set to true, the segment loads normally, iterating through records and would potentially end-up reviving a deleted primary key with an old value with the present flow.

Proposal

This PR aims to maintain the state of PK -> distinct-segment-count. This means tracking the number of segments in which a record exists for a given primary key. If the count is <= 1, we can compact the deleted record, ensuring that all other records in other segments are removed.

Moved the proposal to this doc updated based on offline discussions as well.

cc @klsince @Jackie-Jiang raising this draft PR for early feedback on the proposal.

tibrewalpratik17 avatar Jun 10 '24 10:06 tibrewalpratik17

Codecov Report

Attention: Patch coverage is 81.59204% with 37 lines in your changes missing coverage. Please review.

Project coverage is 61.96%. Comparing base (59551e4) to head (e09c628). Report is 1557 commits behind head on master.

Files with missing lines Patch % Lines
...tionUpsertMetadataManagerForConsistentDeletes.java 82.03% 16 Missing and 14 partials :warning:
...psert/ConcurrentMapTableUpsertMetadataManager.java 33.33% 3 Missing and 1 partial :warning:
...he/pinot/segment/local/utils/TableConfigUtils.java 75.00% 1 Missing and 1 partial :warning:
...ache/pinot/segment/local/upsert/UpsertContext.java 88.88% 1 Missing :warning:
Additional details and impacted files
@@             Coverage Diff              @@
##             master   #13347      +/-   ##
============================================
+ Coverage     61.75%   61.96%   +0.21%     
+ Complexity      207      198       -9     
============================================
  Files          2436     2559     +123     
  Lines        133233   141419    +8186     
  Branches      20636    22010    +1374     
============================================
+ Hits          82274    87634    +5360     
- Misses        44911    47106    +2195     
- Partials       6048     6679     +631     
Flag Coverage Δ
custom-integration1 <0.01% <0.00%> (-0.01%) :arrow_down:
integration <0.01% <0.00%> (-0.01%) :arrow_down:
integration1 <0.01% <0.00%> (-0.01%) :arrow_down:
integration2 0.00% <0.00%> (ø)
java-11 61.91% <81.59%> (+0.20%) :arrow_up:
java-21 61.85% <81.59%> (+0.23%) :arrow_up:
skip-bytebuffers-false 61.95% <81.59%> (+0.20%) :arrow_up:
skip-bytebuffers-true 61.81% <81.59%> (+34.08%) :arrow_up:
temurin 61.96% <81.59%> (+0.21%) :arrow_up:
unittests 61.96% <81.59%> (+0.21%) :arrow_up:
unittests1 46.27% <1.49%> (-0.63%) :arrow_down:
unittests2 27.89% <80.09%> (+0.16%) :arrow_up:

Flags with carried forward coverage won't be shown. Click here to find out more.

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

codecov-commenter avatar Jun 10 '24 10:06 codecov-commenter

Good job! Can you also help update the pinot doc to reflect this new feature?

Jackie-Jiang avatar Aug 16 '24 00:08 Jackie-Jiang