pinot
pinot copied to clipboard
Minion Task to support automatic Segment Refresh
Currently, when new columns are added or indexes are added/removed, the segment reloads happen on the server. There are a number of issues with this approach:
- Increased startup times for Pinot Server hosts. Servers have to reload segments (generating indexes, columns) everytime at server startup. This is particularly exacerbated for Upsert tables. cc: @tibrewalpratik17 @ankitsultana
- The server reload compute cost is paid on each server when indexes/colums are added. This leads to over-provisioning of servers to account for this compute cost.
- Reload on servers when queries are being processed affects latencies.
- Takes a long time to reload all segments (default value of 1 segment at a time). Increasing the concurrency affects query latencies.
- The segment on the deepstore never contains the new indexes/columns. So the segment in deepstore is at divergence from the server (making it not ideal for disaster recovery).
This PR creates a minion task to automatically refresh segments when there are index/column updates to table config/schema. It can support automatic refresh for the following operations:
- Adding/Removing indexes
- Adding columns
- Changing compatible datatypes.
- Converting segment versions
Followup Work:
- When there are table config/schema updates, we can validate if the datatype changes for columns are compatible. We can allow compatible updates.
- Schedule the SegmentRefresh tasks when there are tableconfig/schema updates rather than waiting for the next iteration of periodic job.
Tested using integration tests.
@vvivekiyer : the idea is quite interesting and the value add I see here is:
- We can increase concurrency for segment refresh thereby reducing the total time to reload all segments
- Deepstore link can be updated with the new segment
- Possible perf improvements due to less work done in servers, but I guess we need to test this out.
This is particularly exacerbated for Upsert tables
But for Upserts I think one of the biggest cost is recomputing the validDocId map, so for Upsert tables we won't see any specific benefits right? (outside of the ones which are applicable for Realtime tables too).
Codecov Report
Attention: Patch coverage is 3.55030% with 163 lines in your changes missing coverage. Please review.
Project coverage is 63.71%. Comparing base (
59551e4) to head (0679ba3). Report is 1362 commits behind head on master.
Additional details and impacted files
@@ Coverage Diff @@
## master #14300 +/- ##
============================================
+ Coverage 61.75% 63.71% +1.96%
- Complexity 207 1567 +1360
============================================
Files 2436 2672 +236
Lines 133233 146635 +13402
Branches 20636 22487 +1851
============================================
+ Hits 82274 93435 +11161
- Misses 44911 46281 +1370
- Partials 6048 6919 +871
| Flag | Coverage Δ | |
|---|---|---|
| custom-integration1 | 100.00% <ø> (+99.99%) |
:arrow_up: |
| integration | 100.00% <ø> (+99.99%) |
:arrow_up: |
| integration1 | 100.00% <ø> (+99.99%) |
:arrow_up: |
| integration2 | 0.00% <ø> (ø) |
|
| java-11 | 63.66% <3.55%> (+1.95%) |
:arrow_up: |
| java-21 | 63.60% <3.55%> (+1.97%) |
:arrow_up: |
| skip-bytebuffers-false | 63.69% <3.55%> (+1.94%) |
:arrow_up: |
| skip-bytebuffers-true | 63.56% <3.55%> (+35.83%) |
:arrow_up: |
| temurin | 63.71% <3.55%> (+1.96%) |
:arrow_up: |
| unittests | 63.71% <3.55%> (+1.96%) |
:arrow_up: |
| unittests1 | 55.56% <0.00%> (+8.67%) |
:arrow_up: |
| unittests2 | 34.05% <3.55%> (+6.32%) |
:arrow_up: |
Flags with carried forward coverage won't be shown. Click here to find out more.
:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.
🚨 Try these New Features:
- Flaky Tests Detection - Detect and resolve failed and flaky tests
- JS Bundle Analysis - Avoid shipping oversized bundles
We can increase concurrency for segment refresh thereby reducing the total time to reload all segments
Are you suggesting an increase in concurrency at the minion level or the server level? At the server level, it seems we would still issue a SegmentRefreshTask, which means the default concurrency would remain at 1. We can investigate performance improvements that might allow us to adjust the concurrency configuration.
Overall, this appears to be a valuable feature to reduce index build time and associated costs for servers! However, we need to consider the trade-off between SegmentRefresh and SegmentReload costs. Ultimately, we would still issue a SegmentRefresh call to the servers, if I understand correctly. For upsert tables with snapshot enabled, we risk losing validDocIDSnapshot during downloads from deep store since deep store lacks snapshot copies. This could potentially increase refresh times for these tables, as we wouldn't be able to utilize the preload feature.
But for Upserts I think one of the biggest cost is recomputing the validDocId map, so for Upsert tables we won't see any specific benefits right? (outside of the ones which are applicable for Realtime tables too). AND For upsert tables with snapshot enabled, we risk losing validDocIDSnapshot during downloads from deep store since deep store lacks snapshot copies
Yes, that's right. Exploring possibilities here - if we couple segment refresh minion task to also do other things (like upsert compaction, etc), will that help?
if we couple segment refresh minion task to also do other things (like upsert compaction, etc), will that help?
The benefits of including compaction in this task will vary from use-case to use-case depending on the number of invalid docIDs. cc @klsince too on ideas for upsert
Thanks for capturing our discussion in the description. Could you also create a github issue for the follow up task to track?
To overcome this problem, we can use a server side API that will return the list of segments to be refreshed. It is being developed in https://github.com/apache/pinot/issues/14450. We can incorporate these changes in the Task Generation Flow once it is merged
Created followup issue https://github.com/apache/pinot/issues/14483