pinot Minion Task to support automatic Segment Refresh

Currently, when new columns are added or indexes are added/removed, the segment reloads happen on the server. There are a number of issues with this approach:

Increased startup times for Pinot Server hosts. Servers have to reload segments (generating indexes, columns) everytime at server startup. This is particularly exacerbated for Upsert tables. cc: @tibrewalpratik17 @ankitsultana
The server reload compute cost is paid on each server when indexes/colums are added. This leads to over-provisioning of servers to account for this compute cost.
Reload on servers when queries are being processed affects latencies.
Takes a long time to reload all segments (default value of 1 segment at a time). Increasing the concurrency affects query latencies.
The segment on the deepstore never contains the new indexes/columns. So the segment in deepstore is at divergence from the server (making it not ideal for disaster recovery).

This PR creates a minion task to automatically refresh segments when there are index/column updates to table config/schema. It can support automatic refresh for the following operations:

Adding/Removing indexes
Adding columns
Changing compatible datatypes.
Converting segment versions

Followup Work:

When there are table config/schema updates, we can validate if the datatype changes for columns are compatible. We can allow compatible updates.
Schedule the SegmentRefresh tasks when there are tableconfig/schema updates rather than waiting for the next iteration of periodic job.

Tested using integration tests.

Oct 24 '24 21:10 vvivekiyer

@vvivekiyer : the idea is quite interesting and the value add I see here is:

We can increase concurrency for segment refresh thereby reducing the total time to reload all segments
Deepstore link can be updated with the new segment
Possible perf improvements due to less work done in servers, but I guess we need to test this out.

This is particularly exacerbated for Upsert tables

But for Upserts I think one of the biggest cost is recomputing the validDocId map, so for Upsert tables we won't see any specific benefits right? (outside of the ones which are applicable for Realtime tables too).

Oct 24 '24 22:10 ankitsultana

Codecov Report

Attention: Patch coverage is 3.55030% with 163 lines in your changes missing coverage. Please review.

Project coverage is 63.71%. Comparing base (59551e4) to head (0679ba3). Report is 1362 commits behind head on master.

Files with missing lines	Patch %	Lines
...sks/refreshsegment/RefreshSegmentTaskExecutor.java	0.00%	82 Missing :warning:
...ks/refreshsegment/RefreshSegmentTaskGenerator.java	0.00%	62 Missing :warning:
...che/pinot/plugin/minion/tasks/MinionTaskUtils.java	0.00%	5 Missing :warning:
...reshsegment/RefreshSegmentTaskExecutorFactory.java	0.00%	5 Missing :warning:
.../org/apache/pinot/core/common/MinionConstants.java	0.00%	3 Missing :warning:
...ntroller/helix/core/PinotHelixResourceManager.java	75.00%	1 Missing and 1 partial :warning:
...ent/RefreshSegmentTaskProgressObserverFactory.java	0.00%	2 Missing :warning:
...oller/api/resources/PinotTableRestletResource.java	0.00%	1 Missing :warning:
...inot/spi/config/table/TableStatsHumanReadable.java	0.00%	1 Missing :warning:

Additional details and impacted files

@@             Coverage Diff              @@
##             master   #14300      +/-   ##
============================================
+ Coverage     61.75%   63.71%   +1.96%     
- Complexity      207     1567    +1360     
============================================
  Files          2436     2672     +236     
  Lines        133233   146635   +13402     
  Branches      20636    22487    +1851     
============================================
+ Hits          82274    93435   +11161     
- Misses        44911    46281    +1370     
- Partials       6048     6919     +871

Flag	Coverage Δ
custom-integration1	`100.00% <ø> (+99.99%)`	:arrow_up:
integration	`100.00% <ø> (+99.99%)`	:arrow_up:
integration1	`100.00% <ø> (+99.99%)`	:arrow_up:
integration2	`0.00% <ø> (ø)`
java-11	`63.66% <3.55%> (+1.95%)`	:arrow_up:
java-21	`63.60% <3.55%> (+1.97%)`	:arrow_up:
skip-bytebuffers-false	`63.69% <3.55%> (+1.94%)`	:arrow_up:
skip-bytebuffers-true	`63.56% <3.55%> (+35.83%)`	:arrow_up:
temurin	`63.71% <3.55%> (+1.96%)`	:arrow_up:
unittests	`63.71% <3.55%> (+1.96%)`	:arrow_up:
unittests1	`55.56% <0.00%> (+8.67%)`	:arrow_up:
unittests2	`34.05% <3.55%> (+6.32%)`	:arrow_up:

Flags with carried forward coverage won't be shown. Click here to find out more.

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

🚨 Try these New Features:

Flaky Tests Detection - Detect and resolve failed and flaky tests
JS Bundle Analysis - Avoid shipping oversized bundles

Oct 24 '24 22:10 codecov-commenter

We can increase concurrency for segment refresh thereby reducing the total time to reload all segments

Are you suggesting an increase in concurrency at the minion level or the server level? At the server level, it seems we would still issue a SegmentRefreshTask, which means the default concurrency would remain at 1. We can investigate performance improvements that might allow us to adjust the concurrency configuration.

Overall, this appears to be a valuable feature to reduce index build time and associated costs for servers! However, we need to consider the trade-off between SegmentRefresh and SegmentReload costs. Ultimately, we would still issue a SegmentRefresh call to the servers, if I understand correctly. For upsert tables with snapshot enabled, we risk losing validDocIDSnapshot during downloads from deep store since deep store lacks snapshot copies. This could potentially increase refresh times for these tables, as we wouldn't be able to utilize the preload feature.

Oct 25 '24 08:10 tibrewalpratik17

But for Upserts I think one of the biggest cost is recomputing the validDocId map, so for Upsert tables we won't see any specific benefits right? (outside of the ones which are applicable for Realtime tables too). AND For upsert tables with snapshot enabled, we risk losing validDocIDSnapshot during downloads from deep store since deep store lacks snapshot copies

Yes, that's right. Exploring possibilities here - if we couple segment refresh minion task to also do other things (like upsert compaction, etc), will that help?

Oct 28 '24 23:10 vvivekiyer

if we couple segment refresh minion task to also do other things (like upsert compaction, etc), will that help?

The benefits of including compaction in this task will vary from use-case to use-case depending on the number of invalid docIDs. cc @klsince too on ideas for upsert

Oct 29 '24 08:10 tibrewalpratik17

Thanks for capturing our discussion in the description. Could you also create a github issue for the follow up task to track?

To overcome this problem, we can use a server side API that will return the list of segments to be refreshed. It is being developed in https://github.com/apache/pinot/issues/14450. We can incorporate these changes in the Task Generation Flow once it is merged

Nov 15 '24 11:11 swaminathanmanish

Created followup issue https://github.com/apache/pinot/issues/14483

Nov 18 '24 20:11 vvivekiyer

pinot pinot copied to clipboard

Minion Task to support automatic Segment Refresh

Codecov Report

pinot
pinot copied to clipboard