Update deepstore segments with schema/tableConfig changes

Open vvivekiyer opened this issue 2 years ago • 1 comments

Currently, we support a number of preprocessing operations for a segment in response to schema/tableConfig changes. Some of them are:

  1. Add a new column; remove or modify an autogenerated column.
  2. Add or remove an index.

Every time a server downloads and reloads a segment, it preprocesses the segment and applies these changes. However, the segment directory in the deep store is never modified to reflect them. As we keep piling more segment preprocessing logic into the reload path, the time taken to reload a segment could increase if the user has applied a number of schema/tableConfig changes.

The suggestion here is to also update the segment in deep store to reflect these changes. This can be done with a background minion task.
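As a rough illustration of what such a background task might decide on each run, here is a minimal sketch. It assumes each deep store copy can be tagged with a fingerprint of the schema/tableConfig it was last processed with. The class name, the fingerprint map, and the `maxPerRun` limit are all hypothetical and are not existing Pinot APIs.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class DeepStoreRefreshSketch {

  /**
   * Hypothetical helper: picks the segments whose deep store copy was processed
   * with a schema/tableConfig fingerprint other than the current one.
   * A segment with no recorded fingerprint (null) is treated as stale.
   * At most maxPerRun segments are returned so one run stays bounded.
   */
  static List<String> segmentsNeedingRefresh(
      Map<String, String> deepStoreFingerprints, String currentFingerprint, int maxPerRun) {
    List<String> stale = new ArrayList<>();
    for (Map.Entry<String, String> entry : deepStoreFingerprints.entrySet()) {
      if (stale.size() >= maxPerRun) {
        break;
      }
      if (!currentFingerprint.equals(entry.getValue())) {
        stale.add(entry.getKey());
      }
    }
    return stale;
  }

  public static void main(String[] args) {
    // Illustrative data only: segment name -> fingerprint of the config it was processed with.
    Map<String, String> fingerprints = new TreeMap<>();
    fingerprints.put("seg_0", "v1");
    fingerprints.put("seg_1", "v2");
    fingerprints.put("seg_2", "v1");

    // With the current config at "v2", seg_0 and seg_2 would be re-uploaded.
    System.out.println(segmentsNeedingRefresh(fingerprints, "v2", 10));
  }
}
```

Capping the number of segments per run is one way to keep the refresh from competing with regular traffic; how the fingerprint would be stored and compared is left open here.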

vvivekiyer, Sep 09 '22 18:09

We already have an API to ask the server to upload a segment to the deep store. We may leverage the same mechanism to refresh segments in the deep store. Currently it is used to fix realtime segments that do not have a deep store copy. See PinotLLCRealtimeSegmentManager.uploadToDeepStoreIfMissing() for more details.

Jackie-Jiang, Sep 09 '22 20:09