Add API to update ZK metadata of a segment
Looking at BaseSingleSegmentConversionExecutor, I see that it always uploads the segment, even when the segment itself did not change and only the metadata did (e.g., updating the custom map after some minion task ran on the segment). The controller then checks that the segment did not change and only updates ZK.
The network transfer of the segment from the minion task to the controller seems wasteful in this case, since the minion task should already know that the segment did not change. If there were an API to simply update the ZK metadata, that would be more efficient.
Hi, can I pick this up?
@harold-kfuse Can you point me to the controller where the unchanged segment check is made? I am unable to find it.
@maahir22
https://github.com/apache/pinot/blob/d97f2d92bc6cdb1f6bafb7fd98f69afae1266caa/pinot-controller/src/main/java/org/apache/pinot/controller/api/upload/ZKOperator.java#L198
@harold-kfuse, based on my analysis I see that:
- PinotSegmentUploadDownloadRestletResource already has code to handle segment METADATA; I assume the same endpoint can be used for BaseSingleSegmentConversionExecutor also
- BaseMultipleSegmentsConversionExecutor handles segment METADATA upload; the same can be extended in BaseSingleSegmentConversionExecutor for METADATA upload
Please feel free to correct me if wrong.
@heatclub Ideally we don't need to upload anything when the segment remains exactly the same. I guess the proposal is to add an API to only modify some custom fields in the segment ZK metadata
Thanks @Jackie-Jiang, I was able to go through the code and get that part. Let me share the PR once it's ready.
Hi @Jackie-Jiang, I see that there hasn't been any update on this ticket for a while. Can I look into this if @heatclub isn't working on it?
Hey @ceekay47 please feel free to pick this up
Hey, @ceekay47 can I look into this if you aren't working on it? Also, @Jackie-Jiang is this issue still needed?
Yes, the issue still applies. We do have a workaround by calling the ZK write REST API, but having a dedicated API that performs some validation is preferred.
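As a rough illustration of the kind of validation a dedicated API could perform before writing to ZK, here is a minimal sketch. It is not Pinot code: the class `SegmentCustomMapUpdater`, the method `mergeCustomMap`, and the specific checks (non-empty update, non-blank keys, non-null values) are all assumptions made for this example, and the actual ZK write is left out.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of Pinot: validates a set of custom-map
// updates and merges them into a copy of the segment's existing custom map.
// Writing the result back to ZK is intentionally out of scope here.
final class SegmentCustomMapUpdater {

  private SegmentCustomMapUpdater() {
  }

  static Map<String, String> mergeCustomMap(Map<String, String> existing,
      Map<String, String> updates) {
    // Assumed rule: an update request must carry at least one entry.
    if (updates == null || updates.isEmpty()) {
      throw new IllegalArgumentException("No custom map entries to update");
    }
    // Work on a copy so the caller's map is never mutated.
    Map<String, String> merged = new HashMap<>();
    if (existing != null) {
      merged.putAll(existing);
    }
    for (Map.Entry<String, String> entry : updates.entrySet()) {
      String key = entry.getKey();
      // Assumed rule: keys must be non-blank and values non-null.
      if (key == null || key.trim().isEmpty()) {
        throw new IllegalArgumentException("Custom map key must not be blank");
      }
      if (entry.getValue() == null) {
        throw new IllegalArgumentException("Null value for custom map key: " + key);
      }
      merged.put(key, entry.getValue());
    }
    return merged;
  }
}
```

With something along these lines, a minion task that knows the segment is unchanged would only need to send the small map of updated entries instead of the whole segment.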