
[Question] Does refreshing/ingesting segments cause the table to rebalance?

Open xuhui-stripe opened this issue 2 years ago • 4 comments

Hello! We are updating our tables to use the "pool based" assignment strategy.

A question I ran into: do we need to rebalance the table after applying the updated table config (with the pool-based settings)? We do a fresh ingestion of data into the table daily, so I want to check whether that daily ingestion will pick up the updated table config and rebalance the data during ingestion.

The reason I am asking is that I wonder whether a rebalance is required before ingesting data into a table that has been updated with the pool-based config. (Does it have to happen before an ingestion? Or, if a rebalance is required, can it happen at any time regardless of the ingestion?)

Thanks!

xuhui-stripe, Aug 22 '22 18:08

Basically, the question boils down to: what happens if a rebalance is not performed and then a data ingestion happens for the table? Sorry for tagging directly, I feel like this is a quick question. Thanks for the help. cc @Jackie-Jiang

xuhui-stripe, Aug 22 '22 22:08

Yes, you will need to rebalance the table in order to pick up the changes. For a segment refresh without a rebalance, the refreshed segment is automatically assigned to the same server so that it can be swapped in atomically.

Jackie-Jiang, Aug 23 '22 23:08
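
For readers following along, here is a minimal sketch of how the rebalance request mentioned above could be built. It was not posted in the thread. It assumes the controller's `POST /tables/{tableName}/rebalance` endpoint with the `type`, `dryRun`, and `reassignInstances` query parameters, which should be checked against the Swagger page of your Pinot version. The controller address and table name are placeholders, and `rebalance_url` is a hypothetical helper.

```python
from urllib.parse import urlencode


def rebalance_url(controller, table, table_type="OFFLINE",
                  dry_run=True, reassign_instances=True):
    """Build the URL for a table rebalance request (hypothetical helper).

    Assumption: the controller accepts POST /tables/{table}/rebalance
    with these query parameters; verify against your Pinot version.
    """
    params = {
        "type": table_type,
        # Start with a dry run to preview the target assignment.
        "dryRun": str(dry_run).lower(),
        # Assumed to be needed so a changed instance (pool) config is applied.
        "reassignInstances": str(reassign_instances).lower(),
    }
    return f"{controller.rstrip('/')}/tables/{table}/rebalance?{urlencode(params)}"


if __name__ == "__main__":
    # Placeholder controller address and table name.
    print(rebalance_url("http://localhost:9000", "myTable"))
```

The URL would be sent as a POST request, first with `dryRun=true` to inspect the proposed assignment and then with `dryRun=false` to apply it.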

Thanks for answering. Does that mean it is okay to have, for example, 1 or 2 segment ingestions/refreshes before actually rebalancing the table? I did a quick test in our QA cluster: if I do an ingestion before rebalancing, the segment-to-server assignment does not change. The distribution changes only once I trigger a rebalance.

xuhui-stripe, Aug 24 '22 09:08

That is fine. The ingestion (refresh) won't change the segment assignment; only the rebalance can. Conceptually you can run both of them at the same time (not recommended, but it should work).

Jackie-Jiang, Aug 24 '22 18:08
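
The behavior described in this exchange can be illustrated with a toy model. This is not Pinot's actual assignment code: the `refresh` and `rebalance` functions, the server names, and the round-robin placement are all assumptions made for illustration. It only shows that a refresh leaves the segment-to-server mapping untouched, while a rebalance recomputes it from the current server set.

```python
def refresh(assignment, segment):
    """Toy refresh: the segment's data is replaced in place, so the
    segment-to-server mapping is returned unchanged."""
    if segment not in assignment:
        raise KeyError(f"unknown segment: {segment}")
    return {seg: list(servers) for seg, servers in assignment.items()}


def rebalance(assignment, servers, replicas=1):
    """Toy rebalance: recompute placement over the given servers.

    Round-robin is a stand-in for the real (pool-aware) strategy.
    """
    new_assignment = {}
    for i, seg in enumerate(sorted(assignment)):
        new_assignment[seg] = [servers[(i + r) % len(servers)]
                               for r in range(replicas)]
    return new_assignment


if __name__ == "__main__":
    before = {"seg_0": ["server_a"], "seg_1": ["server_a"]}
    # Refresh before rebalancing: placement stays on the old server.
    print(refresh(before, "seg_0"))
    # Rebalance: placement moves to the servers of the new config.
    print(rebalance(before, ["server_b", "server_c"]))
```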

Looks like this has been addressed. Closing.

npawar, Sep 29 '22 23:09