pinot
[Question] Will refreshing/ingesting segments cause the table to rebalance?
Hello! We are working on changing our tables to use the "pool based" assignment strategy.
One question I ran into: do we need to rebalance the table after applying the updated table configs (with the pool based configs)? We do a fresh ingestion of data into the table daily, so I want to check whether the daily ingestion will pick up the updated table config and rebalance the data during ingestion.
The reason I am asking is that I wonder whether a rebalance is required before ingesting data into a table that has been updated with the pool based config. (Does it have to happen before an ingestion? Or, if a rebalance is required, can it happen at any time regardless of the ingestion?)
Thanks!
Basically, the question boils down to: what happens if a data ingestion runs on a table before a rebalance has been performed? Sorry for tagging directly, I feel like this is a quick question. Thanks for the help. cc @Jackie-Jiang
Yes, you will need to rebalance the table in order to pick up the changes. For a segment refresh without a rebalance, the refreshed segment is automatically assigned to the same server so that it can be swapped atomically.
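For anyone landing here later, a minimal sketch of how the rebalance request could be built against the controller REST API. The path `/tables/{tableName}/rebalance` and the query parameters `type`, `dryRun`, and `reassignInstances` reflect my understanding of the controller API; please confirm them in your controller's Swagger UI for your Pinot version. The host `http://localhost:9000` and the table name `myTable` are hypothetical placeholders.

```python
from urllib.parse import quote, urlencode


def build_rebalance_url(controller: str, table: str, table_type: str = "OFFLINE",
                        dry_run: bool = True, reassign_instances: bool = True) -> str:
    """Build the controller URL for a table rebalance request.

    Parameter names are assumptions based on the Pinot controller API;
    verify them against the Swagger UI of your deployment.
    """
    params = {
        "type": table_type,
        # Start with a dry run to inspect the target assignment first.
        "dryRun": str(dry_run).lower(),
        # Recompute instance assignment so the new pool based config is used.
        "reassignInstances": str(reassign_instances).lower(),
    }
    base = controller.rstrip("/")
    return f"{base}/tables/{quote(table, safe='')}/rebalance?{urlencode(params)}"


# Hypothetical controller address and table name.
url = build_rebalance_url("http://localhost:9000", "myTable")
print(url)
```

The URL would then be sent as a POST request (for example with `curl -X POST`). Running it once with `dryRun=true` and again with `dryRun=false` lets you review the planned assignment before anything moves.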
Thanks for answering. Does this mean it is okay to run, for example, one or two segment ingestions/refreshes before actually rebalancing the table? I did a quick test in our QA cluster: if I do an ingestion before rebalancing, the segment-to-server assignment does not change. The distribution only changes once I trigger a rebalance.
That is fine. The ingestion (refresh) won't change the segment assignment; only the rebalance can. Conceptually you can run both of them at the same time (not recommended, but it should work).
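To make the distinction concrete, here is a toy model of the behavior described in this thread. It is not Pinot code: the function names, the round-robin placement, and the server names are all hypothetical stand-ins. It only illustrates that a refresh keeps the existing segment-to-server mapping while a rebalance recomputes it.

```python
from typing import Dict, List

Assignment = Dict[str, List[str]]  # segment name -> servers hosting it


def refresh_segment(assignment: Assignment, segment: str) -> Assignment:
    """Model a refresh: data is replaced, the hosting servers stay the same."""
    if segment not in assignment:
        raise KeyError(f"unknown segment: {segment}")
    return {name: list(servers) for name, servers in assignment.items()}


def rebalance(assignment: Assignment, servers: List[str], replicas: int) -> Assignment:
    """Model a rebalance: recompute placement over the given servers.

    Round-robin is a simplification, not Pinot's actual assignment strategy.
    """
    return {
        name: [servers[(i + r) % len(servers)] for r in range(replicas)]
        for i, name in enumerate(sorted(assignment))
    }


# Hypothetical state before the pool based config takes effect.
before = {"seg_0": ["server_1"], "seg_1": ["server_1"]}
after_refresh = refresh_segment(before, "seg_0")
after_rebalance = rebalance(after_refresh, ["pool0_server", "pool1_server"], replicas=1)
```

In this model `after_refresh` equals `before`, which matches the QA cluster observation above, and only `after_rebalance` spreads the segments over the new servers.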
Looks like this has been addressed. Closing.