
Rebalancing an upsert table causing high GC and failure to reconnect to ZK

Open dang-stripe opened this issue 1 year ago • 0 comments

Follow up from https://github.com/apache/helix/issues/2951 which provides more detail.

We rebalanced an upsert table in low-disk mode. The rebalance led to high GC on a server, which then kept trying to reconnect to ZK. The server did not recover until we manually restarted it.

@Jackie-Jiang had a theory that this is tied to the metadata manager for old partitions not being released even after all of their segments were dropped. That would leave a large, empty concurrent hash map on the heap, which causes the GC pressure.
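To make the theory concrete, here is a minimal Java sketch of the suspected pattern. It is not Pinot's code: `PartitionRegistrySketch`, `PartitionManager`, and all method names are hypothetical, and the one-map-per-partition layout is an assumption made only for illustration. The relevant JDK behavior, as far as I know for OpenJDK, is that `ConcurrentHashMap` does not shrink its internal table when entries are removed, so an emptied map stays large for as long as something still references it.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class PartitionRegistrySketch {
    // Hypothetical stand-in for a per-partition upsert metadata manager.
    static final class PartitionManager {
        final Map<String, Integer> primaryKeyToDoc = new ConcurrentHashMap<>();
        int segmentCount = 0; // only mutated inside compute() on the registry
    }

    // Registry keyed by partition ID (assumed layout, for illustration only).
    private final Map<Integer, PartitionManager> managers = new ConcurrentHashMap<>();

    void addSegment(int partitionId, String segmentName, int numKeys) {
        managers.compute(partitionId, (id, m) -> {
            if (m == null) {
                m = new PartitionManager();
            }
            m.segmentCount++;
            for (int i = 0; i < numKeys; i++) {
                m.primaryKeyToDoc.put(segmentName + "#" + i, i);
            }
            return m;
        });
    }

    // Suspected leak: the keys are dropped, but the manager and its
    // emptied map stay reachable from the registry.
    void removeSegmentLeaky(int partitionId, String segmentName) {
        managers.computeIfPresent(partitionId, (id, m) -> {
            m.primaryKeyToDoc.keySet().removeIf(k -> k.startsWith(segmentName + "#"));
            m.segmentCount--;
            return m;
        });
    }

    // Possible fix: returning null from computeIfPresent removes the
    // mapping, so the manager becomes unreachable once its last segment goes.
    void removeSegmentAndRelease(int partitionId, String segmentName) {
        managers.computeIfPresent(partitionId, (id, m) -> {
            m.primaryKeyToDoc.keySet().removeIf(k -> k.startsWith(segmentName + "#"));
            m.segmentCount--;
            return m.segmentCount == 0 ? null : m;
        });
    }

    int managerCount() {
        return managers.size();
    }
}
```

In the leaky variant the registry still holds one manager after its only segment is removed; in the releasing variant it holds none. Whether Pinot's partition metadata managers actually follow the leaky path during a rebalance is exactly what still needs to be confirmed, for example with a heap dump from the affected server.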

dang-stripe, Oct 24 '24 22:10