pinot
Rebalancing an upsert table causing high GC and failure to reconnect to ZK
Follow-up to https://github.com/apache/helix/issues/2951, which provides more detail.
We rebalanced an upsert table in low-disk mode. The rebalance led to high GC on a server, and that server then kept trying to reconnect to ZK. The server never recovers on its own; we have to restart it manually.
@Jackie-Jiang had a theory that this might be tied to the metadata manager for old partitions: it may not get released even after all of that partition's segments are dropped, so a large, empty concurrent hash map stays on the heap and causes the GC pressure.
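To make the theory concrete, here is a minimal Java sketch of the suspected pattern and one possible fix. It is not Pinot's actual code: `PartitionManagerRegistry`, `PartitionManager`, and all method names are hypothetical stand-ins. The relevant JDK behavior is that `java.util.concurrent.ConcurrentHashMap` grows its internal table as entries are added but does not shrink it when entries are removed, so an emptied map can still hold a large backing array as long as something references it.

```java
import java.util.Collection;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical registry of per-partition upsert metadata; NOT Pinot's real class. */
class PartitionManagerRegistry {

    /** Hypothetical per-partition state: primary key -> record location. */
    static final class PartitionManager {
        final Map<String, Long> primaryKeyToLocation = new ConcurrentHashMap<>();
        int segmentCount = 0; // only mutated inside compute(), which is atomic per key
    }

    private final Map<Integer, PartitionManager> managers = new ConcurrentHashMap<>();

    /** Registers a segment and its primary keys under the given partition. */
    void addSegment(int partitionId, Map<String, Long> keys) {
        managers.compute(partitionId, (id, manager) -> {
            if (manager == null) {
                manager = new PartitionManager();
            }
            manager.primaryKeyToLocation.putAll(keys);
            manager.segmentCount++;
            return manager;
        });
    }

    /**
     * Drops a segment. If only the keys were removed, the emptied map (and its
     * grown backing table) would stay reachable through 'managers'. Returning
     * null from computeIfPresent removes the partition entry itself, so the
     * whole manager becomes eligible for garbage collection.
     */
    void removeSegment(int partitionId, Collection<String> keys) {
        managers.computeIfPresent(partitionId, (id, manager) -> {
            for (String key : keys) {
                manager.primaryKeyToLocation.remove(key);
            }
            manager.segmentCount--;
            return manager.segmentCount == 0 ? null : manager;
        });
    }

    /** Number of partitions that still hold a manager on heap. */
    int trackedPartitions() {
        return managers.size();
    }
}
```

If the theory holds, a heap dump from the affected server should show partition-level managers for partitions the server no longer hosts, each holding an empty but large map. Whether releasing them on the last segment drop is safe in Pinot depends on how the real manager lifecycle works, which this sketch does not model.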