osmosis
Archive Node Stuck on ValidatorDelegations Query
Background
Our archive node, which is open to public traffic, was halting regularly. The pprof profiles suggest that it was getting stuck every time on the same query, ValidatorDelegations. Looking at how that query is paginated, there appears to be a difference between the Cosmos SDK v0.47 and v0.50 versions. Here are the pprofs:
CPU1: https://drive.google.com/file/d/1IYCqp-pd-p644mEmjunYaWkVkVvXMd7v/view?usp=sharing
Heap1: https://drive.google.com/file/d/1CvYNFPkG5nSLGZYNyqXO9Dyonf_JLUjH/view?usp=sharing
CPU2: https://drive.google.com/file/d/1GhNPJYyCOMewjc2gQ-TbIADIOvmWedAD/view?usp=sharing
Heap2: https://drive.google.com/file/d/1vMWtG2dxKOOxN9I3XqoVMVz0v_X9Dx3_/view?usp=sharing
Since we closed the node to public traffic, it has not halted again.
Suggested Design
I think we should try the Cosmos SDK v0.50 version of the ValidatorDelegations query (specifically its pagination logic) to see if that helps. If it doesn't, we will need to investigate further why this query causes the archive node to halt.