feat: improved shard cache
Improve shard cache to use RAM more effectively.
Three changes are introduced:
- If we put a new value into the LRU cache and the total size of existing values exceeds `total_sizes_capacity`, we evict values from it until that is no longer the case. So the actual total size should never exceed `total_size_limit` + `TRIE_LIMIT_CACHED_VALUE_SIZE`. We add this because value sizes generally vary from 1 B to 500 B and we want to count cache size precisely. The current value size limit is 1000 B, so for an average size of 100 B we use the shard cache 10x more effectively.
- When we save trie changes, we previously applied only insertions to the shard cache, which means that we added newly created nodes to it. Deletions were applied only during GC of the old block. Now we also apply deletions and call `pop` on the shard cache while saving trie changes of a new block. This helps to use shard cache space more effectively. Previously, nodes from the old state could occupy a lot of space, which led to eviction of nodes from the fresh state.
- If shard cache `pop` is called, the item is not deleted immediately but is first put into the `deletions` queue, which has capacity `deletions_queue_capacity`. If the popped item doesn't fit in the queue, the last item is removed from both the queue and the LRU cache, and the newly popped item is inserted into the queue. This is needed to delay removals when we have forks. In the simple case, two blocks may share a parent P. When we process the first block, we call `pop` for some nodes from P, but when we process the second block, we may need to read some nodes from P as well. Now we delay removal by 100_000, which helps to keep all nodes from the last 3 completely full blocks.
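
The size-based eviction and delayed deletion described above can be sketched roughly as follows. This is an illustrative model only, not the code in this PR: `SketchCache`, its fields, the `u64` key type, the `VecDeque`-based recency list, and the `VALUE_SIZE_LIMIT` constant (standing in for `TRIE_LIMIT_CACHED_VALUE_SIZE`) are all assumptions made for the example, as is the choice to skip values above the limit.

```rust
use std::collections::{HashMap, VecDeque};

/// Hypothetical key type; used here only to keep the sketch small.
type Key = u64;

/// Assumed stand-in for `TRIE_LIMIT_CACHED_VALUE_SIZE` (1000 B in the description).
const VALUE_SIZE_LIMIT: usize = 1000;

/// Illustrative model: size-bounded LRU cache plus a bounded queue of delayed deletions.
struct SketchCache {
    values: HashMap<Key, Vec<u8>>,
    /// Recency order: the front is the least recently used key.
    lru: VecDeque<Key>,
    /// Keys for which `pop` was called but whose removal is still delayed.
    deletions: VecDeque<Key>,
    total_size: usize,
    total_size_limit: usize,
    deletions_queue_capacity: usize,
}

impl SketchCache {
    fn new(total_size_limit: usize, deletions_queue_capacity: usize) -> Self {
        SketchCache {
            values: HashMap::new(),
            lru: VecDeque::new(),
            deletions: VecDeque::new(),
            total_size: 0,
            total_size_limit,
            deletions_queue_capacity,
        }
    }

    /// Removes the entry immediately and updates the size accounting.
    fn remove(&mut self, key: &Key) -> Option<Vec<u8>> {
        let value = self.values.remove(key)?;
        self.total_size -= value.len();
        self.lru.retain(|k| k != key);
        Some(value)
    }

    fn put(&mut self, key: Key, value: Vec<u8>) {
        // Assumption: values above the per-value limit are not cached at all.
        if value.len() > VALUE_SIZE_LIMIT {
            return;
        }
        // Drop any previous entry for this key so the size stays consistent.
        self.remove(&key);
        // Evict least recently used values while the *existing* total exceeds the limit.
        // After the insert below, the total is at most limit + VALUE_SIZE_LIMIT.
        while self.total_size > self.total_size_limit {
            match self.lru.front().copied() {
                Some(oldest) => {
                    self.remove(&oldest);
                }
                None => break,
            }
        }
        self.total_size += value.len();
        self.values.insert(key, value);
        self.lru.push_back(key);
    }

    fn get(&mut self, key: &Key) -> Option<&Vec<u8>> {
        if self.values.contains_key(key) {
            // Mark the key as most recently used.
            self.lru.retain(|k| k != key);
            self.lru.push_back(*key);
        }
        self.values.get(key)
    }

    /// Requests deletion of `key`; the entry stays readable until the key
    /// is pushed out of the deletions queue by later `pop` calls.
    fn pop(&mut self, key: Key) {
        if self.deletions_queue_capacity == 0 {
            self.remove(&key);
            return;
        }
        if self.deletions.len() == self.deletions_queue_capacity {
            // Queue is full: the oldest delayed deletion is applied now.
            if let Some(oldest) = self.deletions.pop_front() {
                self.remove(&oldest);
            }
        }
        self.deletions.push_back(key);
    }
}
```

In this model, a node from parent P that was popped while processing one fork remains readable for a sibling block until `deletions_queue_capacity` further `pop` calls have been made. The sketch does not handle a key that is re-inserted while still waiting in the deletions queue; a real implementation would need to decide what happens in that case.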
Next steps:
- make new constants configurable, similarly to trie cache capacity;
- add new metrics to prometheus similarly to https://github.com/near/nearcore/pull/7439.
We want to get the whole update merged by next Wednesday and cherry-pick it to the 1.28 and 1.29 releases. This is not a protocol change, so it doesn't require a separate release or protocol version.
Testing
- Tests for `BoundedQueue`, which keeps the queue of trie deletions
- Tests for `TrieCache`, whose logic is less trivial now