iavl
iavl copied to clipboard
EPIC: Node & key format
Currently IAVL does not take advantage of data locality on disk. The way IAVL store nodes on disk is with a hash that gets stored in a random location on disk and makes it so that IAVL needs to do a random search of disk in order to find the data. With the recent work from osmosis on fast node we saw a large performance improvement with a trade off on writes.
Secondly, there are two key formats, live keys and orphan keys referenced by node hash which also indexes the orphan nodes.
A next step would be to modify IAVL to structure data on disk in a logical manner to reduce random searches and reduce random compaction of disk data.
There are two things that can be done in order to take advantage of this.
Work Scope:
Phase 1:
- [ ] modify the key format from a hash of the data to be
write_version|node_path_in_tree.
Phase 2:
- [ ] make every node store the version of its children.
- [ ] remove orphan indexing and hashes
(Proposed by @ValarDragon)
Other steps:
- To modify the key format, you have to update the node struct to add
left_child_versionandright_child_versionin the nodes - Probably delete the root key format, and instead just make it a method that returns a key in your main node format space
Can be done later, but needed pre-integration for pruning:
- Clarify versioning / historic delete API semantics from SDK
- Make methods to prune (theres a couple approaches, if someones having trouble deriving them, can write up more)
modify the key format from a hash of the data to be write_version|node_path_in_tree.
where write_version is the left side and node_path_in_tree is the right side?
Or is it like, both sides have that?
NOTE: store/v2 solves that in a way similar to TurboGeth (erigon)
So this proposal will increase a node (and DB) size, right?
Few important questions
- What is the definition of
write_version? - Will that impact state proofs / merkle proofs? This change must be compatible with the existing merkle proof, otherwise we have trouble with IBC. My guess is that it won't, but want to double check.
NOTE:
store/v2solves that in a way similar to TurboGeth (erigon)
thank you, but as mentioned in a few places store/v2 work is unknown at this time for countless reasons.
@marbar3778
It looks like we couldn't modify the key format as write_version|node_path_in_tree. Because when rotating the tree, the path would be changed.
this is expected, do you see a problem with this?
cc @ValarDragon
@marbar3778, I'm also working on this, can you assign me as well