[test] bplus index inefficient warmup
Two bugs, inside and outside bpsTree.traverse:
- the depth of the tree was always 0, which produced a flat, PARTIALLY ordered list of keys. Also, two useless keys (the first and the second) were permanently added to the cache.
- a bug in iteration caused fewer nodes to be inserted into the cache, so we did more random reads from files.
Setting cacheEvery to 4 or 8 takes every 4th/8th element into the cache. With cacheEvery=2, ~0.8% of keys are in cache; cacheEvery=8 keeps ~3.3% of keys in memory and slightly reduces the amount of CoW and splits in the db, but I think this trade-off isn't worth it.
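To make the sampling idea concrete, here is a minimal sketch of how a cacheEvery-style warmup could pick every N-th key into an in-memory cache. The names (`cachedKey`, `buildCache`) are illustrative assumptions, not erigon's actual API:

```go
package main

import "fmt"

// cachedKey pairs a sampled key with the offset of the node holding it,
// so a later seek can jump close to the target before reading from disk.
type cachedKey struct {
	key    string
	offset uint64
}

// buildCache keeps every cacheEvery-th key from the flattened, sorted
// key list; a larger cacheEvery means a smaller cache and more disk reads.
func buildCache(keys []string, cacheEvery int) []cachedKey {
	cache := make([]cachedKey, 0, len(keys)/cacheEvery+1)
	for i, k := range keys {
		if i%cacheEvery == 0 {
			cache = append(cache, cachedKey{key: k, offset: uint64(i)})
		}
	}
	return cache
}

func main() {
	keys := []string{"a", "b", "c", "d", "e", "f", "g", "h", "i"}
	fmt.Println(len(buildCache(keys, 2))) // 5 of 9 keys cached
	fmt.Println(len(buildCache(keys, 8))) // 2 of 9 keys cached
}
```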
Feels like having several layers of lists in the cache does not increase speed, but does increase logic complexity.
The amount of CoW and splits is reduced by 15%; spills and wops go to 0 after the update, but the number of page faults during binary-search seek increases.
This change does not require creating new indices, but it seriously affects the structure of the in-memory cache. It also causes fewer random reads.
need to remove printf
I don't understand how to read your pictures. What is on the left and what is on the right? (If left is before your fix and right is after, then after the fix commit gets slower and the number of page faults increased, so we're getting worse?)
please double-check that your pictures are not affected by https://github.com/ledgerwatch/erigon/pull/10869
I tried main vs this branch (cacheEvery=2) vs this branch (cacheEvery=8) - print_stages:
alloc=13.9GB vs alloc=23.4GB vs alloc=23.4GB vs alloc=51.0GB
(bor-mainnet)
cacheEvery=8 - startup time is very slow
cacheEvery needs a comment about its max/min possible values (e.g. can we set 32? 1?)
I suggest using cacheEvery=2, as we thought we used before. Going to check today whether it's affected by #10869.
@awskii do you have a link to the board? I want to take a look at "txs/sec" over a big "rate interval".
Also, I don't understand how database commit (not commitment) speed can depend on the .bt keys cache.
Confirmed that the whole perf test here is screwed, related to #10869 and other recent updates to `main`. Will run a clean test later on.
I added a last-mile optimisation: if R-L <= 8, scan one by one instead of bisecting. M picked for now is 256: optimal memory usage and access time.
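The last-mile idea can be sketched as a binary search that falls back to a sequential scan once the remaining window is small, trading a few extra comparisons for sequential instead of random reads. `startSkip` and `seek` are illustrative names under my assumptions, not the actual erigon code:

```go
package main

import (
	"bytes"
	"fmt"
)

// startSkip: window size at or below which we stop bisecting and scan.
const startSkip = 8

// seek returns the index of the first key >= target in a sorted slice.
func seek(keys [][]byte, target []byte) int {
	l, r := 0, len(keys)
	// Bisect only while the window is larger than startSkip.
	for r-l > startSkip {
		m := (l + r) / 2
		if bytes.Compare(keys[m], target) < 0 {
			l = m + 1
		} else {
			r = m
		}
	}
	// Last mile: scan one by one; accesses are sequential, not random.
	for l < r && bytes.Compare(keys[l], target) < 0 {
		l++
	}
	return l
}

func main() {
	keys := [][]byte{[]byte("a"), []byte("c"), []byte("e"), []byte("g")}
	fmt.Println(seek(keys, []byte("d"))) // 2: first key >= "d" is "e"
}
```

With startSkip=1 the scan is effectively disabled, which matches the "do not scan even on small ranges" benchmark variant below.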
I broke prune, but it's still a good solution; will fix in the morning.
```
N: 12000000, M: 256, skip since shard <= 4
BenchmarkBpsTreeSeek-12   1236522    986.5 ns/op    883 B/op   12 allocs/op
N: 12000000, M: 256, skip since shard <= 64
BenchmarkBpsTreeSeek-12    426034   2708   ns/op   2577 B/op   38 allocs/op
N: 12000000, M: 256, skip since shard <= 16
BenchmarkBpsTreeSeek-12    805736   1277   ns/op   1154 B/op   16 allocs/op
N: 12000000, M: 256, skip since shard <= 1 (means do not scan even on small ranges)
BenchmarkBpsTreeSeek-12   1214692    967.4 ns/op    866 B/op   12 allocs/op
PASS
```
Well, the usefulness of that skip feature is not that clear, but it results in fewer random reads, which may be helpful as well. I set the constant DefaultBtreeStartSkip=4 for now.
For NVMe disks the change is barely noticeable, but for slower disks the difference could be bigger. The flattened structure of the initial array of cached nodes does 30% fewer comparisons than the levelled structure does.
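As a rough illustration of why a flat sorted array is cheap to search: one binary search over N cached entries costs about log2(N) comparisons, with no per-layer descent overhead. The sketch below just counts comparisons for the flat case using the standard library; the 30% figure above comes from the author's measurements, not from this toy:

```go
package main

import (
	"fmt"
	"sort"
)

// flatSeek binary-searches a flat sorted slice and counts how many
// key comparisons were needed to locate the first element >= target.
func flatSeek(keys []int, target int) (idx, comparisons int) {
	idx = sort.Search(len(keys), func(i int) bool {
		comparisons++
		return keys[i] >= target
	})
	return idx, comparisons
}

func main() {
	// 1024 cached keys: 0, 2, 4, ... 2046.
	keys := make([]int, 1024)
	for i := range keys {
		keys[i] = i * 2
	}
	idx, cmps := flatSeek(keys, 777)
	fmt.Println(idx, cmps) // first key >= 777 is 778 at index 389; ~log2(1024) comparisons
}
```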