trueblocks-core
chifra blocks - caching is slower than not caching
Proof:
chifra blocks 18000000-18000100:1 --decache >x
Clean out the cache of these blocks.
time chifra blocks 18000000-18000100:1 --cache >x
Executed in 2.44 secs fish external
Run a regular query for some blocks and cache them.
time chifra blocks 18000000-18000100:1 --cache >x
Executed in 191.55 secs fish external
Run the exact same query again. Roughly 80 times slower (191.55 / 2.44 ≈ 78.5). (Not a coincidence: that is close to the number of transactions in an average block around block 18,000,000.)
I know why this happens. When we write blocks to the cache, we only write the hashes of the transactions. Two reasons: (1) much smaller, (2) much faster.
This is the reason why chifra blocks has the --cache_txs option.
When we read from the cache, we get back only the transactions' hashes. The transactions themselves have not been cached yet, so we must query the node separately for each one.
In the first case (not in cache) we query the node only once per block.
In the latter case (already in cache) we query 100s of times per block because we're querying all the individual transactions.
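The difference in node round trips can be illustrated with a small model. This is only a sketch: `FakeNode`, `write_cache`, `read_cached` and `count_calls` are hypothetical names made up for the illustration, not TrueBlocks code, and the 90 transactions per block is the rough figure quoted above.

```python
from dataclasses import dataclass


@dataclass
class FakeNode:
    """Hypothetical stand-in for an RPC node that counts round trips."""
    txs_per_block: int
    calls: int = 0

    def get_block_with_txs(self, block_num: int) -> dict:
        # One round trip returns the block and all of its full transactions.
        self.calls += 1
        return {
            "number": block_num,
            "transactions": [
                {"hash": f"0x{block_num:x}-{i}", "index": i}
                for i in range(self.txs_per_block)
            ],
        }

    def get_transaction(self, tx_hash: str) -> dict:
        # One round trip per transaction.
        self.calls += 1
        return {"hash": tx_hash}


def write_cache(cache: dict, block: dict) -> None:
    # Only the transaction hashes are stored, as described above.
    cache[block["number"]] = [tx["hash"] for tx in block["transactions"]]


def read_cached(node: FakeNode, cache: dict, block_num: int) -> dict:
    # The cache holds hashes only, so every transaction costs a node query.
    txs = [node.get_transaction(h) for h in cache[block_num]]
    return {"number": block_num, "transactions": txs}


def count_calls(txs_per_block: int = 90, n_blocks: int = 3) -> tuple:
    """Return (node calls on the uncached pass, node calls on the cached pass)."""
    node = FakeNode(txs_per_block)
    cache = {}
    for b in range(n_blocks):
        write_cache(cache, node.get_block_with_txs(b))
    uncached_calls = node.calls

    node.calls = 0
    for b in range(n_blocks):
        read_cached(node, cache, b)
    return uncached_calls, node.calls
```

With 3 blocks of 90 transactions each, `count_calls(90, 3)` returns `(3, 270)`: one call per block on the first pass, one call per transaction on the second.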
Upshot: Don't read from block cache unless the user queries with --hashes.
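As a sketch, the rule in the upshot reduces to a single check. `plan_read` is a hypothetical helper written for illustration; it is not an existing chifra function.

```python
def plan_read(in_cache: bool, hashes_only: bool) -> str:
    """Decide where a block should be read from.

    The block cache stores transaction hashes only, so it can fully answer
    a request only when the user asked for hashes (--hashes).
    """
    if in_cache and hashes_only:
        return "cache"
    # Either a cache miss, or the user wants full transactions:
    # one node query per block beats one query per transaction.
    return "node"
```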
Which raises the question -- why is there a block cache?
Alternatives:
- If we query chifra blocks --hashes --cache, save the blocks with only the transaction hashes.
- If we query chifra blocks --cache, save the blocks with the full transactional detail.
- Allow for reading an Optional from the block cache.
- If we read a block whose transactions are only stored as hashes, "bump them up" if the user is querying without --hashes.
- If we write transactions into the block cache, perhaps we can write a "pseudo-cache-item" to the transactions cache noting that the transaction is already cached, it's just in the block cache.
- Alternatively, we could double write the transaction, once in the block cache and once in the transactions cache (but this is why we did the hashes-only thing in the block cache to begin with -- it was too slow).
- If we did the "write the hash if that's all we have, bump it up to a full transaction if we ever get it, and possibly duplicate it in the transactions cache," we've created a mess.
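A minimal sketch of the "bump them up" alternative, assuming a cache entry may hold either plain hashes or full transactions. All names here (`read_block`, `has_full_txs`, `to_hashes`, `fetch_full`) are hypothetical, and the in-memory `dict` merely stands in for the on-disk block cache.

```python
from typing import Callable


def has_full_txs(block: dict) -> bool:
    # An entry is "full" when every transaction is a record, not a bare hash.
    return all(isinstance(tx, dict) for tx in block["transactions"])


def to_hashes(block: dict) -> dict:
    # Hash-only view of a block, whether it was stored as hashes or in full.
    hashes = [tx["hash"] if isinstance(tx, dict) else tx
              for tx in block["transactions"]]
    return {"number": block["number"], "transactions": hashes}


def read_block(cache: dict, block_num: int, hashes_only: bool,
               fetch_full: Callable[[int], dict]) -> dict:
    """Read a block, touching the node at most once per call.

    fetch_full is an assumed callback that returns the block with full
    transactions in a single node query.
    """
    cached = cache.get(block_num)
    if cached is not None and (hashes_only or has_full_txs(cached)):
        # The cache can answer: hashes are always derivable from an entry.
        return to_hashes(cached) if hashes_only else cached

    # Cache miss, or a hash-only entry when full detail is wanted:
    # query the node once and store at the requested level of detail.
    full = fetch_full(block_num)
    cache[block_num] = to_hashes(full) if hashes_only else full
    return cache[block_num]
```

In this sketch a hash-only entry is overwritten with the full block the first time a caller asks for full detail, so later reads of either kind are served from the cache. It deliberately leaves out the separate transactions cache, which is where the duplication concerns above come in.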