erigon icon indicating copy to clipboard operation
erigon copied to clipboard

[wip] E35 replace values in commitment

Open awskii opened this issue 1 year ago • 3 comments

During Domain mergeFiles we call ValueTransform function for commitment branch values to replace encoded account and storage keys with references.

References v1

Consists of 2bytes which encodes three lengths of three uint64 encoded numbers (each number takes 1-8 byte to represent and we can encode number 0-7 into 3 bits each, just to avoid overflowing added 1 byte to encoding) in following format: {lengths}{startTxNum}{endTxNum}{offset}. Minimal key would take 5 bytes while largest could be even bigger than account key (26 bytes!) which makes replacement inefficient for accounts.

I wrote it like this to avoid additional division and multiplication on each encode/decode and to each key specifically points to some offset in exact file. To reduce reference length we could employ following approaches: - store startTxNum + delta instead of endTxNum which is always > startTxNum so reduce second uint64 encoding up to len(enc(strartTx))/2 - instead txNum store step which brings multiplication and division on stepSize to match given files list for each shortened key. Since this feature already trades CPU time for Storage could be useful as well

References v2

Encodes offset in file into 1-8 bytes. v1-commitment.96-112.kv has replaced keys in {account,storage}.96-112.kv OR plain keys. No other references allowed, so we can safely delete files and be sure that replacements are in similar step files.

Aggregator have new field commitmentValuesTransform which defines if value transformation is allowed. Allowed transformation means that

  • during each merge of files we locking and unlocking files from deletion during Domains.reCalcRoFiles on open/close/integrateFiles/integrateMergedFiles.
  • domain which has replaceKeysInValues calls valueTransform during merge.
  • during read of Commitment domain with enabled replaceKeysInValues we do backward replacement to full keys to allow HPH work as before

This feature is domain[Commitment] specific because there we understand what we can replace correctly, while for others it does not make sense. So current implementation is not extendable yet to all domains (at least need back-transform during read)

awskii avatar Feb 28 '24 16:02 awskii

Did run this branch on sepolia + small step size. Did several kill -9 in the middle of merge. Got:

[WARN] [03-12|07:49:46.507] [7/15 Execution] Execution failed        block=2250026 txNum=7925439 hash=0xc3b148ce2edf9f8f18158c1644e29eff7212f6ebd4dc2a1e47103f60e1f1fabe err="invalid block, gas used by execution: 247037, in header: 253737, headerNum=2250026, c3b148ce2edf9f8f18158c1644e29eff7212f6ebd4dc2a1e47103f60e1f1fabe

my guess: DomainDelPrefix doesn't expect to see overlaps in context files

do you see any option to not have overlaps in roFiles? (it maybe will need to be supported in al iterators).

AskAlexSharov avatar Mar 12 '24 07:03 AskAlexSharov

added integration run_migrations --squeeze to run file migration on existing files. Tested few times following scenario:

  • execute some blocks (up to 100+ steps) without replacement
  • did integration run_migrations --squeeze --datadir <dir> to perform commitment file "squeezing"
  • continue execution with integration stage_exec --datadir <dir>

Seems like no problems so far.

commitment files before/after value replacements:

image

awskii avatar Mar 28 '24 20:03 awskii