Erigon 3: Purified states representation

Open Giulio2002 opened this issue 1 year ago • 4 comments

Purification of states works by pruning dangling nodes from historical states.

The process is to collect all keys into a temporary MDBX with a mapping node -> layer. Then we iterate over each node again and remove every node whose node -> layer entry does not match the layer it sits in within the historical states DB.
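The two passes described above can be sketched roughly as follows. This is a minimal illustration, not Erigon's actual code: the `entry` type and `purify` function are hypothetical, an in-memory map stands in for the temporary MDBX table, and it assumes that "not matching" means a newer layer holds the same key.

```go
package main

import "fmt"

// entry is one key/value pair stored in a layer (hypothetical type).
type entry struct {
	key   string
	value string
}

// purify takes layers ordered oldest to newest and returns copies with
// dangling entries removed. Assumption: an entry is dangling when a newer
// layer contains the same key.
func purify(layers [][]entry) [][]entry {
	// Pass 1: record key -> index of the newest layer containing it.
	// The in-memory map stands in for the temporary MDBX table.
	owner := make(map[string]int)
	for i, layer := range layers {
		for _, e := range layer {
			owner[e.key] = i
		}
	}

	// Pass 2: keep only entries whose recorded layer is the current one.
	out := make([][]entry, len(layers))
	for i, layer := range layers {
		for _, e := range layer {
			if owner[e.key] == i {
				out[i] = append(out[i], e)
			}
		}
	}
	return out
}

func main() {
	layers := [][]entry{
		{{"a", "1"}, {"b", "1"}}, // older layer
		{{"a", "2"}},             // newer layer overrides "a"
	}
	for i, layer := range purify(layers) {
		fmt.Println(i, layer)
	}
}
```

Here the older copy of `a` is dropped because the newer layer owns that key, while `b` stays where it is.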

Giulio2002 avatar Dec 24 '24 12:12 Giulio2002

If we do purification manually, we will lose it after the next re-sync. For example, I am now re-syncing gnosis (executing all blocks from 0) to generate a new ReceiptDomain (because there was a bug). Also, during manual purification you made 2 manual mistakes. So, if it gives us an important benefit, then maybe we need to make it a first-class-citizen feature (with unit tests, integrity checks, etc.)?

Yes, this is correct, unfortunately. If you regen, then it must be redone. I think it works now; I can just make it run on the datadir and remove the manual part. It is not that hard. Regarding devops complaining: I think it is fine, as I plan on having this done only once every 2 years. Also, the integrity check is basically computing the commitment, which we already have.

Giulio2002 avatar Dec 26 '24 10:12 Giulio2002

Update: now it is automatic. You run the command on the datadir with the flag --replace-in-datadir and it does the conversion automatically, with nothing manual.

Giulio2002 avatar Dec 26 '24 11:12 Giulio2002

"I plan on having this done only once every 2 years": the main problem with such commands is that nobody knows they exist. For example, we have the "skip jump-dest analysis" feature in core/skip_analysis.go: there was a CLI command and a release.md doc with a "run this command" step. Now I can't find release.md and can't find this command. The same will happen with your command: no comments, you didn't add a step to release.md, etc.

Aha, while I was writing this comment I accidentally found RELEASE_INSTRUCTIONS.md, and it has a section on state checkChangeSets. A few months ago I removed the state checkChangeSets command because it was not compatible with E3 and we have the erigon snapshots integrity command. Now I find that this command is part of the release process, and nobody complained about it.

:-)

AskAlexSharov avatar Dec 27 '24 03:12 AskAlexSharov

I can do some tweaks so that it can be added as part of the automation.

The main modification will be adding a check on L0 to see its last modification date. If it is >3 months old, then we purify. How does that sound? Naturally, if we regen snapshots, then we need to force it through. Another approach (perhaps better) is to require a minimum skip ratio for L0, say >10%.

Also forgot to specify benefits:

ethmainnet: -73 GB
polygon: -147 GB

Giulio2002 avatar Dec 27 '24 11:12 Giulio2002

Putting your questions and my answers from Discord here:

  • Do you see any way to make it more deterministic?

Not really.

  • Do you see any way to make it less prone to human mistakes?

I made it so: it is fully automatic and you just need to run purify.

  • Which domains do you plan to purify? Is commitment alone enough? Or all that pass min-skip-ratio-l0? Only L0? Do we always purify only L0?

We purify L0, L1, L2, ... L0 has the biggest benefit, and commitment is enough.

  • We need to add a step to release.md or we will forget the purify command.

Yes, I did that

  • This line feels a bit confusing: StringVar(&purifyDir, "purifiedDomain", "purified-output", "")

That is the output dir, which is only relevant with --replace-in-datadir=false.
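One way to make that line less confusing is to fill in the empty usage string so the dependency between the two flags is visible in the help output. The sketch below uses Go's standard flag package rather than whatever CLI library the real command uses; the `purifyOpts` type, the `parsePurifyFlags` helper, the usage texts, and the default of false for --replace-in-datadir are all assumptions for illustration.

```go
package main

import (
	"flag"
	"fmt"
)

// purifyOpts holds the two options discussed in the thread (hypothetical type).
type purifyOpts struct {
	replaceInDatadir bool
	outDir           string
}

// parsePurifyFlags parses the flags with descriptive usage strings.
func parsePurifyFlags(args []string) (purifyOpts, error) {
	var o purifyOpts
	fs := flag.NewFlagSet("purify", flag.ContinueOnError)
	// Assumption: the default is false; the thread does not state it.
	fs.BoolVar(&o.replaceInDatadir, "replace-in-datadir", false,
		"replace files inside the datadir instead of writing to a separate directory")
	fs.StringVar(&o.outDir, "purifiedDomain", "purified-output",
		"output directory; only used when --replace-in-datadir=false")
	err := fs.Parse(args)
	return o, err
}

func main() {
	o, err := parsePurifyFlags([]string{"--replace-in-datadir"})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("replace in datadir: %v, output dir: %q\n", o.replaceInDatadir, o.outDir)
}
```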

  • If we release purified domains, we will have (almost) no backup of the non-purified .kv files in case we need to roll back. Let's explicitly back up our R2 buckets before releasing purified files, and maybe put a meaningful label on this backup (devops have a button for this).

I think this is not so important, as functionally they are the same.

  • Then I will try running the purify command on the ethmainnet snapshotter.

Did it for you

Giulio2002 avatar Jan 04 '25 21:01 Giulio2002

“Did it for you”: ah, I didn’t try. Left it for you.

AskAlexSharov avatar Jan 06 '25 11:01 AskAlexSharov