solana
Support zstd genesis archives
Taking over https://github.com/solana-labs/solana/pull/33614 from @ripatel-fd. We exchanged some DMs and agreed I could take this off his hands and drive it to completion.
Problem
- BZip2 is deprecated in most of the Solana protocol.
- Genesis archives are one of the few places where BZip2 is still the main compression algorithm.
- solana-test-validator creates a tar subprocess when creating new ledgers.
Summary of Changes
- Allows the validator to discover and load genesis.tar.zst archives (continues to support genesis.tar.bz2)
- Uses the tar crate instead of a subprocess to create new genesis archives.
- Uses genesis.tar.zst when creating new ledgers
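As a rough illustration of the discovery step, the sketch below looks for `genesis.tar.zst` first and falls back to `genesis.tar.bz2`. It is a minimal sketch, not the code in this PR: `GenesisArchiveFormat` and `find_genesis_archive` are hypothetical names, and the preference order is an assumption.

```rust
use std::path::{Path, PathBuf};

/// Compression formats a genesis archive may use (hypothetical type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum GenesisArchiveFormat {
    Zstd,
    Bzip2,
}

/// Candidate archive names in order of preference: zstd first, bz2 as fallback.
/// The ordering is an assumption of this sketch.
const CANDIDATES: [(&str, GenesisArchiveFormat); 2] = [
    ("genesis.tar.zst", GenesisArchiveFormat::Zstd),
    ("genesis.tar.bz2", GenesisArchiveFormat::Bzip2),
];

/// Returns the first genesis archive found in `ledger_dir`, if any.
pub fn find_genesis_archive(ledger_dir: &Path) -> Option<(PathBuf, GenesisArchiveFormat)> {
    CANDIDATES.iter().find_map(|(name, format)| {
        let path = ledger_dir.join(name);
        // Only accept regular files; a missing or non-file entry is skipped.
        path.is_file().then(|| (path, *format))
    })
}
```

A caller would pass the ledger directory and pick the decoder based on the returned format.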
The force pushes were to rebase onto the tip of master in order to run against the latest changes.
do we intend to migrate the public clusters to zstd genesis? if so, how are we going to bootstrap?
I don't think we have to necessarily; any reason not to continue to support both in tandem? If we did want to fully migrate to zstd / phase out bz2, I think that'd look something like:
- Get mnb to a version where the client can serve zstd (whichever branch this lands in, call it v1.18)
- Make a change for clients to start requesting zstd (v1.19)
- Stop serving bz2 in v1.20
I'm not sure we'd want to change this value tho at the risk of breaking things outside of our codebase that might have it hardcoded: https://github.com/solana-labs/solana/blob/6a5b8e86f3c492a4984e780591cc97a027a59a8a/sdk/src/genesis_config.rs#L38
CI failed with localnet. Namely, the non-bootstrap node failed to get genesis:
[2023-11-14T23:34:37.828034903Z WARN solana_validator::bootstrap]
Failed to load genesis config: Unable to open "/solana/config/validator/genesis.bin": Os { code: 2, kind: NotFound, message: "No such file or directory" }
It is requesting the old archive as noted here: https://github.com/solana-labs/solana/blob/5658d6ee5bcf132b60f94857624f92cb2239706e/genesis-utils/src/lib.rs#L53-L62
On the bootstrap node, I can see a log that it attempted to serve the old file format but that didn't yield an actual transfer:
[2023-11-14T23:34:37.823021949Z INFO solana_rpc::rpc_service]
get /genesis.tar.bz2 -> "/solana/config/bootstrap-validator/genesis.tar.bz2" (0 bytes)
I thought there was some code somewhere that created the archive if you didn't have it; either I'm misremembering or that code is not getting hit for some reason. Will dig in further
don't think this is the case. only place i see a bz2 encoder is snapshot and bigtable
Also took a look and I believe you're correct; skimming the logic, it looks like a node run WITHOUT --no-genesis-fetch will still download the archive, even if it has a genesis.bin locally. That might be what I'm remembering, and we should probably adjust the logic to check for genesis.bin OR genesis archive before issuing a request for another one (seemingly a separate PR for that).
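The adjusted check could look roughly like the sketch below. It assumes that any of `genesis.bin`, `genesis.tar.zst`, or `genesis.tar.bz2` in the ledger directory is enough to skip the download. `should_fetch_genesis` is a hypothetical helper, not an existing function in the validator, and the `no_genesis_fetch` parameter simply mirrors the `--no-genesis-fetch` flag.

```rust
use std::path::Path;

/// File names that indicate genesis is already available locally.
/// Treating any one of them as sufficient is an assumption of this sketch.
const LOCAL_GENESIS_FILES: [&str; 3] = ["genesis.bin", "genesis.tar.zst", "genesis.tar.bz2"];

/// Hypothetical helper: decide whether a genesis download should be issued.
pub fn should_fetch_genesis(ledger_dir: &Path, no_genesis_fetch: bool) -> bool {
    if no_genesis_fetch {
        // The operator explicitly disabled fetching.
        return false;
    }
    // Fetch only when no local genesis file or archive exists.
    !LOCAL_GENESIS_FILES
        .iter()
        .any(|name| ledger_dir.join(name).is_file())
}
```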
That presents two problems:
1. How do we know which format to request?
2. Given that my previous assumption was incorrect, your comment about how we bootstrap this is relevant again.
For 1., I think we could have simple logic to request one first (zstd), and if that doesn't yield a file, then fall back to bz2.
For 2., I think we could have nodes that are run with RPC enabled create the genesis.tar.zst archive at startup so they have both.
Thoughts?
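To make the proposal for 1. concrete, here is a minimal sketch of the "request zstd first, then fall back to bz2" logic. The `fetch` closure is hypothetical and stands in for whatever actually downloads a file from the RPC node. Treating an empty body as a miss is an assumption, motivated by the 0-byte response in the CI log above.

```rust
/// Archive names to request, in order of preference.
const ARCHIVE_NAMES: [&str; 2] = ["genesis.tar.zst", "genesis.tar.bz2"];

/// Tries each archive name in turn using the caller-supplied `fetch` function.
/// Returns the name that succeeded together with the downloaded bytes.
pub fn fetch_genesis_archive<F>(mut fetch: F) -> Option<(&'static str, Vec<u8>)>
where
    F: FnMut(&str) -> Option<Vec<u8>>,
{
    for name in ARCHIVE_NAMES {
        match fetch(name) {
            // A non-empty body counts as a successful download.
            Some(bytes) if !bytes.is_empty() => return Some((name, bytes)),
            // Missing file or empty body: try the next format.
            _ => continue,
        }
    }
    None
}
```

Because the downloader is passed in as a closure, the fallback order can be exercised without a network.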
i think to keep this pr small, let's just support reading zstd compressed genesis archives. we can add generation to solana-genesis in a followup. then worry about the if/how to migrate existing bz2 clusters afterwards
Works for me
I'll reopen this in Agave; has been on the backburner for quite a while now but should be quick to push it through