Reduce Zebra disk usage for mining pools

Open teor2345 opened this issue 1 year ago • 0 comments

Motivation

Some mining pools have asked us to reduce Zebra's disk usage.

Alternative Designs

Here are some different things we could try, in rough order of effort/disruptiveness:

  1. Stop storing duplicate state data

    • #4784
    • Change the finalized sprout_note_commitment_tree to a key of () and a value of sprout::Root. Look up the actual note commitment tree in sprout_anchors.
  2. Improve database compression using:

    • a different level 0 compression algorithm, like zstd
    • the maximum compression rate
    • this probably doesn't need a state version change, but:
      • old states will have less compression, and
      • old versions of Zebra might not be able to open new states, if they don't have all the algorithms we're using
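The first option (keeping only a `sprout::Root` under the `()` key and reading the tree itself from `sprout_anchors`) could look roughly like the toy model below. It uses in-memory maps rather than the real database, and `Root`, `Tree`, `FinalizedState`, `insert_tip`, and `tip_tree` are hypothetical stand-ins, not Zebra's actual types or methods. It also assumes `sprout_anchors` is keyed by root and stores the full tree.

```rust
use std::collections::HashMap;

/// Stand-in for `sprout::Root` (assumed here to be a 32-byte value).
type Root = [u8; 32];

/// Stand-in for a serialized Sprout note commitment tree.
#[derive(Clone, Debug, PartialEq)]
struct Tree(Vec<u8>);

/// Toy in-memory model of the two column families discussed above.
struct FinalizedState {
    /// Models `sprout_anchors`: every finalized tree, keyed by its root.
    sprout_anchors: HashMap<Root, Tree>,
    /// Models `sprout_note_commitment_tree`: key `()`, value is only the root.
    sprout_tip_root: Option<Root>,
}

impl FinalizedState {
    fn new() -> Self {
        FinalizedState {
            sprout_anchors: HashMap::new(),
            sprout_tip_root: None,
        }
    }

    /// Stores the tree once (under its root) and records the root as the tip.
    fn insert_tip(&mut self, root: Root, tree: Tree) {
        self.sprout_anchors.insert(root, tree);
        self.sprout_tip_root = Some(root);
    }

    /// Reads the tip tree with two lookups: tip root first, then the anchor map.
    fn tip_tree(&self) -> Option<&Tree> {
        let root = self.sprout_tip_root.as_ref()?;
        self.sprout_anchors.get(root)
    }
}
```

The trade-off in this sketch is one extra lookup when reading the tip tree, in exchange for not storing the tip tree twice.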

We might want to delay this work until after the audit, because it could change a lot of code:

  1. Add a config to Zebra that doesn't create unused indexes:

    • delete balance_by_transparent_addr
    • delete tx_loc_by_transparent_addr_loc
    • delete utxo_loc_by_transparent_addr_loc
    • delete sprout_note_commitment_tree lower than the finalized tip
    • delete sapling_note_commitment_tree lower than the finalized tip
    • delete orchard_note_commitment_tree lower than the finalized tip
    • delete history_tree lower than the finalized tip
    • This will cause errors in RPCs that use these indexes, but that's ok if they aren't called
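A minimal sketch of how such a config could gate index creation, covering only the three transparent address indexes from the list. `StateConfig`, `skip_address_indexes`, `column_families_to_create`, and `index_available` are hypothetical names, and `CORE_COLUMN_FAMILIES` is an illustrative subset, not Zebra's full column family list.

```rust
/// Hypothetical config option; not an existing Zebra setting.
#[derive(Clone, Copy, Debug, Default)]
struct StateConfig {
    /// When true, the transparent address indexes are never created.
    skip_address_indexes: bool,
}

/// Indexes named in the list above that mining pools might not need.
const ADDRESS_INDEXES: [&str; 3] = [
    "balance_by_transparent_addr",
    "tx_loc_by_transparent_addr_loc",
    "utxo_loc_by_transparent_addr_loc",
];

/// Illustrative subset of column families that are always created.
const CORE_COLUMN_FAMILIES: [&str; 2] = ["block_header_by_height", "tx_by_loc"];

/// Returns the column family names to create for this config.
fn column_families_to_create(config: StateConfig) -> Vec<&'static str> {
    let mut names = CORE_COLUMN_FAMILIES.to_vec();
    if !config.skip_address_indexes {
        names.extend_from_slice(&ADDRESS_INDEXES);
    }
    names
}

/// RPC handlers could check this and return an error for skipped indexes.
fn index_available(config: StateConfig, name: &str) -> bool {
    column_families_to_create(config)
        .iter()
        .any(|created| *created == name)
}
```

With `skip_address_indexes: true`, `index_available` returns `false` for `balance_by_transparent_addr`, so an RPC that needs that index could fail early with a clear error instead of reading a missing column family.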
  2. Add a config to Zebra that deletes blocks below (finalized tip height minus how far we look back to check for legacy chains):

    • block_header_by_height
    • tx_by_loc
    • maybe hash_by_height
    • maybe height_by_hash
    • maybe hash_by_tx_loc
    • maybe tx_loc_by_hash
    • This could cause a lot of errors, so we should try a quick and dirty implementation first
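The cutoff for block deletion can be sketched as a small height calculation, reading the item above as "finalized tip height minus the legacy chain lookback". `prunable_heights` and both parameters are hypothetical, the lookback value is whatever distance Zebra uses for its legacy chain check, and whether the genesis block should be kept is left open here.

```rust
use std::ops::Range;

/// Returns the block heights that could be deleted: everything strictly below
/// `finalized_tip - legacy_lookback`. If the chain is shorter than the
/// lookback, the subtraction saturates at zero and the range is empty.
fn prunable_heights(finalized_tip: u32, legacy_lookback: u32) -> Range<u32> {
    0..finalized_tip.saturating_sub(legacy_lookback)
}
```

For example, with a finalized tip at height 100 and a lookback of 10 blocks, heights 0 to 89 are prunable and heights 90 to 100 are kept for the legacy chain check.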

teor2345, Nov 25 '22 02:11