
Add CLI tool to fix persistent frontier's root discrepancy

Open georgeee opened this issue 3 months ago • 0 comments

Problem

Frontier persistence is fundamentally broken.

The persistent frontier is managed by an on-disk database coupled with a file containing the latest root hash. The root file is updated immediately after the in-memory frontier switches from one root to another, whereas the transitions are stored on disk (in the transitions database) with a delay of up to 9 blocks.

This setup allows for better efficiency, but if the Mina node starts and finds that the root in the file differs from the root in the database, it treats the whole on-disk persistence as corrupt and removes the frontier entirely. This discrepancy between the root file and the transition database can occur easily, e.g. when the Mina node exits abruptly (it is unclear whether it also occurs when the node is shut down with a stop command sent via the client).
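
The way the two on-disk artifacts drift apart can be illustrated with a toy model. This is only a sketch under stated assumptions: the `disk` record and the `move_root`, `flush` and `consistent` functions are hypothetical names invented for illustration and are not part of the Mina codebase.

```ocaml
(* Toy model of the two on-disk artifacts. Hypothetical names, not Mina's types. *)
type disk = {
  mutable root_file : string;     (* root hash in the root file, updated on every root move *)
  mutable db_root : string;       (* root hash in the transitions database, updated on flush *)
  mutable pending : string list;  (* root moves not yet written to the database, newest first *)
}

(* A root move updates the root file immediately and only queues the database write. *)
let move_root disk new_root =
  disk.root_file <- new_root;
  disk.pending <- new_root :: disk.pending

(* The delayed write brings the database up to the newest queued root. *)
let flush disk =
  match disk.pending with
  | [] -> ()
  | newest :: _ ->
      disk.db_root <- newest;
      disk.pending <- []

(* The startup check described above: both roots must agree. *)
let consistent disk = String.equal disk.root_file disk.db_root

let () =
  let disk = { root_file = "A"; db_root = "A"; pending = [] } in
  move_root disk "B";
  (* An abrupt exit at this point leaves the root file ahead of the database. *)
  assert (not (consistent disk));
  flush disk;
  assert (consistent disk)
```

In this model, any exit between `move_root` and `flush` leaves a state that the startup check rejects.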

Solution

The tool introduced in this PR fixes the state of the persistent frontier's database by removing transitions from the database until the root read from the file becomes the root of the database as well. Removal is performed by executing part of the regular root-moving routine.
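
The idea can be sketched on a toy database. The sketch assumes that the root in the file is a descendant of the database root (the database lags behind) and that each stored transition records its parent hash. The `db` type and the `path_from_root`, `move_root` and `fix` functions are hypothetical and only illustrate the approach; they are not the tool's actual implementation.

```ocaml
module SMap = Map.Make (String)

(* Hypothetical toy database: each non-root transition hash maps to its parent hash. *)
type db = { root : string; parents : string SMap.t }

(* Hashes on the path from the database root (exclusive) to [target] (inclusive),
   oldest first. None if [target] does not descend from the database root. *)
let path_from_root db target =
  let rec go hash acc =
    if String.equal hash db.root then Some acc
    else
      match SMap.find_opt hash db.parents with
      | None -> None
      | Some parent -> go parent (hash :: acc)
  in
  go target []

(* True if [hash] is [ancestor] or descends from it. *)
let rec descends_from db ~ancestor hash =
  String.equal hash ancestor
  || (match SMap.find_opt hash db.parents with
      | None -> false
      | Some parent -> descends_from db ~ancestor parent)

(* One root move: promote [new_root] and drop the old root together with
   every transition that does not descend from [new_root]. *)
let move_root db new_root =
  let keep hash _parent =
    (not (String.equal hash new_root)) && descends_from db ~ancestor:new_root hash
  in
  { root = new_root; parents = SMap.filter keep db.parents }

(* Repeat root moves until the database root equals the root read from the file. *)
let fix db ~file_root =
  match path_from_root db file_root with
  | None -> Error "root from file is not reachable from the database root"
  | Some path -> Ok (List.fold_left move_root db path)

let () =
  (* Chain A <- B <- C <- D with a side branch B <- X; the database root is A. *)
  let parents =
    SMap.of_seq (List.to_seq [ ("B", "A"); ("C", "B"); ("D", "C"); ("X", "B") ])
  in
  match fix { root = "A"; parents } ~file_root:"C" with
  | Ok fixed -> assert (String.equal fixed.root "C")
  | Error msg -> failwith msg
```

Moving the root one step at a time, rather than jumping straight to the target, mirrors the description above of reusing part of the regular root-moving routine. If the root from the file cannot be reached from the database root, the sketch reports an error instead of modifying anything.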

In the future we must consider embedding this recovery mechanism into the Mina node itself, but as a first step it is useful to deliver it wrapped in a CLI tool.

Commit structure

The PR contains a few straightforward commits exposing some functions across the codebase:

  • Expose stable's header in validated_block.ml
  • Add copy_dir function to stdlib
  • Expose get_root_hash from persistent frontier
  • Expose stable's transition fun from root data
  • Expose protocol_states_for_root_scan_state in transition frontier

The final commit introduces the new tool:

  • Add mina advanced fix-persistent-frontier

Testing

Explain how you tested your changes:

  • [x] Test the tool on a frontier with 1-block discrepancy
  • [ ] Test the tool on a frontier with more than 1-block discrepancy
    • Confirm node is able to start after loading such frontier

Checklist

  • [x] Dependency versions are unchanged
    • Notify Velocity team if dependencies must change in CI
  • [x] Modified the current draft of release notes with details on what is completed or incomplete within this project
  • [x] Document code purpose, how to use it
    • Mention expected invariants, implicit constraints
  • [x] Tests were added for the new behavior
    • Document test purpose, significance of failures
    • Test names should reflect their purpose
  • [x] All tests pass (CI will check this if you didn't)
  • [x] Serialized types are in stable-versioned modules
  • [x] Does this close issues? None

georgeee, Nov 14 '25 20:11