
feat: blocks backup / restore

Open n0izn0iz opened this issue 9 months ago • 20 comments

Fixes #1827

Description

  • Adds a gRPC service called Backup to the tendermint2 node that streams blocks efficiently. It has a single method, StreamBlocks, that takes a start and an end height. If the end height is 0, it streams up to the latest height. The service is disabled by default and must be enabled in config.toml.
  • Adds a contribs binary named tm2backup that pulls blocks from the backup service and stores them in compressed files of 100 blocks each. It takes a start and an end height and supports resuming. The tar format was chosen to bundle blocks because it is widely supported and efficient. The zstandard format was chosen for compression because it is fast, has a good compression ratio, and is widely supported.
  • Adds a restore subcommand to the gnoland binary that replays blocks from a backup. It takes the options of the start subcommand, plus the backup directory and an optional end height. It starts at the current node height + 1.
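
To make the height handling above concrete, here is a minimal sketch in Go. It is not the code from this PR: resolveEnd, splitChunks and heightRange are hypothetical names, and the assumption that each 100-block file starts right after the previous one (rather than being aligned to multiples of 100) is mine.

```go
package main

import "fmt"

// chunkSize mirrors the "100 blocks per file" figure from the description.
const chunkSize = 100

// heightRange is an inclusive range of block heights (hypothetical type).
type heightRange struct {
	Start, End int64
}

// resolveEnd applies the rule described for StreamBlocks: an end height
// of 0 means "up to the latest height". Any other value is kept as-is.
func resolveEnd(end, latest int64) int64 {
	if end == 0 {
		return latest
	}
	return end
}

// splitChunks cuts [start, end] into consecutive ranges of at most
// chunkSize blocks. The alignment used here is an assumption.
func splitChunks(start, end int64) []heightRange {
	var chunks []heightRange
	for s := start; s <= end; s += chunkSize {
		e := s + chunkSize - 1
		if e > end {
			e = end
		}
		chunks = append(chunks, heightRange{Start: s, End: e})
	}
	return chunks
}

func main() {
	// End height 0 with a latest height of 250 resolves to 250.
	end := resolveEnd(0, 250)
	for _, c := range splitChunks(1, end) {
		fmt.Printf("chunk %d-%d\n", c.Start, c.End)
	}
}
```

With a start height of 1 and a latest height of 250, this yields three ranges: 1-100, 101-200 and 201-250.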

The restore command can only restore up to backupEndHeight-1 because I did not figure out a way to commit block n without block n+1. I'd gladly take ideas on how to do that.
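
The resulting replay window can be sketched as follows. This is an illustration under assumptions, not the PR's implementation: restoreRange is a hypothetical helper, an end height of 0 is assumed to mean "not set", and returning an error for an empty window is my choice.

```go
package main

import (
	"errors"
	"fmt"
)

// restoreRange returns the first and last heights a restore would replay.
// nodeHeight is the node's current height, backupEnd is the last height
// present in the backup, and endHeight is the optional end height
// (0 is assumed to mean "not set").
func restoreRange(nodeHeight, backupEnd, endHeight int64) (first, last int64, err error) {
	// Restore starts at the current node height + 1.
	first = nodeHeight + 1
	// Block n cannot be committed without block n+1, so the last
	// restorable height is backupEnd-1.
	last = backupEnd - 1
	if endHeight != 0 && endHeight < last {
		last = endHeight
	}
	if first > last {
		return 0, 0, errors.New("nothing to restore")
	}
	return first, last, nil
}

func main() {
	first, last, err := restoreRange(0, 1000, 0)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("replaying blocks %d to %d\n", first, last)
}
```

For a fresh node (height 0) and a backup ending at height 1000, the window is 1 to 999.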

The backup is fast enough for now IMO (< 20 min for test5 on my MacBook) but can be optimized because it's not parallelized. The restore bottleneck currently seems to be the gnovm, but I would need to profile to be sure.

How to create a backup

  • Enable the backup service in your node's config.toml
    [backup]
    
    laddr = "localhost:4242"
    
  • (Re-)Start your node
  • Run the tm2backup command
    cd contribs/tm2backup
    tm2backup -o blocks-backup -remote http://localhost:4242
    
    Example output: [Screenshot 2025-03-14 at 22 50 29] (...) [Screenshot 2025-03-14 at 22 51 27]

How to create a node from a backup

  • Get the genesis file, for example:
    wget https://example.com/genesis.json
    
  • Run the restore command
    gnoland restore --lazy --backup-dir ../contribs/tm2backup/blocks-backup
    
  • Start your node
    gnoland start
    

TODO

  • [x] blocks streaming grpc service
  • [x] 100 blocks files generation command
  • [x] restore command
  • [ ] ~~find a way to restore up to backup height if possible~~
  • [x] tests

n0izn0iz avatar Mar 14 '25 20:03 n0izn0iz

🛠 PR Checks Summary

🔴 Pending initial approval by a review team member, or review from tech-staff

Manual Checks (for Reviewers):
  • [ ] IGNORE the bot requirements for this PR (force green CI check)

🤖 This bot helps streamline PR reviews by verifying automated checks and providing guidance for contributors and reviewers.

✅ Automated Checks (for Contributors):

🟢 Maintainers must be able to edit this pull request (more info)
🔴 Pending initial approval by a review team member, or review from tech-staff

☑️ Contributor Actions:
  1. Fix any issues flagged by automated checks.
  2. Follow the Contributor Checklist to ensure your PR is ready for review.
    • Add new tests, or document why they are unnecessary.
    • Provide clear examples/screenshots, if necessary.
    • Update documentation, if required.
    • Ensure no breaking changes, or include BREAKING CHANGE notes.
    • Link related issues/PRs, where applicable.
☑️ Reviewer Actions:
  1. Complete manual checks for the PR, including the guidelines and additional checks if applicable.
📚 Resources:
Debug
Automated Checks
Maintainers must be able to edit this pull request (more info)

If

🟢 Condition met
└── 🟢 And
    ├── 🟢 The base branch matches this pattern: ^master$
    └── 🟢 The pull request was created from a fork (head branch repo: n0izn0iz/gno)

Then

🟢 Requirement satisfied
└── 🟢 Maintainer can modify this pull request

Pending initial approval by a review team member, or review from tech-staff

If

🟢 Condition met
└── 🟢 And
    ├── 🟢 The base branch matches this pattern: ^master$
    └── 🟢 Not (🔴 Pull request author is a member of the team: tech-staff)

Then

🔴 Requirement not satisfied
└── 🔴 If
    ├── 🔴 Condition
    │   └── 🔴 Or
    │       ├── 🔴 At least one of these user(s) reviewed the pull request: [jefft0 leohhhn n0izn0iz notJoon omarsy x1unix] (with state "APPROVED")
    │       ├── 🔴 At least 1 user(s) of the team tech-staff reviewed pull request
    │       └── 🔴 This pull request is a draft
    └── 🔴 Else
        └── 🔴 And
            ├── 🟢 This label is applied to pull request: review/triage-pending
            └── 🔴 On no pull request

Manual Checks
**IGNORE** the bot requirements for this PR (force green CI check)

If

🟢 Condition met
└── 🟢 On every pull request

Can be checked by

  • Any user with comment edit permission

Gno2D2 avatar Mar 14 '25 20:03 Gno2D2

It seems codecov is not taking into account the reactor coverage done by the restore test. It also ignores the tm2/backup code (it is not marked as changed). Tell me how you want me to proceed with this; IMO the actual coverage is good.

n0izn0iz avatar Apr 07 '25 13:04 n0izn0iz

> seems codecov is not taking into account the reactor coverage done by the restore test

Can you make a small unit test - while also making a comment that a large quantity of tests are actually in tm2backup (or whatever they are)?

Code may get re-organised over time or copied into other repos - so it's good to have some testing next to the functions to test, even though there's other good usage being tested elsewhere

thehowl avatar Apr 17 '25 08:04 thehowl

Hi @n0izn0iz . Do you have a response to @thehowl about codecov?

jefft0 avatar Apr 29 '25 11:04 jefft0

I'm on it, sorry I forgot to address it

n0izn0iz avatar Apr 29 '25 12:04 n0izn0iz

Done. Not sure why I get "crossing" errors since I did not update the branch; it seems some tests are pulling the master branch or something.

n0izn0iz avatar Apr 29 '25 13:04 n0izn0iz

We can merge master after the CI checks are fixed in PR https://github.com/gnolang/gno/pull/4228 .

jefft0 avatar Apr 29 '25 14:04 jefft0

Hi @n0izn0iz . Your branch now includes the latest CI fixes in master. But there are still many failing CI checks in this PR. Are they because of your changes?

jefft0 avatar May 13 '25 14:05 jefft0

probably, I need to look into it, I'll do it today

n0izn0iz avatar May 13 '25 17:05 n0izn0iz

ready for review again

n0izn0iz avatar May 17 '25 20:05 n0izn0iz

Hi @n0izn0iz . Is the failed CI check "Portal Loop" related to this PR?

jefft0 avatar May 21 '25 15:05 jefft0

@jefft0 it was present on master, hopefully rebasing will pass it

n0izn0iz avatar May 21 '25 16:05 n0izn0iz

yep, that did the trick

n0izn0iz avatar May 21 '25 16:05 n0izn0iz

Hi @n0izn0iz . A recent commit in the master branch fixed many CI checks. Can you try "Update branch" again?

jefft0 avatar May 28 '25 14:05 jefft0

@jefft0 done

n0izn0iz avatar Jun 11 '25 11:06 n0izn0iz

Hi @n0izn0iz . The CI checks in master are fixed now. Do you want to merge master and run the tests again?

jefft0 avatar Jul 24 '25 08:07 jefft0

It was up to date, and the failing check is a flaky one, I think. The gnokms merge from yesterday introduced conflicts that I will fix asap.

n0izn0iz avatar Jul 24 '25 09:07 n0izn0iz