zebra icon indicating copy to clipboard operation
zebra copied to clipboard

Improve VerifyBlockError::Commit typing, so we don't accidentally break syncer error handling

Open conradoplg opened this issue 2 years ago • 7 comments

Motivation

VerifyBlockError::Commit is a wrapper over a BoxError. However, in https://github.com/ZcashFoundation/zebra/pull/2890/ a check was added in the should_restart_sync function to ignore Commit errors corresponding to "block is already committed to the state" when deciding whether to reset the sync procedure. That check is fragile since it will break if the string changes.

Change Commit to wrap a specific error type enumeration, and create a specific item for that particular error to be used for matching and filtering it in should_restart_sync.

Currently, this check is implemented by the is_duplicate_request() method, which can't use Commit because it is a BoxError.

There's also a BlockDownloadVerifyError::Invalid error that comes from the chain verifier and a BlockDownloadVerifyError::DownloadFailed from the network service.

We should downcast all BoxErrors in the syncer's BlockDownloadVerifyError to concrete types and add them to the match statement in should_restart_sync():

https://github.com/ZcashFoundation/zebra/blob/0152c8641bc41767a5ef7a67a8e01e84a26eb0a7/zebrad/src/components/sync.rs#L1088

https://github.com/ZcashFoundation/zebra/blob/0152c8641bc41767a5ef7a67a8e01e84a26eb0a7/zebrad/src/components/sync.rs#L1155-L1163

Specifications

Designs

  • See #5682
  • See #5757
    • we don't need to reorder fields manually to save space, because Rust 1.67 will do this automatically: https://github.com/rust-lang/rust/pull/102750
    • we can still turn unnamed tuple fields into named fields, because that makes debugging easier

Related Work

  • #1186

Follow up to https://github.com/ZcashFoundation/zebra/pull/2890/

conradoplg avatar Oct 19 '21 18:10 conradoplg

Hey team! Please add your planning poker estimate with ZenHub @conradoplg @dconnolly @jvff @oxarbitrage @teor2345 @upbqdn

mpguerra avatar Nov 23 '21 10:11 mpguerra

We're de-prioritizing network changes, and network-related changes.

teor2345 avatar Aug 25 '22 23:08 teor2345

This is needed to do #5487

teor2345 avatar Nov 02 '22 23:11 teor2345

I added this as a TODO in the source code, so we can do it if we are rewriting that code for other reasons.

teor2345 avatar Dec 01 '22 01:12 teor2345

@arya2 just checking that this isn't an audit issue?

@mpguerra we've tried to do this a few times and it's been tricky due to Rust language limitations. Some of those might have been fixed recently, but I still think we should time-box this task to 1 day in total. (And maybe one or two reviews.)

This code is also hard to test, it's only really tested by our integration tests. (And not all our errors are covered by those tests.) So it will need careful review or extra unit tests to avoid introducing new panics.

teor2345 avatar Mar 26 '23 22:03 teor2345

We haven't scheduled this work yet and I think it can probably wait for discussion until our next gardening sync. I'll add it to the agenda.

mpguerra avatar Mar 27 '23 09:03 mpguerra

@arya2 just checking that this isn't an audit issue?

Nothing in this issue has been brought up in the audit so far.

arya2 avatar Mar 28 '23 00:03 arya2