nearcore
nearcore copied to clipboard
fix: Handling of persisting errors in on_chunk_completed.
Currently on_chunk_completed
is called when a chunk is to completed. It persists the chunk and this may fail. This change is addressing a todo for more graceful handling of the error (#10569).
ATTENTION TO REVIEWERS: Sending this PR to start a discussion around the right handling of the error. Is this change that simple or am I missing something? for example, if the persisting the chunk fails, what else we need to do (or revert)?
Codecov Report
Attention: Patch coverage is 46.15385%
with 7 lines
in your changes are missing coverage. Please review.
Project coverage is 71.64%. Comparing base (
92e5938
) to head (d30d2c7
). Report is 4 commits behind head on master.
Files | Patch % | Lines |
---|---|---|
chain/client/src/client.rs | 46.15% | 6 Missing and 1 partial :warning: |
Additional details and impacted files
@@ Coverage Diff @@
## master #10854 +/- ##
==========================================
+ Coverage 71.61% 71.64% +0.02%
==========================================
Files 758 758
Lines 151765 151924 +159
Branches 151765 151924 +159
==========================================
+ Hits 108692 108840 +148
- Misses 38537 38544 +7
- Partials 4536 4540 +4
Flag | Coverage Δ | |
---|---|---|
backward-compatibility | 0.24% <0.00%> (-0.01%) |
:arrow_down: |
db-migration | 0.24% <0.00%> (-0.01%) |
:arrow_down: |
genesis-check | 1.42% <0.00%> (-0.01%) |
:arrow_down: |
integration-tests | 37.37% <46.15%> (-0.02%) |
:arrow_down: |
linux | 70.26% <46.15%> (+0.02%) |
:arrow_up: |
linux-nightly | 71.15% <46.15%> (+0.06%) |
:arrow_up: |
macos | 54.74% <46.15%> (+0.02%) |
:arrow_up: |
pytests | 1.65% <0.00%> (-0.01%) |
:arrow_down: |
sanity-checks | 1.43% <0.00%> (-0.01%) |
:arrow_down: |
unittests | 67.30% <46.15%> (+0.03%) |
:arrow_up: |
upgradability | 0.29% <0.00%> (-0.01%) |
:arrow_down: |
Flags with carried forward coverage won't be shown. Click here to find out more.
:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.
I no longer remember the context of what we should do if this case errors. But thinking about it now, (1) I don't think the chain can continue if this persist fails, because we rely on the existence of the partial chunks in order to execute any chunk; (2) there's no reason this would fail because it is simply a store write, and the only way a store write would fail is if there's something just wrong with the db (IO error, disk full, etc.) in which case there is likely a catastrophic failure.
So, I think
.expect
is actually OK. @staffik do you have additional context here since you filed the issue?
I do not have additional context, just filled this issue because of this suggestion: https://github.com/near/nearcore/pull/10568#discussion_r1478384630
If the only way a store write would fail is if there's something wrong with the db then I would leave .expect
here too.
Based on Robin's comment, I am now also more comfortable with leaving the .expect
call as is since handling of IO failure correctly may be more challenging or impossible. Adding @wacban to see if we miss anything.