m3 icon indicating copy to clipboard operation
m3 copied to clipboard

[dbnode] fsync before closing checksummed files

Open linasm opened this issue 5 years ago • 3 comments

What this PR does / why we need it: Adds fsync (File.Sync() call) before closing a checksummed file. This is needed to maintain transactional semantics by making fileset files durable before the checkpoint file gets written.

Special notes for your reviewer: My rationale of adding this code to fdWithDigestWriter.Close() is that that normally fdWithDigestWriters are participating in a fileset transaction and should be made durable at the moment of closing them. An alternative approach could be to make those Sync() calls from writer code separately (before invoking fdWithDigestWriter.Close()), but then it would be easy to forget a Sync() on some of the files.

Does this PR introduce a user-facing and/or backwards incompatible change?: NONE

Does this PR require updating code package or user-facing documentation?: NONE

linasm avatar Oct 22 '20 07:10 linasm

Codecov Report

Merging #2788 (4259c0d) into master (82b3429) will decrease coverage by 0.2%. Report is 411 commits behind head on master. The diff coverage is 40.8%.

Additional details and impacted files

Impacted file tree graph

@@           Coverage Diff            @@
##           master   #2788     +/-   ##
========================================
- Coverage    56.3%   56.2%   -0.2%     
========================================
  Files         551     551             
  Lines       61872   61912     +40     
========================================
- Hits        34889   34849     -40     
- Misses      23864   23930     +66     
- Partials     3119    3133     +14     
Flag Coverage Δ
aggregator 57.2% <ø> (ø)
cluster ∅ <ø> (∅)
collector 58.4% <ø> (ø)
dbnode 60.8% <40.8%> (-0.2%) :arrow_down:
m3em 46.4% <ø> (ø)
metrics 19.8% <ø> (ø)
msg 74.1% <ø> (-0.3%) :arrow_down:

Flags with carried forward coverage won't be shown. Click here to find out more.


Continue to review full report in Codecov by Sentry.

Legend - Click here to learn more Δ = absolute <relative> (impact), ø = not affected, ? = missing data Powered by Codecov. Last update 82b3429...4259c0d. Read the comment docs.

codecov[bot] avatar Oct 22 '20 07:10 codecov[bot]

After our discussion:

  1. Made Sync() calls run in parallel to offset some of the IO latency (except for the checkpoint file).
  2. Made Sync() calls by the fs.writer opt-in (controlled by the new DataWriterOpenOptions.SyncBeforeClose option).
  3. Enabled SyncBeforeClose for the fs.streamingWriter.

@robskillington which other (if any) writer use cases should have the Sync() enabled starting from this PR?

linasm avatar Oct 26 '20 08:10 linasm

@arnikola I've addressed all of your suggestions, would appreciate if you could make one more pass on it, even if it's just from the code perspective.

linasm avatar Nov 09 '20 08:11 linasm