confy icon indicating copy to clipboard operation
confy copied to clipboard

Doesn't atomically save to file, resulting in possible file truncation or corruption

Open nyanpasu64 opened this issue 3 years ago • 6 comments

I've noticed that when confy tries to write to a file path, it overwrites the file in-place:

https://github.com/rust-cli/confy/blob/cd6932214ec16d50aa0f0defeb847a2e12618e61/src/lib.rs#L281-L287

In my experience (though not with apps using confy), overwriting files in-place causes problems. One possible issue (for files shared between many apps) is that either other applications will be able to read half-written contents, or the file will be locked and unreadable for the duration of the write. A more serious integrity issue is that if the application terminates during the file write (either from panicking, or from being terminated in taskmgr or Ctrl-C'd), it will result in a corrupted file being written to disk, with both the old and new file data being lost.

As an example of what can go wrong, a Python app I wrote (corrscope) saves config files directly to disk (since I didn't know better at the time), overwriting the previous file. Every few months I get a report from a user who can't open the app because the config file is corrupt and fails to load.

What causes saving to fail?

Confy's store_path() first opens the file, then calls either toml::to_string_pretty or serde_yaml::to_string (both of which are fallible and return Result<...>), and use the ? operator on the Result. If the function called fails, then store_path() exits, but the file on the disk has already been truncated and the damage is done. (I'm not sure if TOML or YAML serialization is likely to fail, either reliably or on edge cases, in unfinished programs on developer machines or in released applications on user machines.)

In corrscope, I suspect a similar event can occur if the serializer is left in a corrupted state. (I use a persistent global YAML serializer object for both documents and settings, since I hadn't learned Rust at the time, which would've taught me to avoid shared mutability.)

On top of this, there's the risk of getting terminated or Ctrl-C'd during the serialization operation (or if the serialization hangs and the app needs to be killed). I'm not sure if a system crash after the app finishes writing can lead to corruption, in various filesystems and journaling modes.

Solutions

A common strategy is to write to a temporary file and rename it on top of the original. While imperfect (may not always preserve permissions or owners), it's the least bad strategy I'm aware of for saving generic textual config files. (An alternative, advocated by danluu, is to not rename on top of the original file, but use an external log file for crash recovery. However, this introduces complexity in the app, and is unworkable if other apps expect plaintext config files and don't understand how to recover from log files.)

Additionally if you want to recover from system crashes as well, you need to fsync the temporary file before renaming it to the original filename. If you don't do so, it's possible that the temporary file gets renamed and overwrites the original file (and the rename gets flushed to disk), but the system crashes before the temporary file's contents are written to disk.

Yet another level of protection is fsyncing the directory to ensure that if the system crashes after renaming the temporary file over the old file, the rename can't be reverted. However this is slower than not fsyncing the directory, is not necessary to prevent corruption on Linux, and is not possible on Windows.

https://github.com/untitaker/rust-atomicwrites is a crate that performs atomic file writing. However it performs a directory fsync on Linux, which I think is unnecessary, slow, and inconsistent between platforms. I would rewrite or vendor the crate without performing a directory fsync.

nyanpasu64 avatar Jun 13 '21 22:06 nyanpasu64

Thank you for your report.

You're probably right, and you're definitively right about the open-file-then-serialize problem. I'd welcome a patch for this which reorders the steps to first-serialize-then-open-file.

About the other things you mentioned: I've never experienced such an issue, but I see where this is coming from. I'd love to see patches/improvement over the current state, although I'm not 100% sure whether the rename approach solves the issue (I have to read up on the fsync stuff you mentioned).

matthiasbeyer avatar Jun 14 '21 12:06 matthiasbeyer

Thanks for the detailed report.

I personally don’t like atomics writes (temp file + rename) because of the permissions issue you talked about. I even had an issue with atomic writes over ssh that created the temp file but where not able to rename it, so the old one stayed.

I do not think reading from multiples programs is a problem since confy is intended to store and load configuration files. To me config files are for one program only and not written much often. Sometimes they need to have specific permissions to protect secrets stored in them, and I think confy should be an easy to use crate for most peoples. So forcing them to wrap their call to store_path in a function that will redo the permissions is not nice. Adding parameter to set permissions ourselves is not good either.

Maybe we can have an atomic-write feature ? So it will be opt-in at compile time for each program.

Zykino avatar Jun 19 '21 11:06 Zykino

I feel that adding an atomic-write feature is flawed because if dependency enables it, it also gets turned on for other dependencies or the binary using the same version of confy.

Perhaps add a separate function called store_path_atomic, and document how non-atomic writes may leave a corrupted file, and atomic writes have permission issues? Though I'm not sure how to implement both without copy-pasting the let s; s = ... logic. Maybe extract it to a helper function?

I wish the choice was simpler for app developers. I wish there was a solution without either disk corruption or renaming issues. But there isn't, and I don't how we could get one. (Maybe transactional NTFS could help, but Microsoft is killing it off.)

Maybe if these issues are a concern, app settings should instead be stored in a concurrent-safe binary database, like the Windows registry or SQLite?

nyanpasu64 avatar Jun 19 '21 20:06 nyanpasu64

Partly resolves #47.

Apparently GitHub treated that PR as automatically closing this issue. I think this issue should be reopened, because supplying an API to atomically save a config file is still an unresolved bug/feature request.

nyanpasu64 avatar Jun 25 '21 15:06 nyanpasu64

I'm hit by this issue as well. Would you guys welcome a PR that adds an option to confy::store in order to make it atomic (write-rename)?

joelthelion avatar Feb 15 '22 11:02 joelthelion

yes you can submit a PR that adds it

On Tue, Feb 15, 2022 at 12:51 PM Joel Schaerer @.***> wrote:

I'm hit by this issue as well. Would you guys welcome a PR that adds an option to confy::store in order to make it atomic (write-rename)?

— Reply to this email directly, view it on GitHub https://github.com/rust-cli/confy/issues/47#issuecomment-1040182371, or unsubscribe https://github.com/notifications/unsubscribe-auth/AC6MJNJO4EXGC3YNUOUCRQLU3I44PANCNFSM46UEUA5Q . You are receiving this because you are subscribed to this thread.Message ID: @.***>

Dylan-DPC-zz avatar Feb 16 '22 14:02 Dylan-DPC-zz