Current handling of Unix close() can lead to silent data loss
While working on #98209, I went searching for where close() is called. I found it in std/src/os/fd/owned.rs:
impl Drop for OwnedFd {
    #[inline]
    fn drop(&mut self) {
        unsafe {
            // Note that errors are ignored when closing a file descriptor. The
            // reason for this is that if an error occurs we don't actually know if
            // the file descriptor was closed or not, and if we retried (for
            // something like EINTR), we might close another valid file descriptor
            // opened after we closed ours.
            let _ = libc::close(self.fd);
        }
    }
}
The Linux close(2) manpage states:
A careful programmer will check the return value of close(), since it is quite possible that errors on a previous write(2) operation are reported only on the final close() that releases the open file description. Failing to check the return value when closing a file may lead to silent loss of data. This can especially be observed with NFS and with disk quota.
Note, however, that a failure return should be used only for diagnostic purposes (i.e., a warning to the application that there may still be I/O pending or there may have been failed I/O) or remedial purposes (e.g., writing the file once more or creating a backup).
Retrying the close() after a failure return is the wrong thing to do, since this may cause a reused file descriptor from another thread to be closed.
Anyhow, checking the return value of close(2) is necessary in a number of cases. But here we have a conundrum, because what can we possibly do with a failure of close(2) inside a Drop impl?
- Ignore it as now
- Panic
- More complex options (eg, allow the caller to register a close-failure callback with the underlying File/whatever object)
These all have their pros and cons. But aren't we looking for something more like this?
fn close(self) -> io::Result<()>
In fact, a Close trait could define this function and it could be implemented on files, pipes, sockets, etc.
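For concreteness, here is a rough sketch of what such a trait could look like today, implemented for File by dropping down to the libc crate (the trait name and impl are hypothetical, not existing std API):
use std::fs::File;
use std::io;
use std::os::unix::io::IntoRawFd;

// Hypothetical trait, not part of std: close() consumes the value,
// so the ordinary Drop-based close can no longer run afterwards.
trait Close {
    fn close(self) -> io::Result<()>;
}

impl Close for File {
    fn close(self) -> io::Result<()> {
        // Take ownership of the raw fd; File's Drop no longer applies to it.
        let fd = self.into_raw_fd();
        // SAFETY: we own `fd` and close it exactly once.
        if unsafe { libc::close(fd) } == 0 {
            Ok(())
        } else {
            Err(io::Error::last_os_error())
        }
    }
}
Because close takes self by value, the file's normal Drop never runs on that descriptor, which sidesteps the double-close concern discussed further down.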
Meta
rustc --version --verbose:
rustc 1.56.1 (59eed8a2a 2021-11-01)
binary: rustc
commit-hash: 59eed8a2aac0230a8b53e89d4e99d55912ba6b35
commit-date: 2021-11-01
host: x86_64-unknown-linux-gnu
release: 1.56.1
LLVM version: 13.0.0
If you need durability - assuming we're talking about regular files here - then relying on close is the wrong thing, because data may only make it to the page cache and later encounter an error during writeback. You should rely on fsync instead.
let file = File::from(owned_fd);
file.sync_all()?;
Of course that's expensive since it forces the OS to do immediate writeback, so it should only be done when durability is critical.
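To make that concrete, here is a minimal sketch of a durable write of a regular file using std's sync_all (the path and data are illustrative; any error from close(2) on drop is still ignored by std today):
use std::fs::File;
use std::io::Write;

fn write_durably(path: &str, data: &[u8]) -> std::io::Result<()> {
    let mut file = File::create(path)?;
    file.write_all(data)?;
    // Forces writeback and reports the I/O errors that close might otherwise
    // be the first to surface; expensive, so only do it when durability matters.
    file.sync_all()?;
    Ok(())
    // The descriptor is closed here when `file` is dropped; that close's
    // return value is not observable through std.
}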
I think it makes sense to have a close method for OwnedFd, OwnedHandle, etc if only for diagnostic purposes. There's not a whole bunch you can do with the error if you get in that situation, other than logging or maybe restarting the whole operation over again. But maybe it can help diagnose a problem.
The API proposal is just drop but doesn't swallow the error, right?
impl OwnedFd {
    fn close(self) -> io::Result<()>;
}
@the8472 close(2) on Linux even states:
A successful close does not guarantee that the data has been successfully saved to disk, as the kernel uses the buffer cache to defer writes. Typically, filesystems do not flush buffers when a file is closed. If you need to be sure that the data is physically stored on the underlying disk, use fsync(2). (It will depend on the disk hardware at this point.)
But it also includes the comment I quoted above about the importance of still checking the return value of close().
@ChrisDenton A scenario in which an error here may be relevant could be this:
Let's say you are processing items from a queue of some sort. If processing is successful, you remove the item from the queue. If not, it remains on the queue. An error from close would be a reason to leave the item on the queue, or to signal failure to the network client, etc.
I don't know the particulars of the NFS issue, but I can imagine them. I also imagine various FUSE filesystems (eg, s3fs) do the bulk of their work only when close is called.
Another scenario I could imagine would involve pipes. If a prior write() put data into the pipe buffer, but the process on the other end of the pipe crashed before reading it, the only way to detect this would be via SIGPIPE (which is blocked by default in Rust; see #97889) or, I would imagine, by the return value of close(). (I have not personally verified this)
@ChrisDenton Yes, you are correct about my proposal. It should consume and cause the dropping of the OwnedFd because, as the close(2) manpage notes, you really shouldn't attempt to close it again, even on error (with the side note that on certain Unices, such as HP-UX, you SHOULD retry the close after EINTR; but this is ambiguous in POSIX and is not the case on Linux at least, and I'd argue that if somebody ports Rust to one of those OSs, EINTR handling should be internal to fn close there).
I would also like to see it bubbled up to the higher-level interfaces (File, sockets, etc). This shouldn't be difficult to use in cross-platform code, I'd hope. I don't know if Windows has the same close() semantics, but even if on Windows it is identical to drop, having an interface available would be helpful.
The one trick I'd see would be preventing a double-close when there is a call to fn close (by drop).
But it also includes the comment I quoted above about the importance of still checking the return value of close().
My understanding is that fsync subsumes all that as long as the system defines _POSIX_SYNCHRONIZED_IO, which is the case on all major unix systems. E.g. postgres heavily relies on fsync for ACID guarantees, not close.
Looking at the return value can only ever be a diagnostic thing. Unless you're opening a file with O_SYNC or O_DIRECT close doesn't guarantee anything due to writeback.
I don't know the particulars of the NFS issue, but I can imagine them. I also imagine various FUSE filesystems (eg, s3fs) do the bulk of their work only when close is called.
NFS supports fsync. Anything that ignores fsync is not a durable filesystem and you shouldn't entrust critical data to it.
Another scenario I could imagine would involve pipes. If a prior write() put data into the pipe buffer, but the process on the other end of the pipe crashed before reading it, the only way to detect this would be via SIGPIPE (which is blocked by default in Rust; see https://github.com/rust-lang/rust/issues/97889) or, I would imagine, by the return value of close(). (I have not personally verified this)
Close has no special treatment for pipes. Any data still stuck in a pipe is lost.
I also imagine various FUSE filesystems (eg, s3fs) do the bulk of their work only when close is called.
FUSE for its part documents that filesystems should avoid doing this.
@the8472 and @sunfishcode I would like to understand:
1. Are you folks opposed to having an explicit close() that returns an error?
2. Or are you fine with that, but dislike my justification?
If 1, how do you reconcile that with the strong admonishment in close(2) that "failing to check the return value when closing a file may lead to silent loss of data"?
If 2, I'd like to hear more.
Over at https://stackoverflow.com/questions/24477740/what-are-the-reasons-to-check-for-error-on-close there is a conversation about this which I found interesting. Among the points are:
"Consider the inverse of your question: "Under what situations can we guarantee that close will succeed?" The answer is: when you call it correctly, and when you know that the file system the file is on does not return errors from close in this OS and Kernel version"
It is this second that has me particularly troubled. On my Debian Linux box, there are 58 in-kernel filesystem modules and dozens more implemented via FUSE. Some of them (eg, ext4) are specifically designed for POSIX semantics. Some of them (eg, cifs, exfat) were not. I use sshfs and NTFS via FUSE on a regular basis and who knows about them; I am certain they are not fully POSIX compliant. Even NFS has never been fully compliant with POSIX semantics, only getting somewhat better in more recent years.
I'm not arguing for everything to be perfect here; in a strict sense, if one wants perfect ACID, one would not just fsync, but also check the return value of fsync and close, and fsync the parent directory after a rename also. Except on MacOS, where fsync doesn't actually force data to durable storage, unlike Linux, where fsync explicitly does.
I just want the tools to be a careful programmer wherever possible, that's all.
This is probably a separate issue, but an incidental remark: I don't see any way to fsync a directory in std, either. File::open() won't open a directory (perhaps via O_DIRECTORY in custom_flags, but there is nothing in std that exposes O_DIRECTORY), and read_dir doesn't expose the fd for syncing either.
@the8472 and @sunfishcode I would like to understand -
1. Are you folks opposed to having an explicit close() that returns an error?
I myself am.
If 1, how do you reconcile that with the strong admonishment in close(2) that "failing to check the return value when closing a file may lead to silent loss of data"?
Ignoring errors from close is so pervasive that filesystem developers that care about robustness are practically expected to make sure close never fails.
Rust has ignored errors from close since 1.0, which is a precedent across the ecosystem. If we change that, it will create a new expectation for almost all Rust libraries and utilities that write files. It's tempting to say "you don't need to check if your use case doesn't need it", but libraries and utilities usually don't pick the filesystems they run on, or how data is used outside their scope.
If we approach this issue as "Do we care about this kind of data loss?", it's difficult to explain saying no. But we can instead ask "Who is responsible for this kind of data loss?", and there are multiple options:
- Declare that Rust libraries and utilities are now expected to check errors from close.
- Declare that end users using NFS are responsible for actively monitoring how much free space they have, and that FUSE filesystem authors are responsible for following the FUSE documentation.
Sometimes fixing existing code to correctly handle errors from close is easy, but sometimes it's costly. Either way, the new code paths will be rarely exercised and difficult to test. There is an ecosystem-wide cost for adding this expectation.
To my knowledge, the main filesystems where close does fail in practice are networked filesystems which postpone reporting errors in order to go faster (NFS calls this "async"), and FUSE filesystems which ignore the documentation. End users using NFS and caring about the data being written to it should probably already be actively monitoring for free space and other conditions, because of the amount of existing code out there that doesn't check for these errors. FUSE authors that care about protecting their users' data should already be following the documentation for flush.
If NFS were as common as it once was, things might look different, but in practice, POSIX filesystem APIs, with all the guarantees and common assumptions, aren't a great fit for networks. Instead of exposing the async nature of networks to applications via mechanisms that could map to Rust's async, NFS' "async" mode tries to hide everything behind the illusion of a synchronous API. Filesystem APIs can be convenient, but they have significant costs.
Neither option is without downsides. I don't have data, but I assume the costs of passing on the responsibility for errors from close to individual developers throughout the Rust ecosystem would be greater than the cost of stating that this is not the ecosystem's responsibility, and passing the responsibility onto end users that choose to use NFS, and FUSE filesystem authors, who arguably already have this responsibility.
As a separate consideration, stdout can be redirected to a file, and all the same data-loss scenarios are possible when writing to a file via stdout. If we add an expectation to Rust that code check for errors from close after writing to files, we should add a way to close stdout and check for errors from that too. But that's tricky, because Stdout::lock returns a &'static, which effectively guarantees that stdout is always open.
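For reference, the closest thing std exposes today is flushing stdout and checking that result; this is only a partial stand-in for the behavior described above, since it reports buffered-write errors but cannot report errors from the eventual close of the underlying descriptor:
use std::io::{self, Write};

// Flush std's userspace buffer for stdout and surface any write error.
// Errors from close() of fd 1 itself remain unobservable through std.
fn finish_stdout() -> io::Result<()> {
    io::stdout().flush()
}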
I just want the tools to be a careful programmer wherever possible, that's all.
This is why it's useful to make these decisions at a broad level, like Rust (or ideally higher, but one step at a time), so that we can collectively agree on what should be expected of individual careful programmers.
I also aspire to be a careful programmer, and I once added logic to LLVM to close stdout and check for errors, because LLVM is often used to write to stdout with redirection to files. I too read the close man page and believed it was my responsibility. That code lived in the LLVM codebase for years, but only with me defending it against a long series of people hitting problems with it and wanting to remove it. I eventually stopped defending it, and it has since been removed. It was just too much trouble to be worth the cost.
Are you folks opposed to having an explicit close() that returns an error? Or are you fine with that, but dislike my justification?
I'm ambivalent about adding the API. Generally it does not make sense to check for errors on close. But I can see some niche uses: when hunting for errors, or when you have a policy of failing as loudly and as explosively as possible, it might make sense to print an error and then abort when a close fails (what else are you going to do? there likely is no sane path to recovering from an error).
But I do dislike the justification. Checking the return value of close is not what you do when you care about your data making it to storage intact. As I said earlier, fsync is the go-to mechanism here. To me it would be purely a diagnostic thing that something has gone horribly wrong.
File::open() won't open a directory
But it does? (playground)
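A minimal sketch of what the playground link demonstrates, assuming Linux: File::open accepts a directory path, and sync_all then issues fsync on that directory fd (the path is illustrative), e.g. to persist a rename performed inside it.
use std::fs::File;

// Open a directory read-only and fsync it. Works on Linux; behavior
// on other platforms is not guaranteed.
fn sync_dir(path: &str) -> std::io::Result<()> {
    let dir = File::open(path)?;
    dir.sync_all()
}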
Some of them (eg, ext4) are specifically designed for POSIX semantics. Some of them (eg, cifs, exfat) were not. I use sshfs and NTFS via FUSE on a regular basis and who knows about them; I am certain they are not fully POSIX compliant. Even NFS has never been fully compliant with POSIX semantics, only getting somewhat better in more recent years.
Sure but would you run a production database on a dodgy USB stick formatted with exfat?
Thank you for these thoughtful comments.
I definitely do not want to create a precedent to require libraries to check close(). You are right that this would be unworkable. Already I wouldn't necessarily trust libraries with writing (are they doing the POSIX-safe way to do atomic writes, with fsyncs and renames? Almost uniformly not, but then you may not always want that).
I want the option to be able to check it myself in a safe way, that's all.
There are a lot of machines out there running on filesystems that I wouldn't count on to apply these guarantees, or that have other errors that may be signaled at close. FreeBSD can return ECONNRESET, for instance. An entire class of devices (Raspberry Pis) typically runs on dodgy micro SDs. A particular bane of my existence for years has been VMs (and even hardware) that fail to propagate fsync. I /frequently/ work with software that has to deal with USB sticks formatted with exfat or NTFS (or, heck, vfat). While NFS no doubt isn't as prolific as it once was, it is still with us, and now we have a bunch of others in that range: afs, coda, glusterfs, ocfs2, ceph, cifs, davfs, sshfs, s3fs, rclone -- to name a few. We have a /proliferation/ of filesystems or filesystem-like things, actually. It is an explicit design goal of the program I was working on when I created this issue to be maximally compatible with filesystems that differ from POSIX semantics in significant ways; for instance, s3fs, in which a newly created file may not be openable for reading right away due to eventual consistency.
I care about data integrity. I am realistic that checking the return value from close is not itself sufficient to guarantee it. I don't believe there exists a perfect guarantee, in fact, because hardware isn't perfect. But I want to take every step I reasonably can, and which documentation has shown I ought to, along that path -- in part because I realize I am dealing with filesystems with a lot more failure modes than ext4 on a local SSD.
I wouldn't try to run PostgreSQL atop S3. But for Filespooler, written in Rust, this is an explicit design goal and documented practice. Although written in Go, a similar tool, NNCP, is something I use for getting data to airgapped backup & archival systems.
There seem to be some separate scenarios here:
- posix-compliant FS -> use fsync, close doesn't matter
- FS that always lies about committing data -> neither fsync nor close will save you here; the data will remain in some volatile buffers for unknown amounts of time. I don't think much can be done about these other than not entrusting any valuable data to them or replacing them with better solutions.
- things that ignore fsync but commit on close -> these obviously DO have a mechanism to ensure durability. Instead of burdening every single downstream consumer with using close-as-fsync they should instead be fixed by doing the same on fsync as they do on close
- file descriptors that aren't regular files, where fsync doesn't apply (such as the freebsd socket case you mention) -> that's going to depend on the type of FD and the platform you're dealing with. E.g. on sockets you could do a dance involving shutdown on your end and then waiting for the other side to terminate the connection before you call close; then there shouldn't be anything left on the wire that could possibly cause close to fail (see the sketch below).
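For the TCP case, here is a minimal sketch of that dance using std's existing API; the assumption that the peer closes its end once it has consumed our data is application-specific.
use std::io::Read;
use std::net::{Shutdown, TcpStream};

fn close_gracefully(mut stream: TcpStream) -> std::io::Result<()> {
    // Stop sending; this pushes out a FIN so the peer sees end-of-stream.
    stream.shutdown(Shutdown::Write)?;
    // Wait for the peer to finish and close its side: read until EOF.
    let mut sink = [0u8; 4096];
    while stream.read(&mut sink)? != 0 {}
    // Nothing should be left in flight now, so the implicit close on drop
    // has nothing meaningful left to report.
    Ok(())
}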
I definitely do not want to create a precedent to require libraries to check close(). You are right that this would be unworkable.
Ok. Then imo it should be documented in a way that redirects users to sync_all if they want durability and want to detect IO errors, and that basically discourages relying on close's behavior, since it is non-portable and FS-dependent.
Prior discussions, sorted from oldest to newest:
- https://internals.rust-lang.org/t/fs-file-should-panic-if-implicit-close-fails-and-no-panic-is-active/1349
- https://github.com/rust-lang/rfcs/pull/770
- https://github.com/rust-lang/rust/issues/24413
- https://github.com/rust-lang/rust/pull/22849
- https://github.com/rust-lang/rust/issues/32255
- https://old.reddit.com/r/rust/comments/5o8zk7/using_stdfsfile_as_shown_in_the_docs_can_lead_to/
- https://github.com/rust-cli/team/issues/28
- https://github.com/rust-lang/rust/issues/59567
Rust has ignored errors from close since 1.0, which is a precedent across the ecosystem. If we change that, it will create a new expectation for almost all Rust libraries and utilities that write files. It's tempting to say "you don't need to check if your use case doesn't need it", but libraries and utilities usually don't pick the filesystems they run on, or how data is used outside their scope.
Apparently we actually were debug-asserting on this since 1.0 and the only reason it's not come up is due to the precompiled stdlib thing.
The new issue is about closing ReadDir, not File or OwnedFd; they're different impls.
But as noted there we probably should assert on EBADF instead of ignoring errors since it indicates a likely io-safety violation.
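For illustration, here is a sketch of that idea (not the actual std implementation, and it assumes the libc crate): keep ignoring close errors in Drop, but debug-assert that the error is not EBADF, since EBADF indicates a likely io-safety violation such as a double close elsewhere.
use std::os::unix::io::RawFd;

struct Fd(RawFd);

impl Drop for Fd {
    fn drop(&mut self) {
        // Errors are still ignored, but EBADF trips a debug assertion.
        if unsafe { libc::close(self.0) } == -1 {
            let errno = std::io::Error::last_os_error().raw_os_error();
            debug_assert_ne!(errno, Some(libc::EBADF), "close() reported EBADF: likely double close");
        }
    }
}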
Have there been any developments on this issue?
The current state of things seems a bit like a ticking time bomb. Idiomatic usage of the standard library file APIs can lead to silent data loss due to ignored close errors. While fsync is available in the standard library, I suspect its use will never be widespread, because the performance hit is simply too big. In practice, I think having an idiomatic close method with error reporting would take care of a lot of low-hanging fruit when it comes to preventing silent data loss, without any observable performance impact.
As discussed by previous comments, error handling on close generally does not prevent data loss due to writeback caching and delayed errors. If you want to ensure durability you must use fsync (and possibly more complicated syscall dances) with the performance hit it entails, or use platform-specific APIs like sync_file_range and io_uring to initiate and await writeback asynchronously.
My general goal is to write code that ensures eventual durability. As far as I know, if close returns no errors, then unless the system suffers power loss or catastrophic hardware failure, eventual durability is guaranteed for the vast majority of filesystems, including local and network file systems. Under these parameters checking close for errors is both necessary and sufficient to ensure eventual durability, and having a way to do it in the standard library would be valuable.
That's not the case. btrfs can run out of disk space and not know that it will do so until it tries to compress and writeback the data, which may happen after the close. xfs and ext4 have delayed allocation too. Network filesystems may fail writeback if they lose the connection after you close the file, and the underlying server-side filesystem may encounter the same problems mentioned before, ... I'd only consider a disk dying or power loss something that would qualify as catastrophic, the other things are more like sysadmin annoyances.
There's really nothing special about close in general. The errors that the OS can report without doing writeback will already have been reported on write itself. close might report some additional errors if some writeback happened between the last write and close. But otherwise it's pretty close to a noop.
That's not the case. btrfs can run out of disk space and not know that it will do so until it tries to compress and writeback the data, which may happen after the close. xfs and ext4 have delayed allocation too.
I will concede that the number of issues that can cause data loss after successful close is larger than I initially thought.
Network filesystems may fail writeback if they lose the connection after you close the file.
Really? This one I have never heard of. Any specific examples of filesystems that do this?
There's really nothing special about close in general. The errors that the OS can report without doing writeback will already have been reported on write itself. close might report some additional errors if some writeback happened between the last write and close. But otherwise it's pretty close to a noop.
But we check for write errors, so by that logic it seems reasonable to me that we should also check for close errors and treat them the same as write errors.
I think this part of your comment gets at the crux of the issue for me: in general Rust is very diligent about checking and handling every error that the OS reports to it, but closing files seems to be a special exception to this for no clear reason. As far as I can tell, errors during close are entirely possible and do happen in practice, so why was it chosen that they should be ignored whereas, e.g., errors during writes are not? Checking close for errors may not provide a strong, well-defined guarantee, but it definitely provides more of a guarantee than not checking close for errors, and unlike fsync, it does so with negligible performance impact.
Also, as a point of reference, I looked into what some other mainstream languages do:
- Java, Python, and presumably most other high-level languages turn close errors into exceptions, forcing the user to handle them.
- C++'s fstream::close reports errors by setting an internal file operation error bit. This may throw an exception if the file stream is configured to throw on internal file operation errors.
- C and Go provide a close method that returns the error to the user, but it is up to the user to do something with it.
Overall, as far as I can tell, Rust is alone in choosing to automatically discard any error returned by close.
C++'s fstream::close reports errors by setting an internal file operation error bit. This may throw an exception if the file stream is configured to throw on internal file operation errors.
My impression based on this StackOverflow answer is that C++ actually acts similarly to Rust here, even if you try to configure it otherwise:
(In an ideal world, one would simply call stream.exceptions(ios::failbit) beforehand and handle the exception that is thrown in an fstream's destructor. But unfortunately exceptions in destructors are a broken concept in C++, and the fstream destructor will therefore never throw, so this way doesn't work here.)
which also cites:
Destructor operations defined in the C++ standard library shall not throw exceptions. Every destructor in the C++ standard library shall behave as if it had a non-throwing exception specification.
(https://timsong-cpp.github.io/cppwp/n4950/res.on.exception.handling#3.sentence-1)
I think C++ is the closest to Rust here, in that Go/Java/Python don't have destructors -- files are leaked if not explicitly close()'d. And it sounds like it matches behavior with Rust as well, if I'm reading this language right.
Dropping an OwnedFd should mean that you are no longer interested in its state and should release state from the OS. If you want to assert something about that state you must do so before you drop it. Ignoring any diagnostic "errors" (as the manpage calls them) is the only sane option in Drop, since you can't reason about the File's state nor are you prepared to handle a panic.
There could be an OwnedFd::informative_drop(self) -> Result<()> though. But to recap, this would not assert anything specific about the data on the disk.
Dropping an OwnedFd should mean that you are no longer interested in its state and should release state from the OS. If you want to assert something about that state you must do so before you drop it. Ignoring any diagnostic "errors" (as the manpage calls them) is the only sane option in Drop, since you can't reason about the File's state nor are you prepared to handle a panic.
To be clear, I am not advocating that dropping the file should attempt to somehow report or handle close errors. I think there should be an explicit File::close(self) -> Result<()> method that you can (and are encouraged to) call to close the file and retrieve any returned error.
My impression based on this StackOverflow answer is that C++ actually acts similarly to Rust here, even if you try to configure it otherwise:
If you just let the file stream go out of scope then yes, C++ discards errors, the same as Rust. However, the crucial difference is that C++ has a filestream::close method that you can call to retrieve any close errors, whereas the Rust standard library doesn't.
I think there should be an explicit File::close(self) -> Result<()> method that you can (and are encouraged to) call to close the file and retrieve any returned error.
The encouragement is the issue here. It would give the false impression that a successful close somehow means you have handled all possible errors for the lifecycle of a file.
To quote
I will concede that the number of issues that can cause data loss after successful close is larger than I initially thought.
This kind of misconception (and worse) is common, especially among beginners.
The portable options are either making sure the data is persisted (which can be complicated, look at the dances databases have to do) and handling errors, or accepting that writeback will happen after the file has been closed in which case error handling is least-effort.
There is middle-ground between those poles, but that's quite nuanced and very environment-/platform-specific. Trying to write software that is meaningful in the presence of errors from closed files would fall into that nuanced middle ground and is not something that would be recommended.
At this point I think it just comes down to a value judgement where we disagree. I think that even in a least-effort situation we should attempt to handle all the errors that the OS returns to us (possibly by aborting with a user-facing error) and that includes the error returned by close.
The thing I don't see is an argument against adding something akin to the C++ filestream::close. As noted previously, there are many many different failure cases. Of course, FDs on Unix are used for more than just files, and it seems that a lot of the conversation here has been restricted to files. For TcpStream, we have shutdown as something of a proxy, but it would be less unwieldy to have a proper close that can be checked for errors. Even on things presenting as files, some filesystems, especially FUSE ones, present an interface to an underlying layer that is not random-access or easily mutable (ie, s3fs) and which actually do the meaningful part of their work during close().
As noted at the very top of this bug at https://github.com/rust-lang/rust/issues/98338#issue-1278423038 , the Linux manpage specifically notes that checking the return value of close() is important. While I agree that this does not catch everything, what seems to be happening here is a lot of people suggesting that the documentation is incorrect. This strikes me as dangerous, and difficult to prove universally.
At this point I think it just comes down to a value judgement where we disagree. I think that even in a least-effort situation we should attempt to handle all the errors that the OS returns to us (possibly by aborting with a user-facing error) and that includes the error returned by close.
hmm this isn't a simple "value judgement" matter -- this is a foundational matter here.
what's the value-proposition of rustlang itself? "rust is safer AND faster" than c/c++/etc
You should rely on fsync instead.
now... by simply telling this to OP, we're saying "rust is safer XOR faster" than c/c++/etc
to summarize (not a bot here):
- for c/c++, it's 3 levels of guarantees:
  - fastest (no check)
  - fast (check close result - let's say 90%)
  - slowest but 100% (fsync)
- but for rust (status quo), it's limited to:
  - fastest (no check)
  - slowest (fsync)
... we can't be OK with this -- how can we persuade project-managers to use Rust for the next project, when they can simply point out this kind of stuff?
@devtempmac I struggle to believe "ask a filesystem to do anything, then check a register" is something that makes a difference performance-wise. And Rust does not exist to make all programs perfectly "safe". It exists to combat certain forms of common vulnerabilities. IO safety is one of them, but has little to do with write durability. If a project manager doesn't care about the benefits of memory safety or the use of an ownership model, then sure, Rust sucks I guess.
The debate here is whether a close API which returned Result<(), io::Error> would improve logical correctness in Rust programs, or if it would be ritualistic checking to no purpose in the general case. We know filesystems can experience errors in highly asynchronous ways, as that is the main reason why filesystems, especially NFS, report errors on close: it is usually an error from a previous write. Thus a synchronous API cannot model the problem without an equally-strong "barrier" like fsync. That this barrier also requires a huge performance hit is a flaw of the underlying file APIs, as described by Dan Luu: https://danluu.com/deconstruct-files/
Anything else is misleading at best, as exemplified by this thread repeatedly seeing arguments like "close results in eventual durability on most filesystems" while also referencing the Linux manpage. The manpage says this about durability:
A careful programmer who wants to know about I/O errors may precede close() with a call to fsync(2).
So, closing a file descriptor, if we take a manpage for one operating system as the authority, explicitly doesn't constitute a promise to tell you about IO errors. IO errors, if they occur, can mean we have failed to achieve write durability. By saying we want something the manpage doesn't promise, we've erased the manpage as a source of authority.
I think we should be reluctant to speculate in ways like "oh, it addresses 90% of issues" when we've erased the ability of the last 10% to even appear in our statistics.
If we want File::close (or OwnedFd::close, or whatever) because we want people to write better Rust code for file handling, a stronger argument would be a roundup of the behaviors of filesystems with write-then-close. Preferably beyond just Linux, but at least for those we do have prior research.
The prior research
- Dan Luu, Wesley Aptekar-Cassels, Kate Murphy. "Filesystem error handling". 2017. https://danluu.com/filesystem-errors/
- Haryadi S. Gunawi, Cindy Rubio-González, Andrea C. Arpaci-Dusseau, Remzi H. Arpaci-Dusseau, Ben Liblit. "EIO: Error Handling Is Occasionally Correct". 2008. https://www.usenix.org/legacy/event/fast08/tech/full_papers/gunawi/gunawi_html/index.html
- Vijayan Prabhakaran, Lakshmi N. Bairavasundaram, Nitin Agrawal, Haryadi S. Gunawi, Andrea C. Arpaci-Dusseau, and Remzi H. Arpaci-Dusseau. "IRON File Systems". 2005. https://research.cs.wisc.edu/wind/Publications/iron-sosp05.pdf
I think a simpler, stronger argument for File::close is that we can use it to expose information. Does that information have much relation to reality, or even to closing the file itself? We don't know! We can even document the function accordingly: "use this for funsies! try not to base any irreversible logic on this since it could have been an error from forever ago, just write! a message to your log, mmk? if you wanted to really handle issues, then you would have sync_all'd and then... well, probably just panic if that fails, or go back to some checkpoint, since that's all databases can do".
That all said, if someone wants to actually present their reasoning for adding new API instead of dithering about whether this issue is a bug or not, the appropriate thing to do is to open an ACP by filing a new issue over here: https://github.com/rust-lang/libs-team/issues
Opened an API change proposal: https://github.com/rust-lang/libs-team/issues/705.