tokio
tokio copied to clipboard
Add pidfd-based implementation of `tokio::process`
Is your feature request related to a problem? Please describe.
The current unix implementation of tokio::process
relies on handling SIGCHLD
signals. This requires that the corresponding signal handler be properly set up - if another library in the process modifies it, it will be impossible to properly wait on child processes.
Describe the solution you'd like
On recent versions of Linux, it's possible to use pidfd
s to avoid this problem. Using the recently merged libstd support (https://github.com/rust-lang/rust/pull/81825), we can create a pidfd at the same time that a child is spawned. The pidfd can then be waited on using epoll
or select
(with an optional timeout).
If the running kernel version does not support pidfds, we can fall back to the existing unix implementation.
Describe alternatives you've considered Do nothing, and continue to use the unix implementation in all cases.
@ipetkov
Worth noting that we may need to wait for our MSRV includes this change before we can roll it out as a feature
I think this could be gated behind some kind of nightly
feature to allow users to get early access to it.
The feature has already been around in a crate, so there's no need for wait for a Rust compiler update. Trivial to implement by scratch as well.
Getting the full benefits of this feature (eliminating the issue of pidfd wraparound, and removing the dependency on signal handlers) requires going through the standard library, so that we can obtain a pidfd at the same time the child process is spawned.
It is safe to obtain a pidfd using the pidfd_open
syscall right after child creation. PID can only be recycled after the child has been wait
-ed for. AFAIK the only downside is kernel support: clone has CLONE_PIDFD
flag since 5.2, and pidfd_open
is only available in 5.3
PID can only be recycled after the child has been wait-ed for.
Yes, but calling wait(-1)
will wait on the children of all threads, not just the current thread. If a Rust program includes a C library that calls wait
(or if the Rust program is used as a library in a larger C program), then PID recycling is no longer under the control of the Rust code.
I recognize that this is an edge case - however, recent versions of Linux now have all of the tools needed to wait for child processes in a truly 'self-contained' manner, which does not need to make any assumptions about other code running in the same process. I think it would be great to be able to take advantage of that in Tokio, when the kernel version in use supports it.
Getting the full benefits of this feature (eliminating the issue of pidfd wraparound, and removing the dependency on signal handlers) requires going through the standard library, so that we can obtain a pidfd at the same time the child process is spawned.
There's nothing that they can do that isn't already being done in the two pidfd crates. You obtain a pidfd after a process has been spawned, the kernel guarantees that the child's PID will remain valid until three child is reaped by waiting on the process.
I use this crate in a lot of my projects and have never had to use signal handlers.
Which Rust version was this released in?
There's nothing that they can do that isn't already being done in the two pidfd crates. You obtain a pidfd after a process has been spawned, the kernel guarantees that the child's PID will remain valid until three child is reaped by waiting on the process.
The reaping can happen by some other code outside of the user's control (e.g. a background thread). Creating the pidfd along with the process is the only approach that is guaranteed to be free from race conditions in the face of additional calls to waitpid()
outside of the Rust code's control.
Tokio already has its own File, Command, and Child types, so I don't see how this would require std support. Tokio can create the pidfd at the same it creates the child (or after because that's perfectly fine), with a stable version of Rust from 1+ years ago.
The Command
struct is a wrapper around std::process::Command
, and uses the builtin spawn
method:
https://github.com/tokio-rs/tokio/blob/cf38ba627abb0d44ae7e7dc0659ddd51866e9efa/tokio/src/process/unix/mod.rs#L103
As long as the standard library spawn
method is used, we need to use the standard library support to create the pidfd at the same time.
We don't have to rely on the standard library to add convenience methods for this. It could take another year or two before they stabilize it, and longer to get into a stable release.
We can do what you want with functionality already readily available. Because this is a Linux-only feature, we don't have to avoid using the CommandExt unix module for access to pre_exec, and we don't have to shy away from using libc for the pidfd and clone3 syscalls.
What Tokio is already doing is already unreliable. Using SIGCHLD handlers to monitor a process is even worse than upgrading the PID to a PidFd after it has spawned.
Note that using a pidfd will cause libstd to fall back to the relatively slow fork
/exec
over the faster posix_spawn
. The difference is quite measurable (2-3x for the clap repo, for example) in nextest.
I believe https://github.com/tokio-rs/tokio/pull/6152 addresses this, in a way that still uses vfork.