tokio icon indicating copy to clipboard operation
tokio copied to clipboard

Add pidfd-based implementation of `tokio::process`

Open Aaron1011 opened this issue 3 years ago • 14 comments

Is your feature request related to a problem? Please describe. The current unix implementation of tokio::process relies on handling SIGCHLD signals. This requires that the corresponding signal handler be properly set up - if another library in the process modifies it, it will be impossible to properly wait on child processes.

Describe the solution you'd like On recent versions of Linux, it's possible to use pidfds to avoid this problem. Using the recently merged libstd support (https://github.com/rust-lang/rust/pull/81825), we can create a pidfd at the same time that a child is spawned. The pidfd can then be waited on using epoll or select (with an optional timeout).

If the running kernel version does not support pidfds, we can fall back to the existing unix implementation.

Describe alternatives you've considered Do nothing, and continue to use the unix implementation in all cases.

Aaron1011 avatar Aug 02 '21 15:08 Aaron1011

@ipetkov

Darksonn avatar Aug 02 '21 15:08 Darksonn

Worth noting that we may need to wait for our MSRV includes this change before we can roll it out as a feature

ipetkov avatar Aug 03 '21 18:08 ipetkov

I think this could be gated behind some kind of nightly feature to allow users to get early access to it.

Aaron1011 avatar Aug 03 '21 19:08 Aaron1011

The feature has already been around in a crate, so there's no need for wait for a Rust compiler update. Trivial to implement by scratch as well.

mmstick avatar Aug 05 '21 14:08 mmstick

Getting the full benefits of this feature (eliminating the issue of pidfd wraparound, and removing the dependency on signal handlers) requires going through the standard library, so that we can obtain a pidfd at the same time the child process is spawned.

Aaron1011 avatar Aug 05 '21 16:08 Aaron1011

It is safe to obtain a pidfd using the pidfd_open syscall right after child creation. PID can only be recycled after the child has been wait-ed for. AFAIK the only downside is kernel support: clone has CLONE_PIDFD flag since 5.2, and pidfd_open is only available in 5.3

MikailBag avatar Aug 08 '21 14:08 MikailBag

PID can only be recycled after the child has been wait-ed for.

Yes, but calling wait(-1) will wait on the children of all threads, not just the current thread. If a Rust program includes a C library that calls wait (or if the Rust program is used as a library in a larger C program), then PID recycling is no longer under the control of the Rust code.

I recognize that this is an edge case - however, recent versions of Linux now have all of the tools needed to wait for child processes in a truly 'self-contained' manner, which does not need to make any assumptions about other code running in the same process. I think it would be great to be able to take advantage of that in Tokio, when the kernel version in use supports it.

Aaron1011 avatar Aug 22 '21 20:08 Aaron1011

Getting the full benefits of this feature (eliminating the issue of pidfd wraparound, and removing the dependency on signal handlers) requires going through the standard library, so that we can obtain a pidfd at the same time the child process is spawned.

There's nothing that they can do that isn't already being done in the two pidfd crates. You obtain a pidfd after a process has been spawned, the kernel guarantees that the child's PID will remain valid until three child is reaped by waiting on the process.

I use this crate in a lot of my projects and have never had to use signal handlers.

mmstick avatar Aug 22 '21 22:08 mmstick

Which Rust version was this released in?

Darksonn avatar Aug 23 '21 06:08 Darksonn

There's nothing that they can do that isn't already being done in the two pidfd crates. You obtain a pidfd after a process has been spawned, the kernel guarantees that the child's PID will remain valid until three child is reaped by waiting on the process.

The reaping can happen by some other code outside of the user's control (e.g. a background thread). Creating the pidfd along with the process is the only approach that is guaranteed to be free from race conditions in the face of additional calls to waitpid() outside of the Rust code's control.

Aaron1011 avatar Feb 09 '22 02:02 Aaron1011

Tokio already has its own File, Command, and Child types, so I don't see how this would require std support. Tokio can create the pidfd at the same it creates the child (or after because that's perfectly fine), with a stable version of Rust from 1+ years ago.

mmstick avatar Feb 09 '22 02:02 mmstick

The Command struct is a wrapper around std::process::Command, and uses the builtin spawn method:

https://github.com/tokio-rs/tokio/blob/cf38ba627abb0d44ae7e7dc0659ddd51866e9efa/tokio/src/process/unix/mod.rs#L103

As long as the standard library spawn method is used, we need to use the standard library support to create the pidfd at the same time.

Aaron1011 avatar Feb 09 '22 02:02 Aaron1011

We don't have to rely on the standard library to add convenience methods for this. It could take another year or two before they stabilize it, and longer to get into a stable release.

We can do what you want with functionality already readily available. Because this is a Linux-only feature, we don't have to avoid using the CommandExt unix module for access to pre_exec, and we don't have to shy away from using libc for the pidfd and clone3 syscalls.

What Tokio is already doing is already unreliable. Using SIGCHLD handlers to monitor a process is even worse than upgrading the PID to a PidFd after it has spawned.

mmstick avatar Feb 09 '22 12:02 mmstick

Note that using a pidfd will cause libstd to fall back to the relatively slow fork/exec over the faster posix_spawn. The difference is quite measurable (2-3x for the clap repo, for example) in nextest.

sunshowers avatar Aug 14 '22 00:08 sunshowers

I believe https://github.com/tokio-rs/tokio/pull/6152 addresses this, in a way that still uses vfork.

sunshowers avatar Feb 04 '24 00:02 sunshowers