# sh deadlocking in child fork process due to logging
I recently had an issue where some tests I have, which were invoking sh with `_bg=True`, were deadlocking. After investigating, I noticed this was happening inside the sh library, and is related to logging inside an at-fork handler. This is a minimal reproducible example of the issue:
```python
# Run using python3.9 -mpytest --full-trace -o log_cli=true -s -vv --log-cli-level=DEBUG repro.py
import logging
import os

import sh


def test_sh_log_in_child_fork():
    logger = logging.getLogger()
    os.register_at_fork(after_in_child=lambda: logger.debug("in child"))
    procs = [sh.Command("true")(_bg=True) for _ in range(10)]
    for proc in procs:
        proc.wait()
```
Hitting this issue does require a bit of scheduling bad luck: I think the background thread needs to be interrupted while the logging lock is held (my guess is that this happens while a log line is being formatted), and then `fork()` needs to be called from the main thread.

In practice, though, this seems to not be that rare. In my testing the above reproduction hits it every time, with only 10 iterations. I'm guessing some of the objects being logged take some time to format? I also think that instead of using `os.register_at_fork()` you could shoot yourself in the foot via the `preexec_fn` argument.
I am not an expert on Python internals, but generally calling `fork()` from a multi-threaded application has some pretty concerning risks. It's realistically not going to be possible to avoid the issue entirely, since users will always be able to cause this themselves, but I think the sh library's background thread is making this problem far more likely than it needs to be. Can this thread avoid logging / other lock-holding operations?

I think it's also worth noting that at-fork handlers were only added in Python 3.7, so they may not be that common yet.
Thanks for the detailed report!
The docs you linked to state that:

> These calls are only made if control is expected to return to the Python interpreter.
However, that's not actually what happens in this case, since the `after_in_child` handler is called even though we start `true` as a background process. I failed to reproduce the issue with `sleep 1` instead of `true`, so it's possible that the issue only surfaces when the child process in question exits very fast.
Out of curiosity, what leads you to think that the sh library's background thread causes this problem?
> The docs you linked to state that:
>
> > These calls are only made if control is expected to return to the Python interpreter.
>
> However, that's not actually what happens in this case, since the `after_in_child` handler is called even though we start `true` as a background process. I failed to reproduce the issue with `sleep 1` instead of `true`, so it's possible that the issue only surfaces when the child process in question exits very fast.
These pre/post fork calls are made because `os.fork` is used instead of `subprocess`. subprocess does skip these calls, because it 'knows' that after `fork` no Python code is executed before `exec` (unless you specify a `preexec_fn` - see below for the warning associated with that), and it only knows this because it has the fork/exec logic implemented in C. In sh it does need to return to the Python interpreter between fork and exec:
https://github.com/python/cpython/blob/22b75d9bef1bffe82bfa1adfcbec0243c9202041/Modules/_posixsubprocess.c#L768-L774
I can't follow how `os.fork` is registered in CPython, but I'm guessing everything invoking plain `os.fork` is assumed to return back to Python. I think using subprocess, or directly invoking this `_posixsubprocess.fork_exec` method, is the only way to start a new process without risking deadlocks.
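To make the distinction concrete, here is a minimal sketch (not sh's actual code) contrasting the two approaches: a Python-level fork/exec, where interpreter bytecode (and therefore at-fork handlers and any locks they need) runs in the child, versus `subprocess`, which performs the fork+exec in C:

```python
import os
import subprocess

# Python-level fork/exec, similar in spirit to what sh does. Between
# fork() and execv() the child runs interpreted Python, so
# os.register_at_fork(after_in_child=...) handlers fire and CPython may
# need locks (logging, imports) that a now-absent thread was holding.
pid = os.fork()
if pid == 0:
    os.execv("/usr/bin/true", ["true"])
os.waitpid(pid, 0)

# subprocess does the fork+exec inside _posixsubprocess.fork_exec (C),
# so no Python code runs in the child and the at-fork handlers are skipped.
subprocess.run(["/usr/bin/true"], check=True)
```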
> Out of curiosity, what leads you to think that the sh library's background thread causes this problem?
I had a stack trace which showed the child process stuck inside a logging-related lock, and the background thread also takes that lock. I was able to reproduce this without sh, just by having a background thread that also logs.
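For reference, a sketch of that kind of sh-free reproduction, under the same assumptions (a second thread logging constantly, and a fork from the main thread). Whether it actually deadlocks depends on timing and on the Python version, since newer CPython releases reinitialize the logging locks after fork:

```python
import logging
import os
import threading

logging.basicConfig(level=logging.DEBUG)
logger = logging.getLogger()

def spam():
    # Keep acquiring the logging handler lock so that fork() is likely
    # to happen while this thread is holding it.
    while True:
        logger.debug("background noise %s", "x" * 4096)

threading.Thread(target=spam, daemon=True).start()

for _ in range(100):
    pid = os.fork()
    if pid == 0:
        # If fork() landed while the spam thread held the lock, the lock
        # owner doesn't exist in the child and this can block forever.
        logger.debug("in child")
        os._exit(0)
    os.waitpid(pid, 0)
```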
Thanks for the repro script and report, it was very helpful.
> I am not an expert on Python internals, but generally calling `fork()` from a multi-threaded application has some pretty concerning risks.
We can't control whether the application using sh is multi-threaded or not, so sh must assume multi-threading in any case. I believe the main risk that sh needs to be concerned about is holding a lock before fork and then expecting the lock to be released in the child. The script that you posted demonstrates this problem. What happens in the child between fork and exec is very minimal: mostly closing file descriptors, setting process ids, and setting signals. There is very little risk of the child messing with anything threading- or lock-related, unless of course a user uses an escape hatch like `os.register_at_fork` or `_preexec_fn`. Using those hatches means you're doing pretty advanced stuff and should be prepared for dragons.
I'm inclined to say that the fix here is "don't do that". Although I can see how using logging within the child, before exec, could be a reasonable thing to expect, it's not sh's responsibility to let users know that the logging module uses locks. I'm not even sure how we could communicate that risk back to the user, outside of a documentation change?
The only alternative that I see is to disable sh's logging altogether by default, unless a `_debug_logging` argument is passed explicitly to a command. That way sh doesn't do any logging, and this deadlock-in-child risk goes away. It's not a bad idea imo, since I've suspected that all the extra sh logging has a performance impact.
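In the meantime, a possible user-side mitigation is to silence sh's log records by level. This is a sketch, assuming sh's loggers live under the "sh" logger name; it only narrows the window, since a disabled record skips most of the handler locking, but it is not a guarantee:

```python
import logging

# Raise the level on the "sh" logger so sh's DEBUG/INFO records are
# rejected early, before any handler locks would be taken.
logging.getLogger("sh").setLevel(logging.WARNING)
```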
In Python 3.12, a DeprecationWarning is issued in `os.fork()` when the forking process has multiple threads.
There is a well-referenced discussion linked in the docs, but the gist is: if a process that has multiple threads is forking, only the C API functions that are "async-signal-safe" can be used in the child[^1]. CPython does not guarantee it will only use those functions (e.g. it can call malloc), therefore forking while having multiple threads is unsafe.
I don't know what should be done about this (can sh be rewritten to not fork?), but I felt that this was relevant to this issue.
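As a sketch of what that warning looks like in practice (assuming Python 3.12+; the second thread just has to be alive at the moment of the fork):

```python
import os
import threading

# Keep a second thread alive so the process is multi-threaded at fork time.
stop = threading.Event()
threading.Thread(target=stop.wait, daemon=True).start()

# On Python 3.12+ this emits a DeprecationWarning along the lines of:
# "This process (pid=...) is multi-threaded, use of fork() may lead to
# deadlocks in the child."
pid = os.fork()
if pid == 0:
    os._exit(0)
os.waitpid(pid, 0)
stop.set()
```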
Now, does CPython actually call any unsafe functions in the child, specifically while sh is forking but the child has not yet execed? It very much does:
```sh
strace -o threaded_fork.strace -ff python -c "import sh; sh.true()"
cat $(grep -l 'execve("/usr/bin/true' threaded_fork.strace.*)
# rm threaded_fork.strace.*
```
On Linux running CPython 3.11.6 I get the following trace:
```
set_robust_list(0x74314b494a20, 24) = 0
gettid() = 2805932
clock_gettime(CLOCK_MONOTONIC, {tv_sec=503190, tv_nsec=874804781}) = 0
clock_gettime(CLOCK_MONOTONIC, {tv_sec=503190, tv_nsec=875002427}) = 0
clock_gettime(CLOCK_MONOTONIC, {tv_sec=503190, tv_nsec=875625044}) = 0
clock_gettime(CLOCK_MONOTONIC, {tv_sec=503190, tv_nsec=876523667}) = 0
getrandom("\xf2\x4b\x3c\xd4\x7d\xa1\x3f\xa2\xa0\x35\x69\xa0\x95\x28\x56\x45\x7d\x3d\x52\xa9\x8a\x1f\x0a\x49\x4c\xc1\x97\x41\x23\xe5\x54\x23"..., 2496, GRND_NONBLOCK) = 2496
fcntl(12, F_GETFD) = 0x1 (flags FD_CLOEXEC)
fcntl(12, F_SETFD, FD_CLOEXEC) = 0
rt_sigaction(SIGPIPE, {sa_handler=SIG_DFL, sa_mask=[], sa_flags=SA_RESTORER|SA_ONSTACK, sa_restorer=0x74314ac5c710}, {sa_handler=SIG_IGN, sa_mask=[], sa_flags=SA_RESTORER|SA_ONSTACK, sa_restorer=0x74314ac5c710}, 8) = 0
getsid(0) = 5767
getpgid(0) = 2805923
write(10, "5767,2805923", 12) = 12
ioctl(6, TCGETS, {c_iflag=ICRNL|IXON, c_oflag=NL0|CR0|TAB0|BS0|VT0|FF0|OPOST|ONLCR, c_cflag=B38400|CS8|CREAD, c_lflag=ISIG|ICANON|ECHO|ECHOE|ECHOK|IEXTEN|ECHOCTL|ECHOKE, ...}) = 0
ioctl(6, TCGETS, {c_iflag=ICRNL|IXON, c_oflag=NL0|CR0|TAB0|BS0|VT0|FF0|OPOST|ONLCR, c_cflag=B38400|CS8|CREAD, c_lflag=ISIG|ICANON|ECHO|ECHOE|ECHOK|IEXTEN|ECHOCTL|ECHOKE, ...}) = 0
ioctl(6, TCSETSF, {c_iflag=, c_oflag=NL0|CR0|TAB0|BS0|VT0|FF0|ONLCR, c_cflag=B38400|CS8|CREAD, c_lflag=ECHOE|ECHOK|ECHOCTL|ECHOKE, ...}) = 0
close(4) = 0
close(5) = 0
close(7) = 0
close(9) = 0
close(11) = 0
dup2(3, 0) = 0
dup2(6, 1) = 1
dup2(8, 2) = 2
ioctl(1, TIOCSWINSZ, {ws_row=24, ws_col=80, ws_xpixel=0, ws_ypixel=0}) = 0
openat(AT_FDCWD, "/dev/fd", O_RDONLY|O_NONBLOCK|O_CLOEXEC|O_DIRECTORY) = 4
newfstatat(4, "", {st_mode=S_IFDIR|0500, st_size=9, ...}, AT_EMPTY_PATH) = 0
brk(0x5a1c358cb000) = 0x5a1c358cb000
getdents64(4, 0x5a1c358a2200 /* 11 entries */, 32768) = 264
getdents64(4, 0x5a1c358a2200 /* 0 entries */, 32768) = 0
close(4) = 0
close(3) = 0
close(4) = -1 EBADF (Bad file descriptor)
close(6) = 0
close(8) = 0
close(10) = 0
execve("/usr/bin/true", ["/usr/bin/true"], 0x7fff2b157738 /* 84 vars */) = 0
(rest omitted)
```
Out of those calls:

- `gettid` is not POSIX, but `getpid` is async-signal-safe, so I don't see why it wouldn't be also;
- `getrandom` is not POSIX, could be unsafe but I don't know;
- `rt_sigaction` -- `sigaction` is async-signal-safe, and `rt_sigaction` is the syscall used for it when the kernel provides it;
- `getsid` is not async-signal-safe, but we're calling it on ourselves so maybe it's fine?;
- `getpgid` is not async-signal-safe, but `getpgrp` is and it's probably the same syscall;
- `ioctl` is not async-signal-safe;
- `newfstatat` is `fstatat`, which is async-signal-safe;
- `brk` is no longer POSIX (removed in POSIX.1-2001), probably not async-signal-safe since it's memory stuff, likely this part of code;
- `getdents64` is `listdir`, which is not async-signal-safe, it's this part of code;
- ... and the rest are async-signal-safe.
Basically, this was a long way of saying that while the child code after the fork is indeed minimal, it's still unsafe according to POSIX. Whether that's something that needs addressing is another matter -- it seems to work fine in practice and that's what really matters, but perhaps it's something you might want to look into.
Cheers, and thank you for this amazing library!
[^1]: After a fork() in a multithreaded program, the child can safely call only async-signal-safe functions [...] until such time as it calls execve(). -- fork(2) - Linux manual page
@micha030201 Thank you for the deep dive and the excellent summary!
Curious if this is related to the issue I just saw after upgrading to the new PyCharm 2024.1. Code using sh and `_async=True`, which previously worked without issue, suddenly fails when the debugger is active, with the output:
File "<redacted>/lib/python3.12/site-packages/sh.py", line 2160, in __init__
sid, pgid = (
^^^^^^^^^
ValueError: not enough values to unpack (expected 2, got 1)
It appears to be reading the sid/pgid immediately after an `os.fork` call, and the `read()` call returns an empty string in this environment, so sh explodes.
I can work around this by downgrading PyCharm or by not activating the debugger. But it seems like there is something unhappy in the intersection of async/fork/sh.
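For context, here is a rough sketch of the handshake that the traceback and the strace output above (`write(10, "5767,2805923", 12)`) point at. This is not sh's actual code, just an illustration of why an empty read produces exactly that ValueError:

```python
import os

r, w = os.pipe()
pid = os.fork()
if pid == 0:
    # Child: report "sid,pgid" to the parent, then exec (omitted here).
    os.write(w, f"{os.getsid(0)},{os.getpgid(0)}".encode())
    os._exit(0)

os.close(w)  # parent keeps only the read end
data = os.read(r, 64).decode()
# If the read comes back empty (e.g. the child exited without writing, or
# a debugger interfered with the inherited fds), split(",") yields [""]:
# ValueError: not enough values to unpack (expected 2, got 1)
sid, pgid = data.split(",")
os.waitpid(pid, 0)
```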