systemd killall: resume storage daemons

killall causes a traversal of /proc. When the root filesystem is backed by a storage daemon (fuse, NBD), and said storage daemon is suspended (SIGSTOP), a deadlock may occur. Resume storage daemons before invoking killall.

Mar 27 '22 09:03 achille-fouilleul

Hmm, why bother? If you suspend your root storage daemons you kinda are asking the system pretty explicitly that you want the system to hang?

Mar 28 '22 12:03 poettering

I am not convinced we want this. It seems entirely unnecessary? root storage daemons are outside of our management, and I see no case for changing that for a case that doesn't happen unless the user kinda asks for something crazy. The complexity and code this adds appears entirely unwarranted to me.

Hence, can you explain why we should bother with this?

Mar 28 '22 12:03 poettering

I am running a custom system where the root filesystem is backed by a fuse daemon (https://systemd.io/ROOT_STORAGE_DAEMONS/). During shutdown or suspend the system freezes consistently. To diagnose the issue I added log statements until it was clear that the lock-up occurred while in the killall function. As stated in the log message, since killall traverses the /proc directory it needs the fuse daemon to respond to readdir requests on /. The daemon cannot service these requests if it is suspended.

Mar 28 '22 13:03 achille-fouilleul

but why was that fuse daemon suspended? if you susped your fuse daemon things will hang, that's hardly surprising? And maybe the lesson to learn there is not to suspend the fuse daemon if you need it?

Mar 28 '22 13:03 poettering

but why was that fuse daemon suspended?

kill(-1, SIGSTOP) in broadcast_signal does it, right before calling killall.

Mar 28 '22 13:03 achille-fouilleul

hmm, i see, indeed. but it sounds wrong to resume stuff that probably shouldn't have been stopped in the first place.

it appears to me the fix should be to call killall() with SIGSTOP first, instead off kill(-1, SIGSTOP). And similar for resume.

Mar 28 '22 13:03 poettering

kill(-1, ...) is nice because it is atomic, I think; I'm concerned using killall to suspend the processes could miss some processes, if some process forks while killall enumerates /proc.

Mar 28 '22 14:03 achille-fouilleul

kill(-1, ...) is nice because it is atomic, I think

It's not though. The interface has this nice atomic smell to it, but in the kernel it's just a loop that races against everything else the same way as you do it in userspace.

Mar 28 '22 14:03 poettering

kill(-1, ...) is nice because it is atomic, I think

It's not though. The interface has this nice atomic smell to it, but in the kernel it's just a loop that races against everything else the same way as you do it in userspace.

The linux implementation of kill(-1, signal) holds tasklist_lock for reading for the loop duration.

https://github.com/torvalds/linux/blob/v5.17/kernel/signal.c#L1599-L1619

Mar 28 '22 16:03 vcaputo

The linux implementation of kill(-1, signal) holds tasklist_lock for reading for the loop duration.

https://github.com/torvalds/linux/blob/v5.17/kernel/signal.c#L1599-L1619

interesting...

Apr 01 '22 15:04 poettering

so, the issue is traversal of the root fs, right?

so what about this: open an fd to /proc/ first, and only use that for traversal. That way we should not need to open / again. Wouldn#t that suffice?

Apr 01 '22 15:04 poettering

open an fd to /proc/ first, and only use that for traversal. That way we should not need to open / again.

I had the same idea a couple of minutes ago. I will try to find some time to try this this weekend and let you know.

Apr 01 '22 15:04 achille-fouilleul

ping?

Aug 03 '22 09:08 poettering

I have been working on this on and off (more off than on TBF). I realized along the way that a path other than /proc/* was accessed (/etc/initrd-release). I like the openat approach to access /proc but it has led to a larger change: https://github.com/achille-fouilleul/systemd/tree/afoui/killall-open-proc-before-suspending-all-processes I ran my tests using a casync fuse mount. What I would like to do now is to have the code exercised by the test suite.

Aug 03 '22 19:08 achille-fouilleul

systemd systemd copied to clipboard

killall: resume storage daemons

systemd
systemd copied to clipboard