void-runit
void-runit copied to clipboard
Tidy up runsvdir symlinks during runit phase 1, so that the preparing…
… execution of runsvchdir in phase 2 doesn't choke, if those symlinks are broken.
I just stumbled over my system not being able to fully enter runit phase 2, due to me messing around with the service directories and leaving behind broken current and default symlinks. This little patch adds extra cleanup that tidies up by always entering phase 2 with current → default and previous being removed.
Is there a chance this sort of issue could happen in usual usage? More importantly, how does this work if the file system is read only?
I had a lengthy response typed out, in the browser, then wanted to test something with runit, which promptly killed my user session, and the stuff I wrote. Don't want to type it out again, so I'll keep it brief, feel free to ask for clarification…
Is there a chance this sort of issue could happen in usual usage?
Yes, if you're cleaning up old runlevels and are clumsy about it. Happens usually in already stressfull situations and you want to remove that potential footgun. Keep in mind that this kind of situation will also prevent the selection of the destination runlevel via the kernel command line. This is a problem, if that's your default path of recovering from boot time issues that happen after the kernel launched PID=1.
More importantly, how does this work if the file system is read only?
It will lead to a couple of error messages at boot and that's it. Zero negative impact. However it should be noted that runsvchdir itself can't operate if /etc/runit/runsvdir resides on a readonly filesystem (so having / readonly will also prevent kernel command line target runlevel selection!).
Strictly speaking not initializing /etc/runit/runsvdir to a well defined state upon boot makes graceful error recovery unneccesarily hard. This in fact also applies to the read-only situation. The actual solution would be to have runsvchdir create the symlinks directly in /run/runit/runsvdir dereferencing to the directories in /etc/runit/runsvdir. That would also in the same stroke deal with the broken/dangling current/previous symlink problem I seeked to address.
Maybe we can do some weird bind-mounting of symlinks. But the more I think about it, the more I'd rather see runsvchdir to be modified.
I don't have a strong opinion on this.
But, punish everyone with unlink(2) and link(2) on
every startup because some people messed up is not nice.
And this will break stateless /etc
This isn't the right fix. What you ought to do is
ln -sf "${runlevel}" /etc/runit/runsvdir/current
in 2 in place of the runsvchdir "${runlevel}" that is there now. We aren't really changing the runlevel in 2, we are setting it, so the migration that is imposed in runsvchdir is inappropriate. Also, runsvchdir verifies the existence of the target runlevel directory, but we've already done that in the cmdline loop in 2.
If you really care about a broken previous symlink, you could add rm -f /etc/runit/runsvdir/previous (there is no need to test for existence and remove, just force remove) right before the current symlink is made, but why bother? A broken previous symlink is irrelevant.
But, punish everyone with
unlink(2)andlink(2)
How is that in any way punishing? Those syscalls are dirt cheap when done on symlinks.
And this will break stateless
/etc
It doesn't take my patch for do this, it's already happening as part of core-services: https://github.com/void-linux/void-runit/blob/8ab6d402b5d6d134dc3ac59742220107911d7750/core-services/99-cleanup.sh#L10
On 2020-08-03 09:11:00-0700, datenwolf [email protected] wrote:
But, punish everyone with
unlink(2)andlink(2)How is that in any way punishing? Those syscalls are dirt cheap when done on symlinks.
And this will break stateless
/etcIt doesn't take my patch for do this, it's already happening as part of
core-services: https://github.com/void-linux/void-runit/blob/8ab6d402b5d6d134dc3ac59742220107911d7750/core-services/99-cleanup.sh#L10
Ah, yes, I was wrong. ahesford's suggestion would be better nonetheless.
-- Danh
On 2020-08-03 09:10:28-0700, "Andrew J. Hesford" [email protected] wrote:
This isn't the right fix. What you ought to do is
ln -sf "${runlevel}" /etc/runit/runsvdir/currentin
2in place of therunsvchdir "${runlevel}"that is there now. We aren't really changing the runlevel in2, we are setting it, so the migration that is imposed inrunsvchdiris inappropriate. Also,runsvchdirverifies the existence of the target runlevel directory, but we've already done that in thecmdlineloop in2.If you really care about a broken
previoussymlink, you could addrm -f /etc/runit/runsvdir/previous(there is no need to test for existence and remove, just force remove) right before thecurrentsymlink is made, but why bother? A brokenprevioussymlink is irrelevant.
previous symlink doesn't matter, anyway,
it's used as placeholder for the old current (to rename back)
in case of runsvchdir failes to rename current.new to current.
-- Danh