void-runit icon indicating copy to clipboard operation
void-runit copied to clipboard

Tidy up runsvdir symlinks during runit phase 1, so that the preparing…

Open datenwolf opened this issue 5 years ago • 7 comments

… execution of runsvchdir in phase 2 doesn't choke, if those symlinks are broken.


I just stumbled over my system not being able to fully enter runit phase 2, due to me messing around with the service directories and leaving behind broken current and default symlinks. This little patch adds extra cleanup that tidies up by always entering phase 2 with current → default and previous being removed.

datenwolf avatar Aug 01 '20 13:08 datenwolf

Is there a chance this sort of issue could happen in usual usage? More importantly, how does this work if the file system is read only?

ericonr avatar Aug 01 '20 21:08 ericonr

I had a lengthy response typed out, in the browser, then wanted to test something with runit, which promptly killed my user session, and the stuff I wrote. Don't want to type it out again, so I'll keep it brief, feel free to ask for clarification…

Is there a chance this sort of issue could happen in usual usage?

Yes, if you're cleaning up old runlevels and are clumsy about it. Happens usually in already stressfull situations and you want to remove that potential footgun. Keep in mind that this kind of situation will also prevent the selection of the destination runlevel via the kernel command line. This is a problem, if that's your default path of recovering from boot time issues that happen after the kernel launched PID=1.

More importantly, how does this work if the file system is read only?

It will lead to a couple of error messages at boot and that's it. Zero negative impact. However it should be noted that runsvchdir itself can't operate if /etc/runit/runsvdir resides on a readonly filesystem (so having / readonly will also prevent kernel command line target runlevel selection!).

Strictly speaking not initializing /etc/runit/runsvdir to a well defined state upon boot makes graceful error recovery unneccesarily hard. This in fact also applies to the read-only situation. The actual solution would be to have runsvchdir create the symlinks directly in /run/runit/runsvdir dereferencing to the directories in /etc/runit/runsvdir. That would also in the same stroke deal with the broken/dangling current/previous symlink problem I seeked to address.

Maybe we can do some weird bind-mounting of symlinks. But the more I think about it, the more I'd rather see runsvchdir to be modified.

datenwolf avatar Aug 03 '20 15:08 datenwolf

I don't have a strong opinion on this.

But, punish everyone with unlink(2) and link(2) on every startup because some people messed up is not nice.

And this will break stateless /etc

sgn avatar Aug 03 '20 16:08 sgn

This isn't the right fix. What you ought to do is

ln -sf "${runlevel}" /etc/runit/runsvdir/current

in 2 in place of the runsvchdir "${runlevel}" that is there now. We aren't really changing the runlevel in 2, we are setting it, so the migration that is imposed in runsvchdir is inappropriate. Also, runsvchdir verifies the existence of the target runlevel directory, but we've already done that in the cmdline loop in 2.

If you really care about a broken previous symlink, you could add rm -f /etc/runit/runsvdir/previous (there is no need to test for existence and remove, just force remove) right before the current symlink is made, but why bother? A broken previous symlink is irrelevant.

ahesford avatar Aug 03 '20 16:08 ahesford

But, punish everyone with unlink(2) and link(2)

How is that in any way punishing? Those syscalls are dirt cheap when done on symlinks.

And this will break stateless /etc

It doesn't take my patch for do this, it's already happening as part of core-services: https://github.com/void-linux/void-runit/blob/8ab6d402b5d6d134dc3ac59742220107911d7750/core-services/99-cleanup.sh#L10

datenwolf avatar Aug 03 '20 16:08 datenwolf

On 2020-08-03 09:11:00-0700, datenwolf [email protected] wrote:

But, punish everyone with unlink(2) and link(2)

How is that in any way punishing? Those syscalls are dirt cheap when done on symlinks.

And this will break stateless /etc

It doesn't take my patch for do this, it's already happening as part of core-services: https://github.com/void-linux/void-runit/blob/8ab6d402b5d6d134dc3ac59742220107911d7750/core-services/99-cleanup.sh#L10

Ah, yes, I was wrong. ahesford's suggestion would be better nonetheless.

-- Danh

sgn avatar Aug 03 '20 23:08 sgn

On 2020-08-03 09:10:28-0700, "Andrew J. Hesford" [email protected] wrote:

This isn't the right fix. What you ought to do is

ln -sf "${runlevel}" /etc/runit/runsvdir/current

in 2 in place of the runsvchdir "${runlevel}" that is there now. We aren't really changing the runlevel in 2, we are setting it, so the migration that is imposed in runsvchdir is inappropriate. Also, runsvchdir verifies the existence of the target runlevel directory, but we've already done that in the cmdline loop in 2.

If you really care about a broken previous symlink, you could add rm -f /etc/runit/runsvdir/previous (there is no need to test for existence and remove, just force remove) right before the current symlink is made, but why bother? A broken previous symlink is irrelevant.

previous symlink doesn't matter, anyway, it's used as placeholder for the old current (to rename back) in case of runsvchdir failes to rename current.new to current.

-- Danh

sgn avatar Aug 04 '20 01:08 sgn