make it harder to overmount /

Open aep opened this issue 6 months ago • 2 comments

There's an ongoing issue in systemd, https://github.com/systemd/systemd/issues/29559, that indicates lots of people accidentally mount two datasets on /. This is difficult to detect on Linux because the kernel doesn't really care, but a lot of userspace stops working.

The optimal solution would be to refuse mounting a second /, but that's likely not a very popular idea. The second-best thing would be to at least throw a big fat warning into dmesg so a sysadmin can detect the underlying problem more easily.

The second thing is trivial to implement. I can send a patch if there's consensus.

aep avatar Jun 03 '25 07:06 aep

I have some sympathy, and I'm not a fan of finger-pointing, but I really don't see how this is anything other than a systemd problem. systemd/systemd#37086 seems to indicate that it's a problem even when OpenZFS isn't in the mix. So I'm not super keen on working around their bug, at least until they've concluded that they won't fix it.

However, if there's more of an at-a-distance issue, say, OpenZFS itself mounted two datasets on top of each other and the user expressly did not want or expect that, then I'm more interested in that. The systemd report is very long and I don't know all the moving parts involved, so a description and action focused on OpenZFS would be appreciated in this case.

robn avatar Jun 06 '25 03:06 robn

> The systemd report is very long and I don't know all the moving parts involved, so a description and action focused on OpenZFS would be appreciated in this case.

That's fair, so my argument here is that the cause AND the effect are unexpected by most users, since they're unaware of the technicalities of VFS.

I can't speculate too much about how it happens for everyone else. For me this happens regularly when I just pull a disk from a different server and import the pool, forgetting -R. Now you have two different pools with a / dataset. It also happens when you zfs recv a backup of some other / to inspect it. Either way, all of these cases involve accidentally forgetting a flag, and then you have a broken system and can't recover without a reboot. This gets even more worrying with the report that Ubuntu does some zfs magic without involving the user, so they don't even have a way to trace it back to a specific zfs operation.
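For reference, the flags in question are `zpool import -R <altroot>` and `zfs receive -u`. A minimal sketch of both cases follows; the function names, the pool name `tank`, and the path `/mnt/inspect` are hypothetical and only there for illustration.

```shell
# Import a foreign pool for inspection without touching the live root.
# -R sets an alternate root, so a dataset with mountpoint=/ is mounted
# under "$altroot" instead of on top of the running system's /.
import_for_inspection() {
    pool=$1
    altroot=$2
    zpool import -R "$altroot" "$pool"
}

# Receive a backup stream without mounting it (-u), so a received dataset
# with mountpoint=/ is not mounted over the running system at receive time.
receive_unmounted() {
    target=$1
    zfs receive -u "$target"
}

# Example (hypothetical names):
#   import_for_inspection tank /mnt/inspect
#   zfs send otherpool/ROOT@snap | receive_unmounted tank/inspect/root
```

Note that `-u` only skips the mount during the receive; the received dataset may still carry mountpoint=/, so a later `zfs mount -a` can still stack it on / unless its mountpoint or canmount property is changed first.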

The second aspect is that it breaks userland's assumptions about what / is, not just systemd's. Although systemd is of course the most aggressive in making assumptions about the system state, it isn't entirely wrong about it. There are no legitimate reasons to have two / after boot has completed. It is extremely likely just an operator mistake. Processes that are already running will have a different / than newly started processes. So for example, if you have an editor open with a file, it will write to the old /, while an editor opened after the fact will not see those changes. A system daemon writing state will get confused about why its clients see a different state. Technically these are supposed to pass fds to avoid this, but the reality is that most software just doesn't consider this case and breaks in unexpected ways.

The rationale for a dmesg warning is that having two / doesn't happen in normal operation, and the user should be assisted in finding the root cause of other software (not zfs) failing.
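Until something like that exists, the condition can be checked from userspace. This is a rough sketch, not the proposed kernel-side warning. It assumes the overmount shows up as a second entry whose mount point (field 5) is / in /proc/self/mountinfo, and the function names are made up for illustration.

```shell
# Count how many mounts in a mountinfo-format stream have "/" as their
# mount point. Field 5 of /proc/self/mountinfo is the mount point.
count_root_mounts() {
    awk '$5 == "/" { n++ } END { print n + 0 }'
}

# Print a warning and return non-zero if more than one mount is stacked on /.
check_root_overmount() {
    n=$(count_root_mounts)
    if [ "$n" -gt 1 ]; then
        echo "warning: $n mounts stacked on /" >&2
        return 1
    fi
    return 0
}

# Usage: check_root_overmount < /proc/self/mountinfo
```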

aep avatar Jun 06 '25 06:06 aep

I'm not sure how this happened to my system, but I had these changes around the same time (likely kernel 6.8.0-59 → 6.8.0-60 and 2.2.2-0ubuntu9.1 → 2.2.2-0ubuntu9.2), and one of them hosed my system so GRUB couldn't find anything to boot. I've fixed it now, but in case it helps, these are the duplicate mount points I ended up with. Removing the ones for rpool and bpool fixed things.

rpool                     mountpoint = /
rpool/ROOT               mountpoint = none
rpool/ROOT/ubuntu_d6m9x5 mountpoint = /

bpool                     mountpoint = /boot
bpool/BOOT               mountpoint = none 
bpool/BOOT/ubuntu_d6m9x5 mountpoint = /boot

alexrudd2 avatar Jun 24 '25 01:06 alexrudd2

@alexrudd2 You will find that the first mention of the / mountpoint, in the pool root datasets, is set to canmount=off, while the second appearance further down in the dataset hierarchy has canmount=on. This is the default Ubuntu file system layout. Look for canmount in https://github.com/canonical/subiquity/blob/51b5678f128847fc8ea12ae2b63780ec8df3694a/examples/ai-zfs-efi.yaml#L41-L81

I've deleted bpool since installing this system to make way for ZBM, but the remaining pool shows the same:

$ zfs list -o name,canmount,mountpoint rpool rpool/ROOT rpool/ROOT/ubuntu_d4psvq 
NAME                      CANMOUNT  MOUNTPOINT
rpool                     off       /
rpool/ROOT                off       none
rpool/ROOT/ubuntu_d4psvq  on        /
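
The same check can be scripted across all imported pools. This is a minimal sketch that assumes tab-separated input from `zfs list -H -o name,canmount,mountpoint`; the function name is hypothetical. It only counts datasets with canmount=on, so the layout above is not flagged, and it ignores canmount=noauto datasets, which can still be mounted by hand.

```shell
# Read `zfs list -H -o name,canmount,mountpoint` output (tab-separated) on
# stdin and print every mountpoint claimed by more than one canmount=on dataset.
find_duplicate_mountpoints() {
    awk -F '\t' '
        $2 == "on" && $3 != "none" && $3 != "legacy" && $3 != "-" {
            count[$3]++
            names[$3] = names[$3] " " $1
        }
        END {
            for (mp in count)
                if (count[mp] > 1)
                    print mp ":" names[mp]
        }
    '
}

# Usage: zfs list -H -o name,canmount,mountpoint | find_duplicate_mountpoints
```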

almereyda avatar Oct 27 '25 22:10 almereyda