make it harder to overmount /
There's an ongoing systemd issue, https://github.com/systemd/systemd/issues/29559, which indicates that lots of people accidentally mount two datasets on /. This is difficult to detect on Linux because the kernel doesn't really care, but lots of userspace stops working.
The optimal solution would be to refuse to mount a second filesystem on /, but that's likely not a very popular idea. The second-best thing would be to at least throw a big fat warning into dmesg so a sysadmin can detect the underlying problem more easily.
The second option is trivial to implement; I can send a patch if there's consensus.
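Until such a warning exists, the condition is at least cheap to detect from userspace. A minimal sketch, assuming only the documented /proc/self/mountinfo format (field 5 is the mount point, per proc(5)):

```sh
#!/bin/sh
# Count how many filesystems are currently mounted on /.
# Field 5 of /proc/self/mountinfo is the mount point.
dups=$(awk '$5 == "/"' /proc/self/mountinfo | wc -l)
if [ "$dups" -gt 1 ]; then
    echo "WARNING: $dups filesystems are stacked on /" >&2
    awk '$5 == "/"' /proc/self/mountinfo >&2
fi
```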
I have some sympathy, and I'm not a fan of finger-pointing, but I really don't see how this is anything other than a systemd problem. systemd/systemd#37086 seems to indicate that it's a problem even when OpenZFS isn't in the mix. So I'm not super keen on working around their bug, at least until they've concluded that they won't fix it.
However, if there's more of an at-a-distance issue, say, OpenZFS itself mounted two datasets on top of each other and the user expressly did not want or expect that, then I'm more interested. The systemd report is very long and I don't know all the moving parts involved, so a description and action focused on OpenZFS would be appreciated in this case.
> The systemd report is very long and I don't know all the moving parts involved, so a description and action focused on OpenZFS would be appreciated in this case
That's fair. My argument here is that both the cause and the effect are unexpected by most users, since they're unaware of the technicalities of the VFS.
I can't speculate too much about how it happens for everyone else. For me this happens regularly when I pull a disk from a different server and import the pool, forgetting -R. Now you have two different pools with a / dataset. It also happens when you zfs recv a backup of some other / to inspect it. Either way, all of these cases involve accidentally forgetting a flag, and then you have a broken system that can't be recovered without a reboot. This gets even more worrying with the report that Ubuntu does some ZFS magic without involving the user, so they don't even have a way to trace it back to a specific ZFS operation.
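For illustration, both traps come down to one missing flag; a sketch with placeholder pool/dataset names:

```sh
# Import a foreign pool under an alternate root instead of /:
# -R sets altroot, so the pool's / dataset lands under /mnt/foreign.
zpool import -R /mnt/foreign tank

# Receive a backup of some other / without mounting it:
# -u skips the mount step entirely.
zfs receive -u tank/inspect/root < root.zstream
```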
The second aspect is that it breaks userland's assumptions about what / is, not just systemd's. Although systemd is of course the most aggressive in making assumptions about system state, it isn't entirely wrong here. There is no legitimate reason to have two filesystems mounted on / after boot has completed; it is almost certainly an operator mistake. Processes that are already running will see a different / than newly started processes. For example, an editor that already has a file open will write to the old /, while an editor opened afterwards will not see those changes. A system daemon writing state will be confused when its clients see different state. Technically such programs are supposed to pass file descriptors to avoid this, but in reality most software just doesn't consider this case and breaks in unexpected ways.
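The split view is easy to reproduce with any overmount, ZFS or not. A small demonstration (the path is arbitrary; mount requires root):

```sh
# Shell A: hold a reference to the original directory.
mkdir -p /tmp/demo && echo old > /tmp/demo/file
cd /tmp/demo

# Shell B: overmount the directory.
mount -t tmpfs none /tmp/demo

# Shell A again: the working directory still points at the shadowed
# filesystem, while a fresh path lookup resolves the topmost mount.
ls .            # still shows "file"
ls /tmp/demo    # empty: the new tmpfs
```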
The rationale for a dmesg warning is that two mounts on / don't happen in normal operation, and the user should be assisted in finding the root cause when other software (not ZFS) starts failing.
I'm not sure how this happened to my system, but I had these changes around the same time (likely kernel 6.8.0-59 ➝ 6.8.0-60 and ZFS 2.2.2-0ubuntu9.1 ➝ 2.2.2-0ubuntu9.2), and one of them hosed my system so GRUB couldn't find anything to boot. I've fixed it now, but in case it helps, these are the duplicate mount points I ended up with. Removing the ones for rpool and bpool fixed things.
```
rpool                     mountpoint = /
rpool/ROOT                mountpoint = none
rpool/ROOT/ubuntu_d6m9x5  mountpoint = /
bpool                     mountpoint = /boot
bpool/BOOT                mountpoint = none
bpool/BOOT/ubuntu_d6m9x5  mountpoint = /boot
```
@alexrudd2 You will find that the first mention of the / mountpoint in the pool root datasets will be set to canmount=off, while the second appearance further down in the dataset hierarchy will have canmount=on. This is the default Ubuntu file system layout. Watch out for canmount on https://github.com/canonical/subiquity/blob/51b5678f128847fc8ea12ae2b63780ec8df3694a/examples/ai-zfs-efi.yaml#L41-L81
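If those pool-root datasets have lost canmount=off, as in the duplicate listing above, restoring the default should be enough to stop the extra / and /boot mounts; a sketch using the dataset names from that listing:

```sh
# Only the leaf dataset (rpool/ROOT/ubuntu_d6m9x5) should actually
# mount on /; the pool roots share the mountpoint but stay unmounted.
zfs set canmount=off rpool
zfs set canmount=off bpool
```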
I've deleted bpool since installing this system to make way for ZBM, but the remaining pool shows the same:
```
$ zfs list -o name,canmount,mountpoint rpool rpool/ROOT rpool/ROOT/ubuntu_d4psvq
NAME                      CANMOUNT  MOUNTPOINT
rpool                          off  /
rpool/ROOT                     off  none
rpool/ROOT/ubuntu_d4psvq        on  /
```