grub-btrfs-overlayfs hook fails if the systemd hooks are used (instead of udev)

Open JordanViknar opened this issue 3 years ago • 18 comments

Hello, first, I should probably state what my configuration is :

  • I'm using an eMMC storage device with BTRFS on the root, and /boot on a separate FAT32 partition (because I use EFISTUB and GRUB as backup), along with a SWAP partition and zRAM configured with zram-generator to be the same size as the RAM (3.7GB).
  • I'm using Arch Linux.
  • I'm using Snapper, Snap-pac, Snapper-GUI (from AUR), Snapper-support (from AUR) and, of course, grub-btrfs.
  • The configs are the same as the default ones, except access to /.snapshots is allowed to all users (following the instructions from the Arch Wiki), and access to the default root config's settings is allowed to my user (though root is still the owner).

When trying to boot to one of the snapshots from the GRUB menu, GDM doesn't start because the filesystem is still locked in read-only mode, despite the proper grub-btrfs-overlayfs hook being used in mkinitcpio.conf. It's as if the hook wasn't there at all, which would be the expected behavior if /boot was part of the BTRFS partition... except it isn't.

Here's the contents of my mkinitcpio.conf :

MODULES=(i915)
HOOKS="base systemd sd-plymouth autodetect sd-vconsole modconf block filesystems grub-btrfs-overlayfs"
COMPRESSION="lz4"
COMPRESSION_OPTIONS="-9"

Could I please have some help for solving this issue ? I suspect the problem is related to the systemd hook, or the storage being an eMMC, or perhaps the zRAM's size.

JordanViknar avatar Feb 07 '22 12:02 JordanViknar

Forgot to mention, but (ignoring the snapshot subvolumes), the only subvolume on that BTRFS system is the root itself.

JordanViknar avatar Feb 07 '22 12:02 JordanViknar

just a quick question, did you rebuild the init after adding the hook grub-btrfs-overlayfs?

northfacts avatar Feb 07 '22 12:02 northfacts

just a quick question, did you rebuild the init after adding the hook grub-btrfs-overlayfs?

Yes, of course.

JordanViknar avatar Feb 07 '22 13:02 JordanViknar

I found out the issue also happens on my other laptop, which is Manjaro based, and relies on an Intel RST/Optane RAID accidentally disguised as a normal drive (no idea why it works, but it does, when I should normally be using mdadm I suppose).

It also has the systemd hook, and the zRAM is also the size of the entire RAM.

JordanViknar avatar Feb 07 '22 17:02 JordanViknar

I think the issue is related to the systemd hook. That would make more sense than zRAM causing the issue, since zram-generator isn't part of the initramfs. I'll try again with the udev hooks instead of the systemd hooks.

JordanViknar avatar Feb 07 '22 17:02 JordanViknar

I can now confirm the issue is related to the systemd hook. I set up the udev hooks on my main laptop, and that fixed the problem. Unfortunately, I prefer using the systemd hooks, so I can't close this issue or mark it as solved. I'm going to change the title to reflect the problem, now that I know what's causing it.

JordanViknar avatar Feb 08 '22 12:02 JordanViknar

For more details about those systemd hooks, check this page from the Arch Wiki : https://wiki.archlinux.org/title/Mkinitcpio#Common_hooks

JordanViknar avatar Feb 08 '22 12:02 JordanViknar

Additionally, I noticed some weird behavior unrelated to this issue : all of my snapshots manage to boot with the systemd hook if GDM is disabled (via kernel parameter) despite being read-only, but some (especially older ones) don't with the udev hook despite actually being read-write with the overlay, I suppose because of some kind of kernel mismatch.

JordanViknar avatar Feb 08 '22 12:02 JordanViknar

Quote from the Arch Wiki, which explains why using the systemd hook causes this :

Runtime hooks are only used by busybox init. systemd hook triggers a systemd based init, which does not run any runtime hooks but uses systemd units instead.
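
As a rough sketch, the problematic combination can be spotted from the HOOKS line alone. The helper name `hooks_conflict` is made up for this example and is not part of mkinitcpio or grub-btrfs; it only inspects the string it is given and reads no files.

```bash
# Hypothetical helper: succeeds (exit 0) when a HOOKS string combines the
# systemd hook with grub-btrfs-overlayfs, whose runtime hook never runs there.
hooks_conflict() {
    case " $1 " in
        *" systemd "*) ;;
        *) return 1 ;;
    esac
    case " $1 " in
        *" grub-btrfs-overlayfs "*) return 0 ;;
        *) return 1 ;;
    esac
}

if hooks_conflict "base systemd autodetect modconf block filesystems grub-btrfs-overlayfs"; then
    echo "grub-btrfs-overlayfs will be ignored: a systemd-based init does not run runtime hooks"
fi
```

Padding the string with spaces keeps names such as sd-vconsole from being mistaken for systemd.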

JordanViknar avatar Feb 08 '22 12:02 JordanViknar

Current workaround I've been using for now : I'm booting into snapshots using a separate initramfs preset that relies on the udev hook instead of the systemd hook. It is not a perfect solution though : I think the proper fix would be to implement a proper systemd unit, along with a separate sd-grub-btrfs-overlayfs hook, as many programs (such as Plymouth) do.
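
For illustration, a separate preset of that kind might look roughly like the sketch below. The preset name `snapshot` and both of its file names are made up for this example; /etc/mkinitcpio-snapshot.conf is assumed to be a copy of mkinitcpio.conf whose HOOKS use udev instead of systemd. The fallback preset is left out for brevity, and how the GRUB snapshot entries get pointed at the extra image is not covered here.

```bash
# /etc/mkinitcpio.d/linux.preset (sketch; the 'snapshot' entries are hypothetical)
ALL_config="/etc/mkinitcpio.conf"
ALL_kver="/boot/vmlinuz-linux"

PRESETS=('default' 'snapshot')

default_image="/boot/initramfs-linux.img"

# Extra image built from a config that uses the udev-based hooks
snapshot_config="/etc/mkinitcpio-snapshot.conf"
snapshot_image="/boot/initramfs-linux-snapshot.img"
```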

JordanViknar avatar Feb 09 '22 07:02 JordanViknar

Similar problem. Are you expecting any progress on a solution?

Labaman avatar Aug 25 '22 18:08 Labaman

@JordanViknar Have you heard any updates on this?

@Antynea is this the same issue I'm experiencing here?

I don't know what udev has to do with any of this or why, if udev DOES have something to do with it, that isn't stated anywhere in the arch wiki docs, or the readme here, or in any online guide I've read/watched. I'm super confused and it's really frustrating. Is this just totally random? It's happening on 2 separate machines (bare metal) and a VM on each machine.

mapleroyal avatar Apr 06 '23 18:04 mapleroyal

No, it isn't.

Antynea avatar Apr 06 '23 18:04 Antynea

I can confirm that systemd hook is causing this issue. Switch to udev in HOOKS to fix the issue.

However, SDDM (the KDE display manager) can start on a read-only snapshot without overlayfs. Other display managers, such as GDM and LightDM, cannot.

Zesko avatar Apr 19 '23 12:04 Zesko

Hi, thank you for posting the issue @JordanViknar, I encountered the same problem recently while reinstalling my Arch setup :grinning: I added a warning in ArchWiki to inform users about it: https://wiki.archlinux.org/index.php?title=Snapper&diff=prev&oldid=803868

flobsh avatar Mar 18 '24 11:03 flobsh

There is an alternative to mkinitcpio: dracut which should work with systemd module + grub-btrfs-overlayfs.

Two script files for Dracut must be created manually for this function:

  1. Create a file: /usr/lib/dracut/modules.d/91btrfs-snapshot-overlay/module-setup.sh
#!/usr/bin/bash

# called by dracut
check() {
    dracut_module_included btrfs || return 1
    return 0
}

# called by dracut
depends() {
    return 0
}

# called by dracut
install() {
    inst mktemp
    hostonly='' instmods overlay
    inst_hook pre-pivot 000 "$moddir/snapshot-overlay.sh"
}
  2. Create a file: /usr/lib/dracut/modules.d/91btrfs-snapshot-overlay/snapshot-overlay.sh
#!/usr/bin/bash
function mount_snapshot_overlay() {
    local root_mnt="$NEWROOT"
    # Only act when the new root is a btrfs subvolume that is not writable
    if [[ "$(findmnt --mountpoint "$root_mnt" -o FSTYPE -n)" = "btrfs" ]] && [[ "$(btrfs property get "${root_mnt}" ro)" != "ro=false" ]]; then
        # The writable layer lives in RAM, so changes are lost on reboot
        local ram_dir
        ram_dir=$(mktemp -d -p /)
        mount -t tmpfs cowspace "${ram_dir}"
        mkdir -p "${ram_dir}"/{upper,work}
        mount -t overlay -o "lowerdir=${root_mnt},upperdir=${ram_dir}/upper,workdir=${ram_dir}/work" rootfs "${root_mnt}"
    fi
}

mount_snapshot_overlay
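
One detail of the check above: it mounts the overlay whenever the output of `btrfs property get` is anything other than ro=false, which includes empty output (for example if the btrfs binary is missing from the initramfs or the call fails). A stricter variant could match ro=true explicitly. The helper below is only a sketch of that idea; the name `snapshot_is_readonly` is made up, and it takes the command's output as a string so it can be tried without a btrfs filesystem.

```bash
# Hypothetical helper: takes the text printed by `btrfs property get <path> ro`
# and succeeds only when it is exactly "ro=true".
snapshot_is_readonly() {
    [ "$1" = "ro=true" ]
}

for output in "ro=true" "ro=false" ""; do
    if snapshot_is_readonly "$output"; then
        echo "'$output' -> read-only, overlay needed"
    else
        echo "'$output' -> no overlay"
    fi
done
```

Whether failing open (mount the overlay when unsure) or failing closed is preferable depends on the setup; the original script chooses the former.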

Zesko avatar Mar 18 '24 12:03 Zesko

I was able to get this working without using dracut (and still using systemd).

It basically involves creating a systemd service that goes in the initrd and modifies /sysroot. This might cause other issues depending on how your system is set up; it did cause other service failures on mine, but I was able to boot into my laptop fine.

This file is the script that gets run in the initrd:

/usr/local/bin/overlayfs-setup

#!/bin/bash

root_mnt="/sysroot"
current_dev=$(findmnt -n -o SOURCE /sysroot | sed 's@\[/.*@@g')

# Checking if /sysroot is a Btrfs filesystem and mounted read-only
#if findmnt -n -o FSTYPE /sysroot | grep -q 'btrfs' && findmnt -n -o OPTIONS /sysroot | grep -q 'ro,'; then
if [[ $(blkid "${current_dev}" -s TYPE -o value) = "btrfs" ]] && [[ $(btrfs property get ${root_mnt} ro) != "ro=false" ]]; then
    # Setting up directories for overlay
    echo "1"
    mkdir -p /mnt/overlay
    echo "2"
    mount -t tmpfs tmpfs /mnt/overlay
    echo "3"
    mkdir -p /mnt/overlay/upper
    echo "4"
    mkdir -p /mnt/overlay/work
    echo "5"

    # Mounting overlay
    mount -t overlay overlay -o lowerdir=/sysroot,upperdir=/mnt/overlay/upper,workdir=/mnt/overlay/work /sysroot
    echo "OverlayFS mounted on /sysroot."
else
    echo "/sysroot is not a read-only Btrfs filesystem."
fi
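
For anyone wondering about the sed call near the top of the script: for a btrfs subvolume mount, findmnt prints SOURCE as device[/subvolume], and blkid needs the bare device. The snippet below shows the same expression on a made-up sample value; the function name `strip_subvol` and the device path are only for illustration.

```bash
# Strip a trailing "[/subvolume]" suffix, using the same sed expression as the script
strip_subvol() {
    printf '%s\n' "$1" | sed 's@\[/.*@@g'
}

strip_subvol '/dev/mmcblk0p2[/.snapshots/42/snapshot]'   # prints /dev/mmcblk0p2
strip_subvol '/dev/mmcblk0p2'                             # prints /dev/mmcblk0p2 (unchanged)
```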

You also have to create a service, not in /etc (maybe it would work if you put it there: I didn't try):

/usr/lib/systemd/system/overlayfs-setup.service

[Unit]
Description=Setup OverlayFS on Root Filesystem
DefaultDependencies=no
After=initrd-fs.target
Before=initrd.target

[Service]
Type=oneshot
ExecStart=/usr/local/bin/overlayfs-setup
RemainAfterExit=yes

[Install]
WantedBy=initrd.target

You can't enable this normally, you have to use:

ln -s /usr/lib/systemd/system/overlayfs-setup.service /usr/lib/systemd/system/initrd.target.wants/overlayfs-setup.service

for some reason. I think if you enable it the other way the symlink is not included in initramfs.
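
To check whether the unit and its symlink actually ended up in the image, the file listing from lsinitcpio (shipped with mkinitcpio) can be searched. The helper below is a sketch: the name `initrd_has_overlay_unit` is made up, it reads a listing on stdin, and the image path in the usage comment assumes the default Arch kernel.

```bash
# Hypothetical helper: reads a file listing on stdin (one path per line) and
# succeeds only if both the unit and its initrd.target.wants symlink are listed.
initrd_has_overlay_unit() {
    listing=$(cat)
    printf '%s\n' "$listing" | grep -q 'usr/lib/systemd/system/overlayfs-setup\.service$' &&
    printf '%s\n' "$listing" | grep -q 'initrd\.target\.wants/overlayfs-setup\.service$'
}

# Usage on a real system (image path is an assumption):
#   lsinitcpio /boot/initramfs-linux.img | initrd_has_overlay_unit && echo "unit and symlink present"
```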

You need to create the hooks: /etc/initcpio/hooks/sd-overlayfs

#!/usr/bin/ash
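run_hook() {
    # Intentionally a no-op (an empty body would be a shell syntax error);
    # the actual work is done by overlayfs-setup.service
    :
}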

/etc/initcpio/install/sd-overlayfs


build() {
    add_module btrfs
    add_module overlay
    add_binary btrfs
    add_binary btrfsck
    add_binary blkid
    add_binary findmnt
    add_binary bash
    add_systemd_unit overlayfs-setup.service
    add_runscript
}

Finally, you can add it to your hooks:

HOOKS=(base systemd btrfs autodetect microcode modconf kms keyboard keymap sd-vconsole block sd-encrypt filesystems fsck sd-overlayfs plymouth)

This is what I use; I'm not sure whether the position of sd-overlayfs in the list matters, though.

If I have time I might try to get these fixes upstreamed, but they might be difficult to package.

otisdog8 avatar Mar 27 '24 00:03 otisdog8