
random failures during script "cannot open /dev/disk/by-partlabel/disk-sda-boot"

Open ghostbuster91 opened this issue 1 year ago • 2 comments

The following disko configuration:

{ disks ? [ "/dev/sda" ], ... }: {
  disko.devices = {
    disk = {
      sda = {
        type = "disk";
        device = builtins.elemAt disks 0;
        content = {
          type = "gpt";
          partitions = {
            grub = {
              size = "1M";
              type = "EF02";
              priority = 1;
            };
            boot = {
              size = "512M";
              content = {
                type = "filesystem";
                format = "vfat";
                mountpoint = "/boot";
              };
              priority = 2;
              hybrid.mbrBootableFlag = true;
            };
            root = {
              size = "128G";
              content = {
                type = "zfs";
                pool = "rpool1";
              };
              priority = 4;
            };
          };
        };
      };
    };
    zpool = {
      rpool1 =
        let
          unmountable = { type = "zfs_fs"; };
          filesystem = mountpoint: {
            type = "zfs_fs";
            options = {
              canmount = "noauto";
              inherit mountpoint;
            };
            inherit mountpoint;
          };
        in
        {
          type = "zpool";

          rootFsOptions = {
            compression = "lz4";
            "com.sun:auto-snapshot" = "false";
            canmount = "off";
            xattr = "sa";
            atime = "off";
          };
          options = {
            ashift = "12";
            autotrim = "on";
            compatibility = "grub2";
          };
          datasets = {
            "local" = unmountable;
            "local/root" = filesystem "/" // {
              postCreateHook = "zfs snapshot rpool1/local/root@blank";
            };
            "local/nix" = filesystem "/nix";
            "local/state" = filesystem "/state";

            "safe" = unmountable;
            "safe/persist" = filesystem "/persist";
          };
        };
    };
  };
}

It fails intermittently, but quite often, with mkfs.vfat cannot open /dev/disk/by-partlabel/disk-sda-boot (or a similar error). This happens whenever the disko format script refers to a device through its by-partlabel path (so also when calling zpool create, mkswap, etc.). The odd part is that these entries do exist; I verified this by manually inspecting the directories with ls.

Manually running the failed command and then restarting the disko script eventually succeeds.

This only happens on a real machine, never on a virtual one. Originally reported in #735.

ghostbuster91, Aug 19 '24 08:08

Maybe some timing issues. Can you post the script's output on such a failed attempt? Maybe sleeping a bit before the mkfs.vfat could help. You can try adding a preCreateHook = "sleep 3"; next to the type = "filesystem";

Lassulus, Aug 19 '24 08:08
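
For reference, a minimal sketch of where that hook would go, using the boot partition from the configuration in the original report. Only the preCreateHook line is new; the 3-second value is the one suggested above and is a guess rather than a measured delay.

```nix
{
  # Fragment to merge into disko.devices.disk.sda.content.partitions
  boot = {
    size = "512M";
    priority = 2;
    hybrid.mbrBootableFlag = true;
    content = {
      type = "filesystem";
      format = "vfat";
      mountpoint = "/boot";
      # Pause before mkfs.vfat runs, so that udev has time to (re)create
      # /dev/disk/by-partlabel/disk-sda-boot.
      preCreateHook = "sleep 3";
    };
  };
}
```

The same hook could presumably be attached to the zfs content of the root partition if zpool create fails in the same way.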

maybe some timing issues

I think so too.

can try adding a preCreateHook = "sleep 3"; next to the type = "filesystem";

This looks very promising, thanks :)

can you post the script's output on such a failed attempt?

I will reproduce the issue as I need to test the preCreateHook suggestion anyway and then I will post the whole output.

ghostbuster91, Aug 22 '24 19:08
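
If a fixed sleep turns out to be unreliable, one possible variant of the hook waits for the symlink instead. This is only a sketch under assumptions: it assumes that udevadm is on the PATH of the generated script and that the missing symlink really is a udev timing problem, neither of which is confirmed in this thread. The 30-second timeout and the 20 polling attempts are arbitrary illustrative values.

```nix
{
  # Fragment to merge into disko.devices.disk.sda.content.partitions
  boot = {
    size = "512M";
    priority = 2;
    hybrid.mbrBootableFlag = true;
    content = {
      type = "filesystem";
      format = "vfat";
      mountpoint = "/boot";
      preCreateHook = ''
        # Let udev finish processing pending block-device events first
        # (assumption: udevadm is available in the script's environment).
        udevadm settle --timeout=30 || true
        # Then poll for up to about 10 seconds until the symlink exists.
        for _ in $(seq 1 20); do
          if [ -e /dev/disk/by-partlabel/disk-sda-boot ]; then
            break
          fi
          sleep 0.5
        done
      '';
    };
  };
}
```

If the symlink never appears, the loop simply ends and mkfs.vfat fails as before, so a failed run would still show up in the script output.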

This happened to me too. In my case the counterpart of disk-sda-boot was a vfat-formatted ESP partition named disk-sda-ESP, which exceeds the 11-character limit for that file system.

  • https://github.com/nix-community/disko/issues/389

The partlabel was set correctly on the partition and could be read by cfdisk, parted and the like, but blkid and lsblk did not recognise it, and as such the installer failed.

The documentation could contain a word about that.

This may be related to how VFAT works in the kernel or how certain systems omit it from their udev rules, but I can't be sure. Observing /dev and /dev/md* showed that the by-partlabel symlink for the ESP partition was present in the system up to the point of invoking mkfs.vfat, after which it disappeared.

Is this perhaps a separate issue that should be raised independently?

almereyda, Sep 07 '25 09:09
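
One way to experiment with the naming theory above is to set short names explicitly. The sketch below makes two assumptions that should be checked against the disko documentation: that gpt partitions accept a label option overriding the default disk-sda-boot style name, and that the filesystem type passes extraArgs through to mkfs.vfat. The names "boot" and "BOOT" are purely illustrative, and whether this avoids the disappearing symlink is untested.

```nix
{
  # Fragment to merge into disko.devices.disk.sda.content.partitions
  boot = {
    # Assumed option: explicit GPT partition name (PARTLABEL), kept short.
    # The device would then be expected at /dev/disk/by-partlabel/boot.
    label = "boot";
    size = "512M";
    priority = 2;
    content = {
      type = "filesystem";
      format = "vfat";
      # Assumed option: extra arguments for mkfs.vfat. Its -n flag sets the
      # FAT volume label, which is limited to 11 characters.
      extraArgs = [ "-n" "BOOT" ];
      mountpoint = "/boot";
    };
  };
}
```

Comparing the output of lsblk and blkid before and after such a change would show whether the name length plays any role.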