
ZFS block mode devices sometimes don't appear in time (or disappear)

Open mschiff opened this issue 1 year ago • 4 comments

Required information

  • Distribution: Gentoo
  • Distribution version: --
  • The output of "incus info":
config: {}
api_extensions:
[…]
api_status: stable
api_version: "1.0"
auth: trusted
public: false
auth_methods:
- tls
auth_user_name: root
auth_user_method: unix
environment:
  addresses: []
  architectures:
  - x86_64
  - i686
  certificate: |
    -----BEGIN CERTIFICATE-----
    MIIB+zCCAYGgAwIBAgIRAICBOYWaBvVjqdDlc8YUw/swCgYIKoZIzj0EAwMwMDEZ
    MBcGA1UEChMQTGludXggQ29udGFpbmVyczETMBEGA1UEAwwKcm9vdEBwbHV0bzAe
    Fw0yNDAxMTIxNTUzNTNaFw0zNDAxMDkxNTUzNTNaMDAxGTAXBgNVBAoTEExpbnV4
    IENvbnRhaW5lcnMxEzARBgNVBAMMCnJvb3RAcGx1dG8wdjAQBgcqhkjOPQIBBgUr
    gQQAIgNiAASHXIMZRZz07Fvdh1ehgSKNc3yk/fYbTPMGbOF3VyeDWVWEsjhHQFeW
    rJ9EPYNjogBUba3NKVaEmMAz+nKDM4TcwwLytIUAFLFyIqaRnuxFHF6RXU/g7jGV
    RzwqYxSzCcujXzBdMA4GA1UdDwEB/wQEAwIFoDATBgNVHSUEDDAKBggrBgEFBQcD
    ATAMBgNVHRMBAf8EAjAAMCgGA1UdEQQhMB+CBXBsdXRvhwR/AAABhxAAAAAAAAAA
    AAAAAAAAAAABMAoGCCqGSM49BAMDA2gAMGUCMQD9KzSklLuzuifRoJvKuwCPmn9r
    UyfCac0hEA5Kj57/hgJ0y/+n3ODpln3JuAdn95MCMBJ8b5/W6x6sxC2e+vso+lfD
    LNTgFQ6Ha6FfWEKBTGPb0KPr3XICf5IMIWb6kI3OiA==
    -----END CERTIFICATE-----
  certificate_fingerprint: a75220cf363eb0c0ffd6427f5d9d88510d5d608d335514b76548b47e37bb4e69
  driver: lxc | qemu
  driver_version: 6.0.2 | 8.2.3
  firewall: xtables
  kernel: Linux
  kernel_architecture: x86_64
  kernel_features:
    idmapped_mounts: "true"
    netnsid_getifaddrs: "true"
    seccomp_listener: "true"
    seccomp_listener_continue: "true"
    uevent_injection: "true"
    unpriv_binfmt: "false"
    unpriv_fscaps: "true"
  kernel_version: 6.6.62-gentoo-dist
  lxc_features:
    cgroup2: "true"
    core_scheduling: "true"
    devpts_fd: "true"
    idmapped_mounts_v2: "true"
    mount_injection_file: "true"
    network_gateway_device_route: "true"
    network_ipvlan: "true"
    network_l2proxy: "true"
    network_phys_macvlan_mtu: "true"
    network_veth_router: "true"
    pidfd: "true"
    seccomp_allow_deny_syntax: "true"
    seccomp_notify: "true"
    seccomp_proxy_send_notify_fd: "true"
  os_name: Gentoo
  os_version: "2.17"
  project: default
  server: incus
  server_clustered: false
  server_event_mode: full-mesh
  server_name: pluto
  server_pid: 1857954
  server_version: "6.7"
  storage: zfs
  storage_version: 2.2.5-r0-gentoo
  storage_supported_drivers:
  - name: zfs
    version: 2.2.5-r0-gentoo
    remote: false
  - name: dir
    version: "1"
    remote: false

Issue description

Here is another issue on the same setup as in #1483.

While running incus-benchmark launch --count 200 --parallel 11 (with --parallel >= 11 on this 12-thread machine), some instances fail with these errors:

(sysadm_r)@pluto ~ # incus-benchmark launch --count 200 --parallel 11
Test environment:
  Server backend: incus
  Server version: 6.7
  Kernel: Linux
  Kernel architecture: x86_64
  Kernel version: 6.6.62-gentoo-dist
  Storage backend: zfs
  Storage version: 2.2.5-r0-gentoo
  Container backend: lxc | qemu
  Container version: 6.0.2 | 8.2.3

Test variables:
  Container count: 200
  Container mode: unprivileged
  Startup mode: normal startup
  Image: images:ubuntu/22.04
  Batches: 18
  Batch size: 11
  Remainder: 2

[Dec 13 09:37:00.962] Found image in local store: 3bf7665e786f9baf780a5774fae5b43f5a29c79d76af1474ff6e8f212fcfa25d
[Dec 13 09:37:00.962] Batch processing start
[Dec 13 09:37:58.761] Processed 11 containers in 57.798s (0.190/s)
[Dec 13 09:38:54.180] Processed 22 containers in 113.217s (0.194/s)
[Dec 13 09:40:51.015] Processed 44 containers in 230.053s (0.191/s)
[Dec 13 09:41:25.084] Failed to launch container 'benchmark-052': Failed creating instance from image: Failed to activate volume: Failed to locate zvol for "zdata/incus/containers/benchmark-052": context deadline exceeded
[Dec 13 09:41:25.099] Failed to launch container 'benchmark-055': Failed creating instance from image: Failed to activate volume: Failed to locate zvol for "zdata/incus/containers/benchmark-055": context deadline exceeded
[Dec 13 09:44:18.373] Processed 88 containers in 437.411s (0.201/s)
[Dec 13 09:44:53.504] Failed to launch container 'benchmark-093': Failed creating instance from image: Failed to activate volume: Failed to locate zvol for "zdata/incus/containers/benchmark-093": context deadline exceeded
[Dec 13 09:45:43.469] Failed to launch container 'benchmark-107': Failed creating instance from image: Failed to mount "/dev/zvol/zdata/incus/containers/benchmark-107" on "/var/lib/incus/storage-pools/default/containers/benchmark-107" using "ext4": no such file or directory
[Dec 13 09:51:44.892] Processed 176 containers in 883.929s (0.199/s)
[Dec 13 09:53:45.504] Batch processing completed in 1004.542s
(sysadm_r)@pluto ~ #

Indeed, 4 instances are missing:

(sysadm_r)@pluto ~ # incus list|grep benchmark| wc -l
196
(sysadm_r)@pluto ~ #

The ZFS pool itself is fine and healthy, but the filesets for the failed instances are missing from it. If I increase --parallel further, I get more failing instances (both "context deadline exceeded" and "no such file or directory").

Steps to reproduce

  1. Have a ZFS storage backend (here: 2x NVMe as system and ZFS special devices, and 2x spinning rust for data)
  2. Run the above incus-benchmark command with --parallel (almost) matching the number of system CPU threads

Information to attach

I also see the following kernel errors, including for instances that did not fail in the end:

Dez 13 09:43:25 pluto kernel: /dev/zvol/zdata/incus/containers/benchmark-086: Can't open blockdev
Dez 13 09:43:25 pluto kernel: /dev/zvol/zdata/incus/containers/benchmark-086: Can't lookup blockdev
Dez 13 09:43:26 pluto kernel: /dev/zvol/zdata/incus/containers/benchmark-086: Can't lookup blockdev
Dez 13 09:43:26 pluto kernel: /dev/zvol/zdata/incus/containers/benchmark-086: Can't lookup blockdev
Dez 13 09:43:27 pluto kernel: /dev/zvol/zdata/incus/containers/benchmark-086: Can't lookup blockdev
Dez 13 09:45:08 pluto kernel: /dev/zvol/zdata/incus/containers/benchmark-107: Can't open blockdev
Dez 13 09:45:09 pluto kernel: /dev/zvol/zdata/incus/containers/benchmark-107: Can't lookup blockdev
Dez 13 09:45:09 pluto kernel: /dev/zvol/zdata/incus/containers/benchmark-107: Can't lookup blockdev
Dez 13 09:45:10 pluto kernel: /dev/zvol/zdata/incus/containers/benchmark-107: Can't lookup blockdev
Dez 13 09:45:10 pluto kernel: /dev/zvol/zdata/incus/containers/benchmark-107: Can't lookup blockdev
Dez 13 09:45:11 pluto kernel: /dev/zvol/zdata/incus/containers/benchmark-107: Can't lookup blockdev
Dez 13 09:45:12 pluto kernel: /dev/zvol/zdata/incus/containers/benchmark-107: Can't lookup blockdev
Dez 13 09:45:12 pluto kernel: /dev/zvol/zdata/incus/containers/benchmark-107: Can't lookup blockdev
Dez 13 09:45:13 pluto kernel: /dev/zvol/zdata/incus/containers/benchmark-107: Can't lookup blockdev
Dez 13 09:45:13 pluto kernel: /dev/zvol/zdata/incus/containers/benchmark-107: Can't lookup blockdev
Dez 13 09:45:14 pluto kernel: /dev/zvol/zdata/incus/containers/benchmark-107: Can't lookup blockdev
Dez 13 09:45:14 pluto kernel: /dev/zvol/zdata/incus/containers/benchmark-107: Can't lookup blockdev
Dez 13 09:45:15 pluto kernel: /dev/zvol/zdata/incus/containers/benchmark-107: Can't lookup blockdev
Dez 13 09:45:15 pluto kernel: /dev/zvol/zdata/incus/containers/benchmark-107: Can't lookup blockdev
Dez 13 09:45:16 pluto kernel: /dev/zvol/zdata/incus/containers/benchmark-107: Can't lookup blockdev
Dez 13 09:45:16 pluto kernel: /dev/zvol/zdata/incus/containers/benchmark-107: Can't lookup blockdev
Dez 13 09:45:17 pluto kernel: /dev/zvol/zdata/incus/containers/benchmark-107: Can't lookup blockdev
Dez 13 09:45:17 pluto kernel: /dev/zvol/zdata/incus/containers/benchmark-107: Can't lookup blockdev
Dez 13 09:45:18 pluto kernel: /dev/zvol/zdata/incus/containers/benchmark-107: Can't lookup blockdev
Dez 13 09:45:18 pluto kernel: /dev/zvol/zdata/incus/containers/benchmark-107: Can't lookup blockdev
Dez 13 09:45:36 pluto kernel: /dev/zvol/zdata/incus/containers/benchmark-110: Can't open blockdev
Dez 13 09:52:19 pluto kernel: /dev/zvol/zdata/incus/containers/benchmark-183: Can't open blockdev
Dez 13 09:52:34 pluto kernel: /dev/zvol/zdata/incus/containers/benchmark-195: Can't open blockdev
Dez 13 09:52:34 pluto kernel: /dev/zvol/zdata/incus/containers/benchmark-198: Can't open blockdev
Dez 13 09:52:34 pluto kernel: /dev/zvol/zdata/incus/containers/benchmark-195: Can't lookup blockdev
Dez 13 09:52:35 pluto kernel: /dev/zvol/zdata/incus/containers/benchmark-198: Can't lookup blockdev
Dez 13 09:52:35 pluto kernel: /dev/zvol/zdata/incus/containers/benchmark-195: Can't lookup blockdev
Dez 13 09:52:35 pluto kernel: /dev/zvol/zdata/incus/containers/benchmark-198: Can't lookup blockdev

To me this looks like a load issue, with incusd not waiting long enough for devices to appear: it eventually gave up on 107 (which does not exist), but not, for example, on 086 (which exists and is running).

Here is some of the output of incus monitor --pretty, taken from another run.

benchmark-055 failed with "no such file or directory":

(sysadm_r)@pluto ~ # grep benchmark-055 incus-monitor.log 
time="2024-12-13T10:35:54+01:00" level=info msg="ID: 7e31f71c-9194-40da-aaa8-eca2192195c2, Class: task, Description: Creating instance" CreatedAt="2024-12-13 10:35:54.267475495 +0100 CET" Err= Location=none MayCancel=false Metadata="map[]" Resources="map[instances:[/1.0/instances/benchmark-055]]" Status=Pending StatusCode=Pending UpdatedAt="2024-12-13 10:35:54.267475495 +0100 CET"
time="2024-12-13T10:35:54+01:00" level=info msg="ID: 7e31f71c-9194-40da-aaa8-eca2192195c2, Class: task, Description: Creating instance" CreatedAt="2024-12-13 10:35:54.267475495 +0100 CET" Err= Location=none MayCancel=false Metadata="map[]" Resources="map[instances:[/1.0/instances/benchmark-055]]" Status=Running StatusCode=Running UpdatedAt="2024-12-13 10:35:54.267475495 +0100 CET"
time="2024-12-13T10:35:54+01:00" level=debug msg="Instance operation lock created" action=create instance=benchmark-055 project=default reusable=false
time="2024-12-13T10:35:54+01:00" level=info msg="Creating instance" ephemeral=false instance=benchmark-055 instanceType=container project=default
time="2024-12-13T10:35:54+01:00" level=debug msg="Adding device" device=eth0 instance=benchmark-055 instanceType=container project=default type=nic
time="2024-12-13T10:35:54+01:00" level=info msg="Action: instance-created, Source: /1.0/instances/benchmark-055, Requestor: unix/root (@)" location=none storage-pool=default type=container
time="2024-12-13T10:35:54+01:00" level=info msg="Created instance" ephemeral=false instance=benchmark-055 instanceType=container project=default
time="2024-12-13T10:35:54+01:00" level=debug msg="Adding device" device=root instance=benchmark-055 instanceType=container project=default type=disk
time="2024-12-13T10:35:54+01:00" level=debug msg="CreateInstanceFromImage started" driver=zfs instance=benchmark-055 pool=default project=default
time="2024-12-13T10:35:56+01:00" level=debug msg="Activated ZFS volume" dev=zdata/incus/containers/benchmark-055 driver=zfs pool=default volName=benchmark-055
time="2024-12-13T10:36:14+01:00" level=debug msg="Deactivated ZFS volume" dev=zdata/incus/containers/benchmark-055 driver=zfs pool=default volName=benchmark-055
time="2024-12-13T10:36:15+01:00" level=debug msg="Removing device" device=root instance=benchmark-055 instanceType=container project=default type=disk
time="2024-12-13T10:36:15+01:00" level=debug msg="CreateInstanceFromImage finished" driver=zfs instance=benchmark-055 pool=default project=default
time="2024-12-13T10:36:15+01:00" level=debug msg="Instance operation lock finished" action=create err="<nil>" instance=benchmark-055 project=default reusable=false
time="2024-12-13T10:36:15+01:00" level=debug msg="Removing device" device=eth0 instance=benchmark-055 instanceType=container project=default type=nic
time="2024-12-13T10:36:15+01:00" level=info msg="ID: 7e31f71c-9194-40da-aaa8-eca2192195c2, Class: task, Description: Creating instance" CreatedAt="2024-12-13 10:35:54.267475495 +0100 CET" Err="Failed creating instance from image: Failed to mount \"/dev/zvol/zdata/incus/containers/benchmark-055\" on \"/var/lib/incus/storage-pools/default/containers/benchmark-055\" using \"ext4\": no such file or directory" Location=none MayCancel=false Metadata="map[]" Resources="map[instances:[/1.0/instances/benchmark-055]]" Status=Failure StatusCode=Failure UpdatedAt="2024-12-13 10:35:54.267475495 +0100 CET"
time="2024-12-13T10:36:15+01:00" level=debug msg="Failure for operation" class=task description="Creating instance" err="Failed creating instance from image: Failed to mount \"/dev/zvol/zdata/incus/containers/benchmark-055\" on \"/var/lib/incus/storage-pools/default/containers/benchmark-055\" using \"ext4\": no such file or directory" operation=7e31f71c-9194-40da-aaa8-eca2192195c2 project=default
(sysadm_r)@pluto ~ # 

benchmark-082 failed with "context deadline exceeded":

(sysadm_r)@pluto ~ # grep benchmark-082 incus-monitor.log
time="2024-12-13T10:38:30+01:00" level=info msg="ID: 69bfdd5e-d71c-4503-a18d-56fac3a0f59c, Class: task, Description: Creating instance" CreatedAt="2024-12-13 10:38:30.739073919 +0100 CET" Err= Location=none MayCancel=false Metadata="map[]" Resources="map[instances:[/1.0/instances/benchmark-082]]" Status=Pending StatusCode=Pending UpdatedAt="2024-12-13 10:38:30.739073919 +0100 CET"
time="2024-12-13T10:38:30+01:00" level=info msg="ID: 69bfdd5e-d71c-4503-a18d-56fac3a0f59c, Class: task, Description: Creating instance" CreatedAt="2024-12-13 10:38:30.739073919 +0100 CET" Err= Location=none MayCancel=false Metadata="map[]" Resources="map[instances:[/1.0/instances/benchmark-082]]" Status=Running StatusCode=Running UpdatedAt="2024-12-13 10:38:30.739073919 +0100 CET"
time="2024-12-13T10:38:30+01:00" level=debug msg="Instance operation lock created" action=create instance=benchmark-082 project=default reusable=false
time="2024-12-13T10:38:30+01:00" level=info msg="Creating instance" ephemeral=false instance=benchmark-082 instanceType=container project=default
time="2024-12-13T10:38:31+01:00" level=debug msg="Adding device" device=eth0 instance=benchmark-082 instanceType=container project=default type=nic
time="2024-12-13T10:38:31+01:00" level=debug msg="Adding device" device=root instance=benchmark-082 instanceType=container project=default type=disk
time="2024-12-13T10:38:31+01:00" level=info msg="Action: instance-created, Source: /1.0/instances/benchmark-082, Requestor: unix/root (@)" location=none storage-pool=default type=container
time="2024-12-13T10:38:31+01:00" level=info msg="Created instance" ephemeral=false instance=benchmark-082 instanceType=container project=default
time="2024-12-13T10:38:31+01:00" level=debug msg="CreateInstanceFromImage started" driver=zfs instance=benchmark-082 pool=default project=default
time="2024-12-13T10:39:11+01:00" level=debug msg="Removing device" device=eth0 instance=benchmark-082 instanceType=container project=default type=nic
time="2024-12-13T10:39:11+01:00" level=debug msg="CreateInstanceFromImage finished" driver=zfs instance=benchmark-082 pool=default project=default
time="2024-12-13T10:39:11+01:00" level=debug msg="Instance operation lock finished" action=create err="<nil>" instance=benchmark-082 project=default reusable=false
time="2024-12-13T10:39:11+01:00" level=debug msg="Removing device" device=root instance=benchmark-082 instanceType=container project=default type=disk
time="2024-12-13T10:39:11+01:00" level=info msg="ID: 69bfdd5e-d71c-4503-a18d-56fac3a0f59c, Class: task, Description: Creating instance" CreatedAt="2024-12-13 10:38:30.739073919 +0100 CET" Err="Failed creating instance from image: Failed to activate volume: Failed to locate zvol for \"zdata/incus/containers/benchmark-082\": context deadline exceeded" Location=none MayCancel=false Metadata="map[]" Resources="map[instances:[/1.0/instances/benchmark-082]]" Status=Failure StatusCode=Failure UpdatedAt="2024-12-13 10:38:30.739073919 +0100 CET"
time="2024-12-13T10:39:11+01:00" level=debug msg="Failure for operation" class=task description="Creating instance" err="Failed creating instance from image: Failed to activate volume: Failed to locate zvol for \"zdata/incus/containers/benchmark-082\": context deadline exceeded" operation=69bfdd5e-d71c-4503-a18d-56fac3a0f59c project=default
(sysadm_r)@pluto ~ # 

mschiff avatar Dec 13 '24 09:12 mschiff

Is your system using udev? Do you also have /lib/udev/zvol_id or zvol_id available in the $PATH?

Normally Incus relies on udev to automatically create the needed /dev/zvol entries. When that doesn't happen, it can also use zvol_id as a fallback mechanism to try to figure out the right device.

There is a 30s timeout for that logic at which point you get the Failed to locate zvol for error. The other cases would generally imply that we did locate a /dev/zvol device but that it was somehow gone by the time we tried to mount it, or maybe was only partially set up by that point.

I'm running your example on a system also using ZFS block mode here but so far without any problem.

stgraber avatar Dec 13 '24 16:12 stgraber

> Is your system using udev?

yes

> Do you also have /lib/udev/zvol_id or zvol_id available in the $PATH?

yes

> Normally Incus relies on udev to automatically create the needed /dev/zvol entries. When that doesn't happen, it can also use zvol_id as a fallback mechanism to try to figure out the right device.
>
> There is a 30s timeout for that logic at which point you get the Failed to locate zvol for error. The other cases would generally imply that we did locate a /dev/zvol device but that it was somehow gone by the time we tried to mount it, or maybe was only partially set up by that point.
>
> I'm running your example on a system also using ZFS block mode here but so far without any problem.

How much system load do you get? I see very high iowait when it happens. Is there any way to raise this 30s timeout, just to test whether waiting longer would help here?

mschiff avatar Dec 13 '24 21:12 mschiff

The timeout is in the activateVolume function in driver_zfs_volumes.go

		// Wait up to 30 seconds for the device to appear.
		ctx, cancel := context.WithTimeout(d.state.ShutdownCtx, 30*time.Second)
		defer cancel()

mrstux avatar Feb 27 '25 05:02 mrstux

FWIW, I've noticed this getting in the way of VM creation when getting past 2000-3000 VMs.

I'll need to do a bit more testing to see if there's really something we can do, as 30s is already quite a long time to wait for an operation we'd like to see happen within a few ms.

stgraber avatar Nov 09 '25 07:11 stgraber