
sr0 not available at generator timeframe causes cloud-init.target not run

Open ubuntu-server-builder opened this issue 2 years ago • 10 comments

This bug was originally filed in Launchpad as LP: #1940791

Launchpad details
affected_projects = ['cloud-images']
assignee = None
assignee_name = None
date_closed = None
date_created = 2021-08-23T03:02:20.779310+00:00
date_fix_committed = None
date_fix_released = None
id = 1940791
importance = undecided
is_complete = False
lp_url = https://bugs.launchpad.net/cloud-init/+bug/1940791
milestone = None
owner = esj
owner_name = Éric St-Jean
private = False
status = triaged
submitter = achasen
submitter_name = Adam Chasen
tags = []
duplicates = [1961832]

Launchpad user Adam Chasen(achasen) wrote on 2021-08-23T03:02:20.779310+00:00

Focal image cloud-init generator reports: 'cloud-init is enabled but no datasource found, disabling'

looks to be related to ds-identify not finding the cdrom drive (and caching it) on first run. Not sure why /dev/sr0 would not be available early enough.

cat /run/cloud-init/ds-identify.log
...
ISO9660_DEVS=
...
No ds found [mode=search, notfound=disabled]. Disabled cloud-init [1]
[up 1.20s] returning 1
root@ubuntu:~# /usr/lib/cloud-init/ds-identify --force
[up 200.71s] ds-identify --force
...
ISO9660_DEVS=/dev/sr0=cidata
...
Found single datasource: NoCloud
[up 200.79s] returning 0
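When comparing the boot-time run with a later forced run, the lines that matter in the log above are ISO9660_DEVS and the final result. A minimal sketch for pulling just those out, assuming the log format quoted above; the function name ds_identify_summary is made up for illustration:

```sh
# Print only the ISO9660 device list and the result lines from a
# ds-identify log. The default path is the one quoted in this report.
ds_identify_summary() {
    log="${1:-/run/cloud-init/ds-identify.log}"
    grep -E '^[[:space:]]*(ISO9660_DEVS=|No ds found|Found single datasource|\[up [0-9.]+s\] returning)' "$log"
}
```

An empty ISO9660_DEVS= next to "returning 1" matches the failing case reported here.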

Booting https://cloud-images.ubuntu.com/focal/current/focal-server-cloudimg-amd64-disk-kvm.img as of Aug 22, 2021 in KVM (created with virt-install and libvirt) along with cloud-config ISO

$ cat /tmp/cloud
#cloud-config
hostname: proxy1
$ cloud-localds /tmp/test.iso /tmp/cloud

cloud-init.target is never reached and the network doesn't come up (default behavior for cloud-init is eth0 DHCP). If I manually run systemctl start cloud-init.target, I get what I expected, but by then it is "too late" and I also have to kick systemd-networkd.

cloud-init starts up as expected with the same environment when using Bionic (https://cloud-images.ubuntu.com/bionic/current/bionic-server-cloudimg-amd64.img)

The focal image never touches cloud-init.target. Note that there is no reverse dependency in focal.

root@ubuntu:~# systemctl list-dependencies --reverse cloud-init.target
cloud-init.target

Both images have default target of "graphical.target"

There is mention of a "generator" and "detection" in the cloud-init docs. https://cloudinit.readthedocs.io/en/latest/topics/boot.html

The generator appears to be what is adding the "wants" of cloud-init.target to multi-user.target, from /lib/systemd/system-generators/cloud-init-generator:

    local target_name="multi-user.target" gen_d="$early_d"
    local link_path="$gen_d/${target_name}.wants/${CLOUD_TARGET_NAME}"

Bionic:

root@proxy1:~# systemctl get-default
graphical.target
root@proxy1:~#
UNIT                LOAD   ACTIVE SUB    DESCRIPTION
basic.target        loaded active active Basic System
cloud-config.target loaded active active Cloud-config availability
cloud-init.target   loaded active active Cloud-init target
...
root@proxy1:~# systemctl list-dependencies --reverse cloud-init.target
cloud-init.target
● └─multi-user.target
●   └─graphical.target
root@proxy1:/etc/systemd/system# cat /run/cloud-init/cloud-init-generator.log
/lib/systemd/system-generators/cloud-init-generator normal=/run/systemd/generator early=/run/systemd/generator.early late=/run/systemd/generator.late
kernel command line (/proc/cmdline): BOOT_IMAGE=/boot/vmlinuz-4.15.0-154-generic root=LABEL=cloudimg-rootfs ro console=tty1 console=ttyS0
kernel_cmdline found unset
etc_file found unset
default found enabled
checking for datasource
ds-identify rc=0
ds-identify _RET=found
enabled via /run/systemd/generator.early/multi-user.target.wants/cloud-init.target -> /lib/systemd/system/cloud-init.target

Focal:

root@ubuntu:~# systemctl get-default
graphical.target
root@ubuntu:~# systemctl list-units --type=target --all
  UNIT LOAD ACTIVE SUB >
  basic.target loaded active activ>
  blockdev@dev-disk-by\x2dlabel-cloudimg\x2drootfs.target loaded inactive dead >
  blockdev@dev-disk-by\x2dlabel-UEFI.target loaded inactive dead >
  [email protected] loaded inactive dead >
  [email protected] loaded inactive dead >
  [email protected] loaded inactive dead >
  [email protected] loaded inactive dead >
  cloud-config.target loaded inactive dead >
  cloud-init.target loaded inactive dead >
root@ubuntu:~# systemctl list-unit-files
...
cloud-config.service enabled enabled
cloud-final.service enabled enabled
cloud-init-local.service enabled enabled
cloud-init.service enabled enabled
...
root@ubuntu:~# systemctl list-dependencies --reverse cloud-init.target
cloud-init.target
root@ubuntu:~# systemctl list-dependencies cloud-init.target
cloud-init.target
● ├─cloud-config.service
● ├─cloud-final.service
● ├─cloud-init-local.service
● └─cloud-init.service

root@ubuntu:~# cat /run/cloud-init/cloud-init-generator.log
/usr/lib/systemd/system-generators/cloud-init-generator normal=/run/systemd/generator early=/run/systemd/generator.early late=/run/systemd/generator.late
kernel command line (/proc/cmdline): BOOT_IMAGE=/boot/vmlinuz-5.4.0-1045-kvm root=PARTUUID=14530a28-f129-4b51-a64e-c64075fae7c7 ro console=tty1 console=ttyS0 panic=-1
kernel_cmdline found unset
etc_file found unset
default found enabled
checking for datasource
ds-identify rc=1
ds-identify _RET=notfound
cloud-init is enabled but no datasource found, disabling
already disabled: no change needed [no /run/systemd/generator.early/multi-user.target.wants/cloud-init.target]
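The difference between the two generator logs comes down to whether the symlink under generator.early exists. A small sketch for checking that directly on a guest, assuming the paths printed in the logs above; the function name generator_enabled is hypothetical:

```sh
# Report whether cloud-init-generator enabled cloud-init.target for this boot
# by looking for the symlink named in the generator log.
generator_enabled() {
    gen_early="${1:-/run/systemd/generator.early}"
    link="$gen_early/multi-user.target.wants/cloud-init.target"
    if [ -L "$link" ]; then
        echo "enabled: $link -> $(readlink "$link")"
    else
        echo "disabled: no $link"
        return 1
    fi
}
```

Running generator_enabled with no arguments on the Focal guest above would be expected to print the "disabled" line, and on the Bionic guest the "enabled" line.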

Additional Resources: Possibly same issue https://bugzilla.redhat.com/show_bug.cgi?id=1820540

ubuntu-server-builder avatar May 12 '23 14:05 ubuntu-server-builder

Launchpad user John Chittum(jchittum) wrote on 2021-09-01T16:19:16.414551+00:00

Could you provide exact reproduction steps with virt-install and libvirt. I am attempting to reproduce locally with setups we normally use for testing, and am unable to:

  1. downloaded image from https://cloud-images.ubuntu.com/focal/current/focal-server-cloudimg-amd64-disk-kvm.img
  2. created a simple cloud-init yaml file:

#cloud-config
password: <INSERT PASSWORD HERE>
chpasswd: { expire: False }
ssh_pwauth: True
ssh_import_id: jchittum
sudo: ALL=(ALL) NOPASSWD:ALL

  3. using cloud-localds from cloud-image-utils, made an ISO of the cloud-config:

cloud-localds cloud_init_with_pass.iso cloud-init.yaml

  4. used qemu to test the image:

qemu-system-x86_64 \
-cpu host -machine type=q35,accel=kvm -m 2048 \
-nographic \
-snapshot \
-netdev id=net00,type=user,hostfwd=tcp::2222-:22 \
-device virtio-net-pci,netdev=net00 \
-drive if=virtio,format=qcow2,file=focal-server-cloudimg-amd64-disk-kvm.img \
-drive if=virtio,format=raw,file=cloud_init_with_pass.iso

This qemu command sets the accel to kvm, and I had no issues. I'm guessing that the drive setup is very different though.

From my working knowledge of libvirt and cloud-init, you do need to mount the cloud-init image in a specific place, and I don't think there would be an issue, generally, with the kvm image not getting sr0 up fast enough. qemu is mounting to the same place in that command.

Could you provide the libvirt XML definition and exact reproduction steps for us to dig a little deeper?

ubuntu-server-builder avatar May 12 '23 14:05 ubuntu-server-builder

Launchpad user Chad Smith(chad.smith) wrote on 2021-09-01T17:51:09.861503+00:00

I also haven't been able to reproduce on focal. It makes me think that there is a potential systemd unit ordering cycle on the image/config that represented this issue?

on focal I see the reverse deps on latest daily images:

root@dev-ff:~# systemctl list-dependencies --reverse cloud-init.target
cloud-init.target
● └─multi-user.target
●   └─graphical.target
root@dev-ff:~# lsb_release -sc
focal

A guess in the dark would be to check journalctl -b 0 and look for "ordering cycle" related messages too.
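One way to run that check is a case-insensitive filter over the current boot's journal. This is only a sketch; the helper name find_ordering_cycles is invented here:

```sh
# Filter journal text on stdin for systemd ordering-cycle messages.
# Exit status is grep's: 0 if at least one line matched, 1 if none did.
find_ordering_cycles() {
    grep -i 'ordering cycle'
}

# Usage on the affected guest:
#   journalctl -b 0 --no-pager | find_ordering_cycles
```

No output (exit status 1) would suggest that an ordering cycle is not what is keeping cloud-init.target out of the boot.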

ubuntu-server-builder avatar May 12 '23 14:05 ubuntu-server-builder

Launchpad user Adam Chasen(achasen) wrote on 2021-09-01T20:10:03.834280+00:00

able to reproduce with image created with

virt-install --connect qemu:///session \
--name cloudinit-test \
--memory 2048 \
--disk /home/achasen/tmp/focal.img,device=disk,bus=virtio \
--os-type linux \
--os-variant ubuntu20.04 \
--virt-type kvm \
--graphics none \
--network bridge=virbr0,model=virtio \
--import \
--disk /tmp/test.iso,device=cdrom,bus=sata

/run/cloud-init/cloud-init-generator.log indicated run around 0.69s:

No ds found [mode=search, notfound=disabled]. Disabled cloud-init [1]
[up 0.69s] returning 1

journalctl shows things like "Starting Network Service" before sr0 is in the log (which makes me think the sr0 is delayed). I didn't find anything in journalctl output related to the generator.

[    1.857890] ubuntu systemd[1]: Starting Network Service...
...
[    2.364350] ubuntu kernel: ata4: SATA link down (SStatus 0 SControl 300)
[    2.364426] ubuntu kernel: ata1: SATA link up 1.5 Gbps (SStatus 113 SControl>
[    2.364539] ubuntu kernel: ata3: SATA link down (SStatus 0 SControl 300)
[    2.364609] ubuntu kernel: ata5: SATA link down (SStatus 0 SControl 300)
[    2.364642] ubuntu kernel: ata1.00: ATAPI: QEMU DVD-ROM, 2.5+, max UDMA/100
[    2.364643] ubuntu kernel: ata1.00: applying bridge limits
[    2.364884] ubuntu kernel: ata1.00: configured for UDMA/100
[    2.365032] ubuntu kernel: scsi 0:0:0:0: CD-ROM            QEMU     QEMU DVD>
[    2.365242] ubuntu kernel: sr 0:0:0:0: [sr0] scsi3-mmc drive: 4x/4x cd/rw xa>
[    2.365250] ubuntu kernel: cdrom: Uniform CD-ROM driver Revision: 3.20
[    2.379293] ubuntu kernel: sr 0:0:0:0: Attached scsi CD-ROM sr0
[    2.416795] ubuntu systemd[1]: Finished udev Wait for Complete Device Initia>
[    2.417385] ubuntu systemd[1]: Starting Device-Mapper Multipath Device Contr>

Launchpad attachments: virsh dumpxml result from virt-install

ubuntu-server-builder avatar May 12 '23 14:05 ubuntu-server-builder

Launchpad user Vincent Saelzler(vincentsaelzler) wrote on 2021-09-24T19:33:11.755335+00:00

I have the same issue with the Azure/Hyper-V Image. Running on local Windows desktop, using Hyper-V as the hypervisor.

Steps to reproduce:

  1. Download and extract https://cloud-images.ubuntu.com/focal/current/focal-server-cloudimg-amd64-azure.vhd.zip. Save disk image as 20.04-cloud.vhd.

  2. Create my-seed.iso file almost exactly as described in cloud-init documentation. Only small tweak is saving as ISO instead of IMG. https://cloudinit.readthedocs.io/en/latest/topics/debugging.html

$ cat > user-data <<EOF
#cloud-config
password: passw0rd
chpasswd: { expire: False }
EOF
$ cloud-localds my-seed.iso user-data

  3. Create new VM using Hyper-V GUI
  • Virtual Hard Disk Image = 20.04-cloud.vhd
  • Virtual DVD Drive Image = my-seed.iso

=> After starting the VM, I cannot log in.

Possibly helpful note: When using the standard (non-cloud) installer, this file seems to prevent the VM from using an ISO attached to the system: /etc/cloud/cloud.cfg.d/99-installer.cfg

It saves the user details that I manually entered during the install process, and critically, explicitly sets the data source to none.

$ cat /run/cloud-init/ds-identify.log
/etc/cloud/cloud.cfg.d/99-installer.cfg set datasource_list: [None]

After deleting the file, the ISO was recognized (and PW of "passw0rd" for ubuntu user worked)

$ cat /run/cloud-init/ds-identify.log
/etc/cloud/cloud.cfg.d/90_dpkg.cfg set datasource_list: [ NoCloud, ConfigDrive, OpenNebula, DigitalOcean, Azure, AltCloud, OVF, MAAS, GCE, OpenStack, CloudSigma, SmartOS, Bigstep, Scaleway, AliYun, Ec2, CloudStack, Hetzner, IBMCloud, Oracle, Exoscale, RbxCloud, UpCloud, Vultr, None ]

I do not know how to get debug output from the cloud image, because I cannot login as any user! If someone can explain how to do that, I would be happy to provide more output from the cloud image VM.

ubuntu-server-builder avatar May 12 '23 14:05 ubuntu-server-builder

Launchpad user Gauthier Jolly(gjolly) wrote on 2021-09-28T07:50:39.603788+00:00

Hi Vincent,

Thank you for your comment. What you are seeing with the Azure cloud-images is not related to the current issue.

Azure VHDs you can find on c-i.u.c are the same images we publish on Azure Cloud. Those are configured with a single Cloud-Init datasource (Azure) to make the image boot faster. While it is possible to boot those images locally on hyper-v, you will end up with a VM that is not fully functional.

If you look carefully at the bug description, you will see that @achasen uses KVM images (not Azure images) that should work out of the box on KVM.

ubuntu-server-builder avatar May 12 '23 14:05 ubuntu-server-builder

Launchpad user Launchpad Janitor(janitor) wrote on 2021-11-28T04:17:19.681502+00:00

[Expired for cloud-init because there has been no activity for 60 days.]

ubuntu-server-builder avatar May 12 '23 14:05 ubuntu-server-builder

Launchpad user Launchpad Janitor(janitor) wrote on 2021-11-28T04:17:21.014769+00:00

[Expired for cloud-images because there has been no activity for 60 days.]

ubuntu-server-builder avatar May 12 '23 14:05 ubuntu-server-builder

Launchpad user Chad Smith(chad.smith) wrote on 2022-02-25T00:42:41.882156+00:00

I apologize for the expiry on this bug. It slipped through the cracks because it was set to Incomplete status, which eventually expires if not set back to New.

cloud-init is not included in your boot target because, when the generator runs ds-identify, /dev/sr0 with its cidata label is not visible yet, due to what appears to be a later module load.

Cloud-init can tell you on focal that it's disabled due to the generator-time failure to find a matching datasource.

root@focal:~# cloud-init status --long
status: disabled
detail: Cloud-init disabled by cloud-init-generator

I am able to reproduce the original error with the following steps, as Adam suggested:

$ sudo virt-install --connect qemu:///session \
--name cloudinit-test \
--memory 2048 \
--disk /home/csmith/src/cloud-init/focal-server-cloudimg-amd64-disk-kvm.img,device=disk,bus=virtio \
--os-type linux \
--os-variant ubuntu20.04 \
--virt-type kvm \
--graphics none \
--network bridge=virbr0,model=virtio \
--import \
--disk "/tmp/test.iso,device=cdrom,bus=sata"

On Focal, the timestamp of /run/cloud-init/ds-identify.log, which is written when cloud-init's generator runs, is earlier than the time at which journalctl -b 0 shows /dev/sr0 appearing, due to the later kernel module load.

from journalctl:

Feb 24 21:56:28 ubuntu kernel: sr 0:0:0:0: Attached scsi CD-ROM sr0

root@focal:~# ls -ltr --full-time /dev/disk/by-label/ /run/cloud-init/ds-identify.log

Generator time 21:56:27

-rw-r--r-- 1 root root 1504 2022-02-24 21:56:27.241872017 +0000 /run/cloud-init/ds-identify.log

/dev/sr0 is not available until 1 second later

/dev/disk/by-label/:
total 0
lrwxrwxrwx 1 root root 10 2022-02-24 21:56:28.173872017 +0000 cloudimg-rootfs -> ../../vda1
lrwxrwxrwx 1 root root  9 2022-02-24 21:56:28.441872017 +0000 cidata -> ../../sr0
lrwxrwxrwx 1 root root 11 2022-02-24 21:56:28.581872017 +0000 UEFI -> ../../vda15
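The same comparison can be scripted instead of reading the ls output by eye. This is a rough sketch that assumes GNU stat and that the log has not been rewritten since boot (for example by a later ds-identify --force); label_delay_seconds is a made-up helper name:

```sh
# Print how many whole seconds after the ds-identify log was last written
# the cidata label symlink was created. A positive number means the label
# showed up after the generator had already run.
label_delay_seconds() {
    log="${1:-/run/cloud-init/ds-identify.log}"
    label="${2:-/dev/disk/by-label/cidata}"
    # stat -c %Y is GNU coreutils; without -L it reports the symlink's own mtime.
    log_t=$(stat -c %Y "$log") || return 1
    label_t=$(stat -c %Y "$label") || return 1
    echo $((label_t - log_t))
}
```

With the timestamps listed above (21:56:27 for the log, 21:56:28 for cidata) this would print 1.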

This needs a bit more investigation and can probably be worked around by adding the virt-install argument --sysinfo system.serial='ds=nocloud', which will force ds-identify to detect NoCloud regardless of the presence of /dev/sr0. Since the device will be up before NoCloud.get_data is run, this will avoid the race.
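For reference, here is roughly what the reproduction command looks like with that argument appended. It is an untested sketch: the wrapper function install_with_nocloud_hint and the VIRT_INSTALL override are hypothetical conveniences, and the image and ISO paths are whatever the caller passes in.

```sh
# Adam's virt-install invocation from earlier in this thread, plus the
# --sysinfo argument suggested above. Set VIRT_INSTALL=echo for a dry run
# that only prints the arguments.
install_with_nocloud_hint() {
    disk_img="$1"
    seed_iso="$2"
    "${VIRT_INSTALL:-virt-install}" --connect qemu:///session \
        --name cloudinit-test \
        --memory 2048 \
        --disk "$disk_img,device=disk,bus=virtio" \
        --os-type linux \
        --os-variant ubuntu20.04 \
        --virt-type kvm \
        --graphics none \
        --network bridge=virbr0,model=virtio \
        --import \
        --disk "$seed_iso,device=cdrom,bus=sata" \
        --sysinfo system.serial='ds=nocloud'
}

# Usage:
#   install_with_nocloud_hint /home/achasen/tmp/focal.img /tmp/test.iso
```

Inside the guest, the serial should then be readable from /sys/class/dmi/id/product_serial, which is a quick way to confirm the hint reached the VM.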

ubuntu-server-builder avatar May 12 '23 14:05 ubuntu-server-builder

Launchpad user James Falcon(falcojr) wrote on 2022-02-25T16:31:05.953838+00:00

A duplicate bug, https://bugs.launchpad.net/bugs/1961832 , provides some additional context and consistent reproduction steps.

ubuntu-server-builder avatar May 12 '23 14:05 ubuntu-server-builder

due to what appears to be a later module load

This is probably the issue. The user is likely using a SATA/IDE-based CD-ROM, which is what many guides, tutorials, and hypervisor defaults will do.

TLDR:

Couple of options that I use depending upon the scenario.

Use a different underlying device

Funny thing is that as long as the ISO is marked as ISO9660, you can use pretty much any block device type. Here are two I use fairly often that have decent compatibility across most distro/kernel combinations:

Virtio-blk-pci

-drive file="<your cloudinit iso>",format=raw,if=none,id=cidata \
-device virtio-blk-pci,drive=cidata

scsi-cdrom

-device virtio-scsi-pci,id=virtioscsi1,bus=pci.3,addr=0x2 \
-drive file=<your cloudinit iso>,if=none,id=drive-scsi1,media=cdrom,aio=io_uring \
-device scsi-cd,bus=virtioscsi1.0,channel=0,scsi-id=0,lun=1,drive=drive-scsi1,id=scsi1

Kernel Commandline modifications

If you really need to use IDE, load the modules you need, either via the kernel command line:

modules-load=libata,ahci,sr_mod

This will load the dependency chain of modules required to load IDE based cdroms from most linux hypervisors (more on that below).
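To see whether those three modules actually ended up loaded after boot, /proc/modules can be checked. A small sketch with a hypothetical helper name (missing_modules); note that a driver compiled into the kernel never appears in /proc/modules, so "missing" here can also mean "built in":

```sh
# Print each named module that is absent from a /proc/modules-style file.
# Each line of /proc/modules starts with the module name followed by a space.
missing_modules() {
    modfile="$1"; shift
    for m in "$@"; do
        grep -q "^$m " "$modfile" || echo "$m"
    done
}

# Usage on the guest:
#   missing_modules /proc/modules libata ahci sr_mod
```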

load the modules earlier in the initrd/initramfs

This is more dependent upon the distro and bootloader you're using so it's tough to give advice here. I'm going to link to this page instead and hopefully point the more advanced users in the right direction if they want this option. https://manpages.ubuntu.com/manpages/xenial/man5/modules-load.d.5.html

Generally you need to drop a .conf file listing the modules you want to load into a directory such as /etc/modules-load.d/ and then rebuild the initramfs.
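As a rough illustration of that step, the file is just one module name per line. The file name cdrom.conf and the helper write_modules_conf are made up for this sketch, and whether the initramfs build picks the file up depends on the distro's tooling, so treat the rebuild command in the comment as an assumption for Debian/Ubuntu with initramfs-tools:

```sh
# Write a modules-load.d style file naming the modules from the
# kernel command line example above, one per line.
write_modules_conf() {
    conf="${1:-/etc/modules-load.d/cdrom.conf}"
    printf '%s\n' libata ahci sr_mod > "$conf"
}

# Usage (as root), followed by an initramfs rebuild:
#   write_modules_conf
#   update-initramfs -u
```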

Compile the drivers into the kernel instead of modules

The underlying reason this issue occurs for IDE drives is that their drivers are no longer baked into most cloud images' kernels and are instead loaded as modules (more on all that below).

What's happening, and why did it work before but not now?

ds-identify is invoked from a systemd generator, which runs before any units are loaded (generators are generally meant for generating systemd units, not for the kind of detection ds-identify is doing).

The IDE drive doesn't show up until later because the units that load modules and listen for kernel device events aren't running yet! Most cloud images do not mark the IDE modules for preloading in the initrd. If I had to guess, earlier kernels still compiled these drivers in instead of building them as modules, or the modules were preloaded in the initrd before and no longer are.

You can check to see if your kernel has these compiled in, or if it's loading it as a module. Here are the config flags for related QEMU IDE/SATA devices

cat /boot/config-$(uname -r) | grep -e CONFIG_SATA_AHCI= -e CONFIG_ATA= -e CONFIG_BLK_DEV_SR=

y = it's built into the kernel, m = it's being loaded as a module.

Ubuntu Noble cloud image's config:

CONFIG_ATA=y
CONFIG_SATA_AHCI=m
CONFIG_BLK_DEV_SR=m
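The y/m reading can be automated for any set of options. A minimal sketch, assuming the usual OPTION=value lines of a /boot/config-* file; config_state is a hypothetical helper name:

```sh
# For each option, report whether the kernel config file marks it as
# built-in (=y), a module (=m), or neither.
config_state() {
    cfg="$1"; shift
    for opt in "$@"; do
        case "$(grep "^$opt=" "$cfg" | cut -d= -f2)" in
            y) echo "$opt built-in" ;;
            m) echo "$opt module" ;;
            *) echo "$opt not set" ;;
        esac
    done
}

# Usage:
#   config_state "/boot/config-$(uname -r)" CONFIG_ATA CONFIG_SATA_AHCI CONFIG_BLK_DEV_SR
```

Against the Noble values listed above, this would report CONFIG_ATA as built-in and the other two as modules.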

kolonelkrazy avatar Apr 13 '25 16:04 kolonelkrazy