
fs_setup/disk_setup: option to wait for the device to exist before continuing

Open ubuntu-server-builder opened this issue 2 years ago • 10 comments

This bug was originally filed in Launchpad as LP: #1832645

Launchpad details
affected_projects = []
assignee = None
assignee_name = None
date_closed = None
date_created = 2019-06-12T20:48:53.539989+00:00
date_fix_committed = None
date_fix_released = None
id = 1832645
importance = medium
is_complete = False
lp_url = https://bugs.launchpad.net/cloud-init/+bug/1832645
milestone = None
owner = minfrin-y
owner_name = Graham Leggett
private = False
status = in_progress
submitter = minfrin-y
submitter_name = Graham Leggett
tags = []
duplicates = [1907080]

Launchpad user Graham Leggett(minfrin-y) wrote on 2019-06-12T20:48:53.539989+00:00

When using the AWS::EC2::Volume and AWS::EC2::VolumeAttachment options to add a volume to an AWS::EC2::Instance on AWS EC2, the volume is not immediately available on the instance.

This causes fs_setup and disk_setup to fail.

What would prevent this failure is a "wait" option on both fs_setup and disk_setup, which if true, will cause cloud-init to wait until the device exists (caused by AWS catching up and attaching the device) before continuing.

ubuntu-server-builder avatar May 11 '23 20:05 ubuntu-server-builder

Launchpad user Ryan Harper(raharper) wrote on 2019-07-18T20:48:32.739674+00:00

Hi,

Thanks for filing the bug. Could you describe the launch process in a bit more detail? Specifically, do the API calls to attach the volume run before the instance is booted, or do the volumes arrive after the instance has started booting? Are you providing your own cloud-config with fs_setup/disk_setup entries? Or do these volumes show up in the EC2 metadata (block-device-mapping)?

If possible, can you run 'cloud-init collect-logs' as root and attach the tarball output on a failing instance?

Thanks!

ubuntu-server-builder avatar May 11 '23 20:05 ubuntu-server-builder

Launchpad user Graham Leggett(minfrin-y) wrote on 2019-07-18T23:56:16.957762+00:00

We don't have a failing instance in this case, as the last time we tried this was a few years ago. The workaround was to not use the AWS::EC2::VolumeAttachment at all, but rather to create an EBS volume as part of the AWS::EC2::Instance. This has other side effects we want to avoid, thus this bug.

Look carefully at the definition of an AWS::EC2::VolumeAttachment:

https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-properties-ec2-ebs-volumeattachment.html

A volume attachment depends on both a volume, and an instance. Both the volume and the instance have to exist before the volume attachment can exist. By definition that means that the instance is started up before the volume is attached to the instance.

If we instruct cloud-init to prepare the volume with a filesystem on it (and we want to) this will fail, because the attempt to prepare the volume happens before the volume has attached. Cloud-init fails, orchestration fails, and all is lost.

All we're looking for is the option for cloud-init to say "I have been asked to set up this disk. This disk does not yet exist. Instead of throwing a fatal error and failing, I will wait until this disk does exist, and then I will continue to do what I need to do after that as normal as if nothing had happened".
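The behaviour requested here amounts to polling for the device with a timeout. The following is only a minimal sketch in Python, not cloud-init's actual implementation; `wait_for_device` and its parameters are hypothetical names chosen for illustration.

```python
import os
import time


def wait_for_device(path, timeout=600.0, interval=5.0):
    """Poll until `path` exists.

    Hypothetical helper, not part of cloud-init's API.
    Returns True if the path appeared, False if the timeout expired.
    """
    deadline = time.monotonic() + timeout
    while True:
        if os.path.exists(path):
            return True
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            return False
        # Never sleep past the deadline.
        time.sleep(min(interval, remaining))
```

A caller would run something like `wait_for_device("/dev/xvdf")` before partitioning or formatting, and treat a `False` result as the existing failure case. The sketch only checks that the path exists; a stricter version could also verify that it is a block device.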

ubuntu-server-builder avatar May 11 '23 20:05 ubuntu-server-builder

Launchpad user Launchpad Janitor(janitor) wrote on 2019-09-17T04:17:45.972557+00:00

[Expired for cloud-init because there has been no activity for 60 days.]

ubuntu-server-builder avatar May 11 '23 20:05 ubuntu-server-builder

Launchpad user Maurizio(mauri-maurizio) wrote on 2020-03-04T11:33:02.065435+00:00

Hi all, we had a very similar issue here. We didn't catch if there was a proposed solution.

The workaround (not using AWS::EC2::VolumeAttachment at all, but rather creating an EBS volume as part of the AWS::EC2::Instance) is not ideal because we would like to have that flexibility. As a possible solution, we tried waiting for the disk mount to complete with a conditional loop in the bootcmd, but it seems to be asynchronous.

Do you have any suggestions?

ubuntu-server-builder avatar May 11 '23 20:05 ubuntu-server-builder

Launchpad user James Thompson(james-thompson) wrote on 2020-09-02T20:49:32.988423+00:00

This would be useful to me as well.

ubuntu-server-builder avatar May 11 '23 20:05 ubuntu-server-builder

Launchpad user Henry Ford(hrford) wrote on 2020-10-30T21:30:06.292679+00:00

/bin/cloud-init 19.3-3.amzn2

I have this same issue and now cannot use cloud-init's mounts module.

Occasionally cloud-init would not find the (late) attachment and then add "None" to the device column of fstab. This means, even if the instance is rebooted, the mount would not be retried by the system.
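A quick way to spot this symptom on an affected instance is to scan fstab for entries whose device field is the literal string "None". This is a rough sketch in Python; `find_bad_fstab_devices` is a hypothetical helper, and it assumes the usual whitespace-separated fstab layout.

```python
def find_bad_fstab_devices(fstab_text):
    """Return (line_number, line) pairs whose device field is the literal 'None'.

    Hypothetical helper for illustration; comments and blank lines are skipped.
    """
    bad = []
    for number, line in enumerate(fstab_text.splitlines(), start=1):
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue
        fields = stripped.split()
        if fields[0] == "None":
            bad.append((number, stripped))
    return bad
```

It could be fed the contents of `/etc/fstab` after first boot to detect the broken entry before relying on a reboot to mount the volume.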

In CloudFormation, I'm not using an explicit volume attachment resource, but an implicit one.

Secondly, fs_setup suffers from a similar issue: it formats the volume if it is not attached at first and then attached shortly after (maybe a different bug).

For reference, and to help others, my workaround is to apply the logic in low-level bash (which I wanted to avoid):

runcmd:
  - '[ ! -b /dev/xvdf ] && (echo "ERROR: xvdf not attached. Will sleep 30s..."; sleep 30;)'
  # volume is formatted to ext4? OK, else format
  - blkid -o full /dev/xvdf | grep "ext4" && echo "xvdf is ext4" || mkfs -t ext4 /dev/xvdf -L label
  # no mountpoint? make it
  - "[ ! -d /mnt/ebs/ ] && mkdir /mnt/ebs"
  # volume is in fstab? if not: add it
  - 'grep -q "/dev/xvdf" /etc/fstab && echo "xvdf already in fstab" || echo "/dev/xvdf /mnt/ebs ext4 defaults,nofail 0 2" >> /etc/fstab'
  # mount volumes added above
  - mount -a

My workaround still provides hope even if the first boot missed the attachment, as fstab contains the correct info. It would not, however, fix an unformatted volume, because runcmd does not run on the second boot.
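The "is it in fstab? if not, add it" step of this workaround is an idempotent append. As a rough illustration in Python (hypothetical helper name `ensure_fstab_entry`; the defaults mirror the values used in the runcmd above):

```python
def ensure_fstab_entry(fstab_text, device, mountpoint,
                       fstype="ext4", options="defaults,nofail"):
    """Return fstab text that contains an entry for `device`.

    Hypothetical helper for illustration. If a non-comment line already
    starts with `device`, the text is returned unchanged, so calling this
    repeatedly never duplicates the entry.
    """
    for line in fstab_text.splitlines():
        fields = line.split()
        if fields and fields[0] == device:
            return fstab_text
    # Make sure the new entry starts on its own line.
    if fstab_text and not fstab_text.endswith("\n"):
        fstab_text += "\n"
    return fstab_text + f"{device} {mountpoint} {fstype} {options} 0 2\n"
```

Unlike the `grep -q` check, this compares only the device field, so a commented-out line mentioning the device does not count as an existing entry.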

ubuntu-server-builder avatar May 11 '23 20:05 ubuntu-server-builder

Launchpad user Dan Watkins(oddbloke) wrote on 2020-11-02T19:23:39.842761+00:00

Adding a way of configuring a wait at boot seems reasonable. Are any of the various people who've experienced this interested in contributing such a change?

ubuntu-server-builder avatar May 11 '23 20:05 ubuntu-server-builder

Launchpad user Ryan Harper(raharper) wrote on 2020-12-07T16:53:03.047544+00:00

A fix is being worked on here: https://github.com/canonical/cloud-init/pull/710

ubuntu-server-builder avatar May 11 '23 20:05 ubuntu-server-builder

I have the same problem of wanting to use the disk_setup module for formatting an EBS volume, but the EBS volume may not be available right at boot.

My workaround is to use a bootcmd that pauses cloud-init until the given device is available:

bootcmd:
  # https://github.com/canonical/cloud-init/issues/3386
  - |
    filename_to_wait_for="/dev/nvme1n1"

    # Timeout in seconds
    timeout=600 # 10 minutes

    # Check every `interval` seconds
    interval=5

    elapsed=0
    while [ ! -e "$filename_to_wait_for" ]; do
        sleep "$interval"
        elapsed=$((elapsed + interval))

        if [ "$elapsed" -ge "$timeout" ]; then
            echo "Timeout reached. File not found: $filename_to_wait_for"
            exit 1
        fi
    done

    echo "File found: $filename_to_wait_for"

By default, the bootcmd module runs before the disk_setup module, so this makes sure the EBS volume at /dev/nvme1n1 exists before cloud-init tries to run the disk_setup module.

cdepillabout avatar Jun 13 '23 08:06 cdepillabout

https://github.com/canonical/cloud-init/pull/4673 should fix this, PTAL.

flokli avatar Dec 07 '23 18:12 flokli