qubes-issues
VMs fail to start because adjusting filesystem size times out
Qubes OS version:
4.0
Affected component(s):
VMs based on Fedora-26, likely debian-9 too.
Steps to reproduce the behavior:
Some VMs fail to start: Cannot connect to qrexec agent for 60 seconds, see /var/log/xen/console/guest-xxx.log for details.
In that log file, I can see that the problematic VMs are busy with the following task (the time updates until around 55s when the VM is killed):
A start job is running for Adjust r…t filesystem size (45s / no limit)
Expected behavior:
The VMs should start, even if that takes 10 minutes. Beyond 60 seconds it is reasonable to show a message and invite the user to check the logs to see what's going on.
Actual behavior:
The VM gets killed, preventing access to its data and applications.
General notes:
After seeing this commit https://github.com/marmarek/qubes-core-agent-linux/commit/7b4d453a23b066e5919f5498c5b3c75f8df2bae1 I looked for scripts named *resize* in /usr/lib, found one (resize-rootfs-if-needed.sh I think, but there are a couple), and disabled it (added exit 0 at the top). This allowed me to boot the VMs. I grepped through QubesOS's code but couldn't find where that 60-second timeout is defined; 60 is a fairly common number, it seems :).
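A rough sketch of that workaround, assuming the script lives at /usr/lib/qubes/resize-rootfs-if-needed.sh (the exact path is an assumption based on the logs further down in this thread):
# Make the resize script a no-op by inserting "exit 0" right after the shebang.
# Root-filesystem changes made in an AppVM do not persist, so apply this in the template.
sudo sed -i '1a exit 0' /usr/lib/qubes/resize-rootfs-if-needed.sh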
Related issues:
Related to the fix at https://github.com/marmarek/qubes-core-agent-linux/commit/7b4d453a23b066e5919f5498c5b3c75f8df2bae1 for #3173 and #3143.
Resizing the filesystem normally takes seconds; do you run the system from a USB drive or some other slow disk?
This allowed me to boot the VMs. I grepped through QubesOS's code but couldn't find where that 60-second timeout is defined; 60 is a fairly common number, it seems :)
It's controlled from dom0: qvm-prefs VMNAME qrexec_timeout [new-value]
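For example, to raise the timeout for a single VM to 10 minutes (the VM name here is just a placeholder; the value is in seconds):
# In dom0, for a hypothetical VM named "work":
qvm-prefs --set work qrexec_timeout 600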
I'm getting the same issue. Normal VM running from a local install on an SSD. I recently installed updates and I think the VM failed to start after that (I've only noticed a single VM that failed to start).
I tried cloning the affected VM and changing its underlying template, but in both cases the VM failed to start.
@marmarek The VM booted fine in about 3 minutes after I increased the timeout. I have a slightly older SSD and CPU. I do tend to run VMs with small amounts of RAM allocated.
I had tried cloning the problematic VM like @yonjah, without success.
I noticed that now that I increased the timeout, the VM does boot fine, but it takes about 3 minutes to resize that root partition every time it is booted.
Changes in the root filesystem of template-based VMs are not persistent. Start the template - it should resize the root filesystem on boot, and the change will persist there.
@marmarek starting the template seems to have no effect on the appVM boot time. Also, the template starts with no noticeable delay.
After increasing the timeout I checked the VM journal for the boot log, and it seems like the delay happens between mounting the temporary directory and the crng init:
Oct 29 20:42:50 localhost systemd[1]: Mounting Temporary Directory...
Oct 29 20:42:50 localhost systemd[1]: Mounted Temporary Directory.
Oct 29 20:44:54 localhost kernel: random: crng init done
Oct 29 20:44:54 localhost kernel: random: 7 urandom warning(s) missed due to ratelimiting
crng init happens in parallel (asynchronously); it doesn't block anything at this stage. What is the next thing in the log? systemd-analyze may be useful to find what's wrong.
As for the template - check (df -h /) if the filesystem is really resized there - maybe the problem is that it isn't happening at all? Check also sudo systemctl status qubes-rootfs-resize.service.
I'm attaching the output of the commands, but I'm not sure how to track this issue further.
systemd-analyze blame
1min 41.077s qubes-rootfs-resize.service
4.287s qubes-misc-post.service
3.591s dev-xvda3.device
3.034s qubes-update-check.service
1.042s qubes-firewall.service
df -h /
Filesystem Size Used Avail Use% Mounted on
/dev/xvda3 9.6G 4.2G 5.0G 46% /
sudo systemctl status qubes-rootfs-resize.service
● qubes-rootfs-resize.service - Adjust root filesystem size
Loaded: loaded (/usr/lib/systemd/system/qubes-rootfs-resize.service; enabled; vendor preset: enabled)
Active: active (exited) since Mon 2018-10-29 21:09:03 AWST; 45min ago
Main PID: 269 (code=exited, status=0/SUCCESS)
Tasks: 0 (limit: 4915)
CGroup: /system.slice/qubes-rootfs-resize.service
Oct 29 21:07:22 localhost systemd[1]: Starting Adjust root filesystem size...
Oct 29 21:09:03 localhost resize-rootfs-if-needed.sh[269]: Warning: Partition /dev/xvda3 is being used. Are you sure you want to continue?
Oct 29 21:09:03 localhost resize-rootfs-if-needed.sh[269]: [168B blob data]
Oct 29 21:09:03 localhost resize-rootfs-if-needed.sh[269]: Information: You may need to update /etc/fstab.
Oct 29 21:09:03 localhost resize-rootfs-if-needed.sh[269]: [106B blob data]
Oct 29 21:09:03 localhost resize-rootfs-if-needed.sh[269]: The filesystem is already 2569467 (4k) blocks long. Nothing to do!
Oct 29 21:09:03 localhost systemd[1]: Started Adjust root filesystem size.
Ok, looks like the resize-rootfs script (called from resize-rootfs-if-needed.sh) waits for something. Can you try to run it manually, with more detailed logging, like this:
sudo sh -x /usr/lib/qubes/resize-rootfs
and observe where the wait happens?
When I run it manually the delay doesn't happen. I changed the startup script in the template to get the commands, and this is the output -
Oct 30 19:20:28 localhost systemd[1]: Starting Adjust root filesystem size...
Oct 30 19:20:28 localhost resize-rootfs-if-needed.sh[270]: ++ blockdev --getro /dev/xvda
Oct 30 19:20:28 localhost resize-rootfs-if-needed.sh[270]: + '[' 0 -eq 1 ']'
Oct 30 19:20:28 localhost resize-rootfs-if-needed.sh[270]: + sysfs_xvda=/sys/class/block/xvda
Oct 30 19:20:28 localhost resize-rootfs-if-needed.sh[270]: + non_rootfs_data=512000
Oct 30 19:20:28 localhost resize-rootfs-if-needed.sh[270]: ++ tail -n 1
Oct 30 19:20:28 localhost resize-rootfs-if-needed.sh[270]: ++ df --output=size /
Oct 30 19:20:28 localhost resize-rootfs-if-needed.sh[270]: + rootfs_size=' 10050748'
Oct 30 19:20:28 localhost resize-rootfs-if-needed.sh[270]: + rootfs_size=20101496
Oct 30 19:20:28 localhost resize-rootfs-if-needed.sh[270]: ++ cat /sys/class/block/xvda/size
Oct 30 19:20:28 localhost resize-rootfs-if-needed.sh[270]: + '[' 20971520 -lt 20613496 ']'
Oct 30 19:20:28 localhost resize-rootfs-if-needed.sh[270]: + /usr/bin/sh -x /usr/lib/qubes/resize-rootfs
Oct 30 19:20:28 localhost resize-rootfs-if-needed.sh[270]: + set -e
Oct 30 19:20:28 localhost resize-rootfs-if-needed.sh[270]: +++ cut -f 1 -d ' '
Oct 30 19:20:28 localhost resize-rootfs-if-needed.sh[270]: +++ grep device-mapper /proc/devices
Oct 30 19:20:28 localhost resize-rootfs-if-needed.sh[270]: ++ printf %x 253
Oct 30 19:20:28 localhost resize-rootfs-if-needed.sh[270]: + dm_major=fd
Oct 30 19:20:28 localhost resize-rootfs-if-needed.sh[270]: + case "$(stat -Lc %t:%T /dev/mapper/dmroot)" in
Oct 30 19:20:28 localhost resize-rootfs-if-needed.sh[270]: ++ stat -Lc %t:%T /dev/mapper/dmroot
Oct 30 19:20:28 localhost resize-rootfs-if-needed.sh[270]: + parted ---pretend-input-tty /dev/xvda print
Oct 30 19:20:28 localhost resize-rootfs-if-needed.sh[270]: + echo fix
Oct 30 19:22:01 localhost resize-rootfs-if-needed.sh[270]: + parted ---pretend-input-tty /dev/xvda resizepart 3
Oct 30 19:22:01 localhost resize-rootfs-if-needed.sh[270]: + echo yes 100%
Oct 30 19:22:01 localhost resize-rootfs-if-needed.sh[270]: Warning: Partition /dev/xvda3 is being used. Are you sure you want to continue?
Oct 30 19:22:01 localhost resize-rootfs-if-needed.sh[270]: [168B blob data]
Oct 30 19:22:01 localhost resize-rootfs-if-needed.sh[270]: Information: You may need to update /etc/fstab.
Oct 30 19:22:01 localhost resize-rootfs-if-needed.sh[270]: [97B blob data]
Oct 30 19:22:01 localhost resize-rootfs-if-needed.sh[270]: + resize2fs /dev/mapper/dmroot
Oct 30 19:22:01 localhost resize-rootfs-if-needed.sh[270]: resize2fs 1.43.4 (31-Jan-2017)
Oct 30 19:22:01 localhost resize-rootfs-if-needed.sh[270]: The filesystem is already 2569467 (4k) blocks long. Nothing to do!
Oct 30 19:22:01 localhost systemd[1]: Started Adjust root filesystem size.
It seems like the delay happens on the echo fix | parted ---pretend-input-tty /dev/xvda print >/dev/null line:
ca:3)
# resize partition table itself
# use undocumented ---pretend-input-tty (yes, three '-') to
# force unattended operation, otherwise it aborts on first
# prompt, even with '-s' option
echo fix | parted ---pretend-input-tty /dev/xvda print >/dev/null
# then resize 3rd partition, even though it is mounted
echo yes 100% | parted ---pretend-input-tty /dev/xvda resizepart 3
# and reload partition table; prefer partprobe over blockdev
# --rereadpt, as it works on mounted partitions
partprobe /dev/xvda
;;
I can probably remove the output redirect and try to see if there is any meaningful output.
So I displayed the output, and it just prints the partition table after the delay.
Oct 30 19:46:21 localhost resize-rootfs-if-needed.sh[268]: + parted ---pretend-input-tty /dev/xvda print
Oct 30 19:46:21 localhost resize-rootfs-if-needed.sh[268]: + echo fix
Oct 30 19:47:49 localhost resize-rootfs-if-needed.sh[268]: Model: Xen Virtual Block Device (xvd)
Oct 30 19:47:49 localhost resize-rootfs-if-needed.sh[268]: Disk /dev/xvda: 10.7GB
Oct 30 19:47:49 localhost resize-rootfs-if-needed.sh[268]: Sector size (logical/physical): 512B/4096B
Oct 30 19:47:49 localhost resize-rootfs-if-needed.sh[268]: Partition Table: gpt
Oct 30 19:47:49 localhost resize-rootfs-if-needed.sh[268]: Disk Flags:
Oct 30 19:47:49 localhost resize-rootfs-if-needed.sh[268]: Number Start End Size File system Name Flags
Oct 30 19:47:49 localhost resize-rootfs-if-needed.sh[268]: 1 1049kB 211MB 210MB EFI System boot, esp
Oct 30 19:47:49 localhost resize-rootfs-if-needed.sh[268]: 2 211MB 213MB 2097kB BIOS boot partition bios_grub
Oct 30 19:47:49 localhost resize-rootfs-if-needed.sh[268]: 3 213MB 10.7GB 10.5GB ext4 Root filesystem
I assume fixing the GPT is what's causing the delay, but I couldn't get any more info on what is going on or why the GPT is broken in the first place.
@yonjah A GPT contains a main partition table at the beginning of the disk, and a backup one at the end of the disk. If the disk is expanded, then the backup will not be at the end of the disk, so this is most likely what needs "fixing".
Would we have better luck with sgdisk, which is designed for non-interactive use and is able to edit GPT partition tables?
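For what it's worth, sgdisk also has a read-only verify mode that should show whether the backup header is misplaced, without touching the disk (illustrative only, assuming the tool is installed in the VM):
# Verify-only check: reports problems such as the backup GPT header not sitting at the end of the disk.
sudo sgdisk -v /dev/xvda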
@marmarek I also see that it continues after parted as soon as the crng init is done.
GPT labels contain some GUIDs that are randomly generated. I noticed recently while working with gdisk that when the GPT mirror is misplaced and needs to be recreated, or for some partition operations, it generates a fresh GUID for the disk (or for the partition). To generate the fresh GUID it quite likely needs a source of randomness, which could explain why it blocks until crng is available?
To generate the fresh GUID it quite likely needs a source of randomness, which could explain why it blocks until crng is available?
That's a plausible explanation. Maybe it's possible to convince the tool not to do that? Since we don't really add new partitions, it should be ok not to change any of those uuids. Using sgdisk may indeed be a better idea.
I have been piping commands into gdisk (before I realised the non-interactive version sgdisk existed), and I had no control over the generation of GUIDs. The only thing I was able to do to get a deterministic, reproducible disk image was to let it generate random GUIDs, then set them to a constant afterwards (it has a command for that). That won't prevent it from trying to get some random bytes, though. I haven't checked, but I expect parted to behave similarly :-(.
Why is the crng init taking so long? And could we put a dependency between the jobs, so that the resizing job happens after the crng job?
There are very few randomness sources VMs have access to. You can read more about it in https://github.com/QubesOS/qubes-issues/issues/673
You can check the ordering of systemd-random-seed.service, haveged.service, and this one. Having systemd-random-seed.service started before this one should help (but I'm not sure if it solves it fully).
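A minimal sketch of such an ordering tweak, written as a drop-in for the resize unit (unit names taken from this thread; whether this alone is enough is exactly the open question above):
# Hypothetical drop-in: start the resize job only after the random-seed/haveged services.
sudo mkdir -p /etc/systemd/system/qubes-rootfs-resize.service.d
sudo tee /etc/systemd/system/qubes-rootfs-resize.service.d/ordering.conf <<'EOF'
[Unit]
After=systemd-random-seed.service haveged.service
EOF
sudo systemctl daemon-reload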
@marmarek @jsmaniac I kinda understand what a GPT is. But I was wondering why it is broken in the first place (since I assumed fixing it is what's causing the delay), and whether there is some way for the fix to persist so it won't have to happen on every restart.
I understand it might actually be parted just waiting for the PRNG to load, but then why is it only happening on this VM and not on others?
(I want to follow this but I don't know how to do that in Github so I will just write something here)
(I want to follow this but I don't know how to do that in Github so I will just write something here)
There is "subscribe" button on the right.
I managed to consistently reproduce the issue:
qvm-create --class AppVM --template fedora-26 --label red foobar
# This boots fast.
time qvm-run foobar 'touch somefile'
qvm-run foobar 'sudo halt -p' || true
# Wait for the VM to halt.
while qvm-check --running foobar; do sleep 1; done
sleep 5
qvm-prefs --set foobar maxmem 400
qvm-prefs --set foobar memory 400
# This will take a long time to boot
time qvm-run foobar true
When the amount of RAM allocated to the VM is reduced below a certain point and there is some data on the VM's private volume, it blocks on the resizing step, which takes several dozen seconds.
With an empty VM like this, the second boot is 4x slower than the first. The boot time can go up to 2-3 minutes on other VMs.
If I re-increase maxmem to 4000, then the VM boots fast again.
I noticed that when booting such a VM with lots of RAM available, it uses around 1400M of RAM at the end of the boot process, which is way too much to start a simple terminal.
While the actual issue (parted waiting for crng init, even though no new uuid should be needed) is not fixed yet, its impact should be greatly reduced now because of https://github.com/QubesOS/qubes-issues/issues/4553 (do not call root fs resize at all, if not needed).
It seems that the issue mostly happens with VMs based on fedora-26, and even when starting the template itself (it takes forever to resize, and now does it every time, even if the timeout is increased and it boots successfully several times in a row). VMs based on debian-9 templates don't seem to have the issue on my machine (it might have happened with a previous install, but I might not remember correctly). I'll see if the issue arises with fedora-29.
PS: @marmarek thanks for the answer on HVM security on the mailing list the other day ;-)
It seems that the issue mostly happens with VMs based on fedora-26
This makes sense, as fedora-26 does not receive updates anymore, and that includes the fix for #4553.
@marmarek Yep, I saw the announcement for fedora-29 on the mailing list the other day and promptly installed it. However, I was not aware of new versions of fedora templates before that, since qubes-dom0-update does not install them automatically.
I would have expected a metapackage to bring in new official templates when qubes-dom0-update is executed. It seems that the need to manually install new TemplateVMs is not indicated in https://www.qubes-os.org/doc/software-update-dom0/ nor in https://www.qubes-os.org/doc/software-update-vm/ (I might have missed it). Is this usability issue tracked somewhere or should I open a ticket?
I can also confirm that upgrading to fedora-29 solved this issue.
@marmarek Yep, I saw the announcement for fedora-29 on the mailing list the other day and promptly installed it. However, I was not aware of new versions of fedora templates before that, since qubes-dom0-update does not install them automatically. I would have expected a metapackage to bring in new official templates when qubes-dom0-update is executed. It seems that the need to manually install new TemplateVMs is not indicated in https://www.qubes-os.org/doc/software-update-dom0/ nor in https://www.qubes-os.org/doc/software-update-vm/ (I might have missed it). Is this usability issue tracked somewhere or should I open a ticket?
Such behavior would lead to the removal of any programs/packages you install yourself, so it wouldn't be a very convenient default behavior. Similarly, installing every fedora release in parallel would also be inconvenient, because it would take up quite a bit of space for people who don't wish to change their templates every 6-8 months. (Though I understand that fedora is moving away from this model starting with 30/31?)
As for the original issue, it seems sfdisk in fedora-28 does support GPT too and is able to resize a partition without loading any random data. The command is:
echo ',+' | sfdisk -N 3 /dev/xvda
It handles both adjusting the backup GPT location and resizing, in one go.
sfdisk in Debian stretch (util-linux 2.29.2) also supports it, but is buggy: https://github.com/karelzak/util-linux/issues/532 This is fixed in util-linux 2.32.1, which is only in buster+
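A minimal sketch of how the parted-based branch quoted earlier could look with sfdisk instead (illustrative only, not the actual patch; partprobe and resize2fs are kept as in the existing script):
# Grow partition 3 to fill the disk (sfdisk also relocates the backup GPT),
# then reload the partition table and grow the filesystem.
echo ',+' | sfdisk -N 3 /dev/xvda
partprobe /dev/xvda
resize2fs /dev/mapper/dmroot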
Automated announcement from builder-github
The package core-agent-linux has been pushed to the r4.0 testing repository for the CentOS centos7 template.
To test this update, please install it with the following command:
sudo yum update --enablerepo=qubes-vm-r4.0-current-testing
Automated announcement from builder-github
The package qubes-core-agent_4.0.43-1 has been pushed to the r4.1 testing repository for the Debian template.
To test this update, first enable the testing repository in /etc/apt/sources.list.d/qubes-*.list by uncommenting the line containing stretch-testing (or appropriate equivalent for your template version), then use the standard update command:
sudo apt-get update && sudo apt-get dist-upgrade
Automated announcement from builder-github
The package qubes-core-agent_4.0.43-1 has been pushed to the r4.0 testing repository for the Debian template.
To test this update, first enable the testing repository in /etc/apt/sources.list.d/qubes-*.list by uncommenting the line containing stretch-testing (or appropriate equivalent for your template version), then use the standard update command:
sudo apt-get update && sudo apt-get dist-upgrade