systemd-bootchart icon indicating copy to clipboard operation
systemd-bootchart copied to clipboard

Assertion 'head' failed at src/svg.c:79, function svg_header()

Open fsateler opened this issue 8 years ago • 17 comments

I have a reproducible assertion failure with bootchart v231 when invoked as pid1, citing the above message.

The problem seems to only occur when the initrd does not mount / as rw, and it is instead done by systemd-remount-fs.

I have managed to reproduce as follows:

# Build a bootable debian image including systemd-bootchart
mkosi -p systemd-bootchart -b -d debian
# Boot into it and set the kernel command line
qemu-system-x86_64 -m 512 -bios /usr/share/ovmf/OVMF.fd --enable-kvm -drive file=image.raw,format=raw -vga qxl
# this is inside the vm
mount /dev/sda1 /boot/efi
echo "quiet init=/lib/systemd/systemd-bootchart" > /etc/kernel/cmdline
# Ensure systemd remounts / as rw to avoid unrelated failures
echo "/dev/sda2 / ext4 errors=remount-ro 0 1" > /etc/fstab
kernel-install add $(uname -r) /boot/vmlinuz-$(uname -r)
# Disable getty on tty1 so that we do not loose the boot messages
systemctl mask getty@tty1
reboot
# Now we have the assertion error during boot

I notice that the errors do not present themselves if the initrd mounts the root filesystem as rw (ie, add rw to the kernel cmdline).

fsateler avatar Oct 03 '16 18:10 fsateler

Does your rootfs stay read-only forever?

If this is a case where you run bootchart just very briefly, and the system just hasn't gotten the rootfs mounted read-write fast enough, you can opt to just make bootchart run longer.

You can also opt to output the bootchart to a different location, perhaps one that is read-write.

It's not up to systemd-bootchart to mount filesystems read-write - you'll have to provide this to systemd-bootchart in some way yourself.

sofar avatar Oct 03 '16 20:10 sofar

Does your rootfs stay read-only forever?

No, it is remounted rw by sytemd-remount-fs.

If this is a case where you run bootchart just very briefly, and the system just hasn't gotten the rootfs mounted read-write fast enough, you can opt to just make bootchart run longer.

You can also opt to output the bootchart to a different location, perhaps one that is read-write.

It's not up to systemd-bootchart to mount filesystems read-write - you'll have to provide this to systemd-bootchart in some way yourself.

I suspect the problem is not (only) this, as the userspace boot is just 653ms in this VM:

Startup finished in 4.093s (firmware) + 392ms (loader) + 1.109s (kernel) + 653ms (userspace) = 6.248s

Moreover, when invoking systemd-bootchart with -o flag to a read-only directory, it correctly tells me it cannot open the file. Also, /run/log is a tmpfs so it is of course rw.

On boot, after the assert, an empty file can be found in /run/log.

It appears something is causing systemd-bootchart to try to write out the svg file before it has collected any samples, but I don't know how to test this...

fsateler avatar Oct 03 '16 20:10 fsateler

Hmm indeed, the assert fails because nothing has been logged(likely), or the list head pointer went missing (unlikely).

I've ran systemd-bootchart on plenty of systems which boot in less than 100ms userspace, so 653ms isn't necessarily too short for bootchart, at all.

No idea why this is happening, though. I don't know what mkosi creates either (I'm not a debian user).

sofar avatar Oct 05 '16 16:10 sofar

My laptop shows this problem too and it boots from grub to dracut to systemd-bootchart. The mkosi image is just a relatively simple way to reproduce.

Fwiw, the assert happens relatively fast on my laptop that is slower to boot. There, the assert happens almost immedistely, but the boot takes about 15 seconds to complete. So something seems to be inducing bootchart to finish collecting samples too early.

On 5 Oct 2016 1:14 p.m., "Auke Kok" [email protected] wrote:

Hmm indeed, the assert fails because nothing has been logged(likely), or the list head pointer went missing (unlikely).

I've ran systemd-bootchart on plenty of systems which boot in less than 100ms userspace, so 653ms isn't necessarily too short for bootchart, at all.

No idea why this is happening, though. I don't know what mkosi creates either (I'm not a debian user).

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/systemd/systemd-bootchart/issues/20#issuecomment-251722534, or mute the thread https://github.com/notifications/unsubscribe-auth/ABQsHUWBZGseYsn8oDo0CUFkPipfB0Orks5qw8zzgaJpZM4KM43p .

fsateler avatar Oct 05 '16 16:10 fsateler

@sofar any ideas where I can place some logging printfs to decide where the problem is? BTW, I have sometimes seen that the svg is in fact produced, but it contains only 500 samples (that should be around 0.02 seconds). My computer does not boot that fast :wink:

fsateler avatar Oct 26 '16 00:10 fsateler

@fsateler what sample rate are you using? the default is to log only 25 samples per second - 500 samples results in 20seconds logging total.

sofar avatar Oct 26 '16 04:10 sofar

huh, I guess I got confused by the '25.000' formatting of the graph info. Attaching graph now instead of describing.

bootchart-20161025-2006.svg.txt

fsateler avatar Oct 26 '16 13:10 fsateler

OK, there's something going badly wrong. There's only a single sample visible in the graph/data, but the code claims that 500 samples were recorded. That means that for 499 samples, it didn't see any recordable data. I'm wondering if you're lacking some of the required PROCFS kernel options, or that bootchart is in some other way limited from reading the necessary proc files for most of the time.

Without a full strace of it running at boot time, it'll be difficult to figure out what is the issue...

Maybe try running bootchart manually after it has run (with -r) and seeing if it creates valid output?

sofar avatar Oct 26 '16 17:10 sofar

The command works as expected when run from within a running session. I'm going to try to attach an strace to the init bootchart on the next reboot. bootchart-20161026-1434.txt

fsateler avatar Oct 26 '16 18:10 fsateler

I have been unsuccessful in running bootchart under strace. Any hints on how to do so?

fsateler avatar Nov 06 '16 21:11 fsateler

The most reliable way to do so is to make a special script that mounts your rootfs rw, and then wraps bootchart with strace while also executing /sbin/init.

something like this: untested

#!/bin/sh
mount -o remount,rw /
strace -o /strace.dump -f -tt /usr/lib/systemd/systemd-bootchart &
exec /sbin/init

sofar avatar Nov 06 '16 21:11 sofar

Nice, now I get the strace. strace.txt

fsateler avatar Nov 06 '16 22:11 fsateler

what's the content of /etc/systemd/bootchart.conf?

sofar avatar Nov 06 '16 23:11 sofar

It's all commented out:

#  This file is part of systemd.
#
#  systemd is free software; you can redistribute it and/or modify it
#  under the terms of the GNU Lesser General Public License as published by
#  the Free Software Foundation; either version 2.1 of the License, or
#  (at your option) any later version.
#
# Entries in this file show the compile time defaults.
# You can change settings by editing this file.
# Defaults can be restored by simply deleting this file.
#
# See bootchart.conf(5) for details.

[Bootchart]
#Samples=500
#Frequency=25
#Relative=no
#Filter=yes
#Output=<folder name, defaults to /run/log>
#Init=/path/to/init-binary
#PlotMemoryUsage=no
#PlotEntropyGraph=no
#ScaleX=100
#ScaleY=20
#ControlGroup=no
#PerCPU=no

fsateler avatar Nov 07 '16 00:11 fsateler

I wonder if 395d046 is possibly related. Can you test this commit?

sofar avatar Apr 05 '17 17:04 sofar

I have run this, and on my laptop the assert is gone, but collection still appears to stop after 1 sample.

bootchart.txt strace.txt

fsateler avatar Apr 06 '17 22:04 fsateler

Hmm, it appears I uploaded an older bootchart. This is the one. bootchart.txt

fsateler avatar Apr 06 '17 23:04 fsateler