syzkaller
syz-manager, dashboard: support collection of kdump cores
The Linux kernel supports kdump (https://www.kernel.org/doc/Documentation/kdump/kdump.txt), which can be used to take a memory dump and send it over the network. The manager could collect kdump dumps for crashes, especially for bugs without reproducers. We need to figure out how to set it all up: build the kdump kernel, pack it into the image, and make the kdump kernel send the core back to the manager. Dumping must also be limited somehow, because saving cores for all crashes would take an unreasonable amount of space. This needs to be integrated with the dashboard as well. We should start by making it work on GCE first.
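One simple way to limit dumping is a per-bug quota: keep only the first few cores for each crash title and skip the rest. A minimal sketch of that policy follows. The `dumpLimiter` type, its methods, and the quota of 2 are hypothetical and only illustrate the idea; none of this is existing syzkaller code.

```go
package main

import "fmt"

// dumpLimiter caps how many cores are kept per crash title.
// Hypothetical type for illustration, not part of syzkaller.
type dumpLimiter struct {
	maxPerTitle int
	saved       map[string]int // crash title -> number of cores already kept
}

func newDumpLimiter(maxPerTitle int) *dumpLimiter {
	return &dumpLimiter{maxPerTitle: maxPerTitle, saved: make(map[string]int)}
}

// shouldSave reports whether a core for the given crash title should be kept,
// and counts it against the title's quota if so.
func (l *dumpLimiter) shouldSave(title string) bool {
	if l.saved[title] >= l.maxPerTitle {
		return false
	}
	l.saved[title]++
	return true
}

func main() {
	l := newDumpLimiter(2) // assumed quota: 2 cores per bug
	for i := 0; i < 4; i++ {
		fmt.Println(l.shouldSave("KASAN: use-after-free in foo"))
	}
}
```

With a quota of 2, the first two crashes with a given title are kept and later ones are skipped; a different title gets its own quota.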
With kdump we probably also want to capture vmlinux (?).
And we could also run foreach bt -s -l, which should give stack traces of all tasks (?).
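If this is run through the crash utility, it could be scripted non-interactively. The sketch below only builds and prints the command, it does not run it. It assumes crash is invoked as `crash -s -i <script> <vmlinux> <vmcore>` (`-s` for silent startup, `-i` for an input file of commands); `crashArgs` and the file names are hypothetical.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
)

// crashScript is fed to the crash utility: backtraces of all tasks, then exit.
const crashScript = "foreach bt -s -l\nquit\n"

// crashArgs returns the arguments for a non-interactive crash session that
// executes the commands in script against the given vmlinux/vmcore pair.
func crashArgs(script, vmlinux, vmcore string) []string {
	return []string{"-s", "-i", script, vmlinux, vmcore}
}

func main() {
	f, err := os.CreateTemp("", "crash-cmds-*")
	if err != nil {
		fmt.Println("create script:", err)
		return
	}
	defer os.Remove(f.Name())
	if _, err := f.WriteString(crashScript); err != nil {
		fmt.Println("write script:", err)
		return
	}
	f.Close()

	// Only print the command; actually running it needs crash, vmlinux and vmcore.
	cmd := exec.Command("crash", crashArgs(f.Name(), "vmlinux", "vmcore")...)
	fmt.Println(cmd.String())
}
```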
FTR, reply from mailing list (captures main things that need to be done):
It's just that usually fully automating something is much larger
amount of work than doing it manually as a one-off thing. I also need
to figure out how much time and space it takes to reboot into kdump
kernel and extract the dump. I don't think that it's feasible to
persistently store all kdumps, because we are getting ~1 crash/sec.
Then, the web server that serves syzbot UI and sends emails is an
Appengine web app which does not have direct access to test machines
and/or git, but it seems that only it can decide when we need to store
dumps persistently. In the current architecture test machines are
disposable and are long gone by the time a crash is uploaded to
dashboard. So machines need to be preserved until after dashboard
says if we need dump or not. Or maybe extract dumps always and store
them locally temporarily until we know if we need to persist them or not.
I don't know yet what will work better. This also needs to be
carefully treated through crash reproduction process which has
different logic from main testing loop. And at the end interfaces
between multiple systems need to be extended, database format needs to
be extended, lots of testing done, and we need to figure out what is a
good config for kdump kernel and image build process needs to be
extended to package kdump kernel, configs of multiple systems need to
be extended and probably a bunch of other small things here and there.
Then we also need vmlinux to make dumps actionable, right? And vmlinux
is nice in itself because it allows to do objdump -d. So it probably
makes sense to separate vmlinux uploading and persistence from dumps,
because vmlinux'es probably better be uploaded once per kernel build
(which is like once per day). So that will be separate paths through
the system.
Also probably makes sense to consider if
https://github.com/google/syzkaller/issues/466 can be bundled with
this work (at least data paths, what exactly is captured can of course
be extended later).
We also need to figure out if at least part of all this can be
unit-tested and write tests.
So, yes, nothing extraordinary. But I feel this is not doable within a
day and will preferably require several uninterrupted days with
nothing else urgent, but I am having troubles with such days lately...
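The "extract dumps always and store them locally until we know" variant from the reply above could look roughly like this on the manager side: dumps sit in a pending store until the dashboard answers, and anything without an answer is expired after a TTL. All names (`pendingStore`, `decide`, `expire`), the paths, and the 10-minute TTL are hypothetical; this is a sketch of the data flow, not a proposed implementation.

```go
package main

import (
	"fmt"
	"sort"
	"time"
)

// pendingDump is a locally stored dump awaiting the dashboard's verdict.
type pendingDump struct {
	path     string
	received int64 // Unix seconds
}

// pendingStore holds dumps until the dashboard says whether to persist them.
type pendingStore struct {
	ttl   int64 // seconds a dump may wait for a verdict
	dumps map[string]pendingDump
}

func newPendingStore(ttlSeconds int64) *pendingStore {
	return &pendingStore{ttl: ttlSeconds, dumps: make(map[string]pendingDump)}
}

func (s *pendingStore) add(crashID, path string, now int64) {
	s.dumps[crashID] = pendingDump{path: path, received: now}
}

// decide applies the dashboard's verdict and forgets the entry. It returns the
// dump path (empty if the crash is unknown) and whether it should be uploaded;
// when upload is false the caller is expected to delete the file.
func (s *pendingStore) decide(crashID string, persist bool) (path string, upload bool) {
	d, ok := s.dumps[crashID]
	if !ok {
		return "", false
	}
	delete(s.dumps, crashID)
	return d.path, persist
}

// expire forgets dumps that waited longer than the TTL and returns their
// paths (sorted) so the caller can delete the files.
func (s *pendingStore) expire(now int64) []string {
	var stale []string
	for id, d := range s.dumps {
		if now-d.received > s.ttl {
			stale = append(stale, d.path)
			delete(s.dumps, id)
		}
	}
	sort.Strings(stale)
	return stale
}

func main() {
	s := newPendingStore(600) // assumed TTL: 10 minutes
	now := time.Now().Unix()
	s.add("crash-1", "/tmp/dumps/crash-1.core", now)
	s.add("crash-2", "/tmp/dumps/crash-2.core", now)
	fmt.Println(s.decide("crash-1", true)) // dashboard wants this one
	fmt.Println(s.expire(now + 601))       // crash-2 never got a verdict
}
```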
Can make sense to consider with #466
Another possibility is using CONFIG_PVPANIC=y to dump as much info as possible on every crash (e.g. traceback of all CPUs, etc). Not a complete replacement for dumps, but easier to do:
https://groups.google.com/d/msg/syzkaller/mbUS_75ExMI/pztWbmxICAAJ
https://groups.google.com/d/msg/syzkaller-bugs/uVQAs3IczrE/VM6HJgU5AQAJ
The kdump-tools or linux-crashdump packages may be useful for this:
https://www.thegeekstuff.com/2014/05/kdump/
https://packages.ubuntu.com/xenial/devel/linux-crashdump
Maybe we don't need to build our own kdump kernel and can just use the package-provided one. However, the question is whether the kdump kernel version needs to match the main kernel or not (e.g. an old distro-provided kernel won't be able to properly dump everything about the newest kernel).
kdump-tools supports several locations to save the dump: local disk, NFS, FTP (?). We need to figure out what to use and how to glue it all together.
For a while now I've been using an instance where FreeBSD guests are configured to dump core over the network when they crash. This has proven extremely useful for rare crashes that don't get a reproducer. It also permits automated analysis of crash dumps. There are two issues that I've observed, neither of which I have fully solved:
1. The VM manager needs to wait for the netdump to complete before restarting the VM. For now I've just bumped waitForOutputTimeout to one minute. Ideally we would have an OS-dependent function to monitor console output following a panic and return once the dump is complete.
2. The agent which receives a kernel dump has no good way to tie it back to a syzkaller report. We can't use the panic string as an identifier since syzkaller transforms the panic string in various ways, and of course one can have distinct bugs with the same panic message. This is not a dealbreaker since one can debug without a reproducer, but it would be quite nice to be able to package kernel crash dumps and syzkaller reproducers together whenever possible.
I've done some work towards 1), but any thoughts or suggestions on 2) would be appreciated.
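For 1), the OS-dependent function could be as simple as scanning console output for a completion marker, with a timeout as a fallback. A rough sketch follows; `waitForDump`, `waitForDumpInLog` and the error values are hypothetical names, and the marker string "Dump complete" is an assumption about what the guest prints, so it would need to be checked against the real console output.

```go
package main

import (
	"bufio"
	"errors"
	"fmt"
	"io"
	"strings"
	"time"
)

var (
	errDumpTimeout   = errors.New("timed out waiting for dump to complete")
	errConsoleClosed = errors.New("console closed before dump completed")
)

// waitForDump reads console output line by line and returns nil once a line
// contains marker. It gives up after timeout or when the console is closed.
// Note: on timeout the reader goroutine stays blocked until the console is
// closed by the caller.
func waitForDump(console io.Reader, marker string, timeout time.Duration) error {
	done := make(chan error, 1)
	go func() {
		sc := bufio.NewScanner(console)
		for sc.Scan() {
			if strings.Contains(sc.Text(), marker) {
				done <- nil
				return
			}
		}
		if err := sc.Err(); err != nil {
			done <- err
			return
		}
		done <- errConsoleClosed
	}()
	select {
	case err := <-done:
		return err
	case <-time.After(timeout):
		return errDumpTimeout
	}
}

// waitForDumpInLog applies the same check to an already captured console log.
func waitForDumpInLog(log, marker string, timeout time.Duration) error {
	return waitForDump(strings.NewReader(log), marker, timeout)
}

func main() {
	log := "panic: test\nDumping 16 MB\nDump complete\n"
	fmt.Println(waitForDumpInLog(log, "Dump complete", time.Second))
	fmt.Println(waitForDumpInLog("panic: test\n", "Dump complete", time.Second))
}
```

Each OS would supply its own marker (or a smarter matcher), while the VM loop keeps a single generous timeout instead of a fixed one-minute wait.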
If we do something for this, it would be good if it can be extended to other OSes and other potential schemes for collecting dumps. It's not possible to identify and accommodate all requirements (trying to may even be damaging), but at least we need to think about possibilities and try to accommodate immediate needs, which for me is Linux+GCE.
I know that qemu can collect dumps, which is nice and is probably easier to incorporate and should work more reliably. However, GCE does not have a comparable feature. So it's probably better to rely on kernel to collect dumps rather than VMM. But additional bonus points if the architecture supports asking VMM for the dump instead of the kernel.
Is it possible to configure FreeBSD to dump the core only if asked to? On syzbot we are getting a tremendous number of crashes and 99.9% of them are dups. We can obviously throw away dumps later, but ideally we would not collect/send them at all. I know that Linux kdump effectively just reboots into another kernel preserving the memory of the old one, and then one can do anything. That is the natural point to ask the manager if it needs the dump, or to wait for the manager to ask for the dump itself. Is something like this possible for FreeBSD? Regardless of this, I think the architecture should be such that the manager first decides if it needs the dump or not, and then collects it if it does. This holds even if the actual implementation for FreeBSD streams the dump unconditionally, then waits for the decision and discards the dump if it's not needed.
Other than that, adding support for dumps would be great. Once we have it in the manager, we can work on extending support to syzbot (uploading a few dumps per bug to GCS and providing links).
> The agent which receives a kernel dump has no good way to tie it back to a syzkaller report.
I would say this is very important. If we provide a wrong dump from time to time, it will suck. According to Murphy's law, we will surely get the mismatch exactly when syzbot users first resort to debugging a dump for a tricky bug.
I don't know what means and controls are available, or what the protocol is. Is it possible to give a kernel a unique ID? Is it possible to make different machines send dumps to different addresses? E.g. if we have 10 VMs, we create 10 ports and give each machine its own unique one.
I think the two requirements I listed are probably universal. From your notes above we also have:
- Need a way to limit space consumed by crash dumps. (As a point of reference, in my setup the kernel creates zstd-compressed dumps that average 16MB each. This is with VMs that run with 1GB of RAM.)
- Need a way to package the kernel image and debug info together with a crash.
> Is it possible to configure FreeBSD to dump the core only if asked to?
Yes. Basically, from the kernel debugger you can write, "netdump -s 169.254.0.1" to transmit a dump to an agent (netdumpd) listening on 169.254.0.1. This could be done from vmimpl.DiagnoseFreeBSD(). I can't think of a good way to request this from the VMM.
> I don't know what are the means and controls available, and what is the protocol.
We use a TFTP-like protocol called netdump, over UDP. It is very simple and not really extensible. We include a kernel dump header which contains various metadata (e.g., the panic string).
> Is it possible to give a kernel a unique ID? Is it possible to make different machines send dumps to different addresses?
We have a notion of the "host ID", which is a UUID. I believe we can configure the VMs such that a new host ID is generated on every boot, and after a crash we can recover the host ID from the kernel dump. With this, would it be necessary to send dumps to a unique address? That would be a bit complicated to set up.
But if we can do "netdump -s 169.254.0.1", this seems to be everything we need (assuming it's possible to specify a port as well). Then we just dump each VM to a unique port. Anything I am missing?
Actually it is currently not possible to specify the port, but I can change that pretty easily if needed.
However, I don't think even that is necessary: netdumpd can be configured to run a script after a kernel dump completes, and one of the parameters to that script is the source IP address. So, it should be easy to use that to notify syz-manager that the dump is completed.
If we can figure out IP addresses for the VMs, then it sounds like a plan.
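Assuming the manager knows each VM's IP address at boot, the notification from the netdumpd script could be routed with a small lookup table keyed by source IP. A minimal sketch; `dumpRouter` and its methods are hypothetical, and how the script actually reaches syz-manager (RPC, HTTP, a file) is left open.

```go
package main

import (
	"fmt"
	"net"
	"sync"
)

// dumpRouter maps the source IP of a completed dump to the VM that sent it.
// Hypothetical type for illustration, not part of syzkaller.
type dumpRouter struct {
	mu   sync.Mutex
	byIP map[string]int // normalized IP -> VM index
}

func newDumpRouter() *dumpRouter {
	return &dumpRouter{byIP: make(map[string]int)}
}

// normalizeIP parses ip and returns its canonical text form.
func normalizeIP(ip string) (string, bool) {
	parsed := net.ParseIP(ip)
	if parsed == nil {
		return "", false
	}
	return parsed.String(), true
}

// register records the IP address a VM got when it booted.
func (r *dumpRouter) register(ip string, vm int) error {
	key, ok := normalizeIP(ip)
	if !ok {
		return fmt.Errorf("bad IP address %q", ip)
	}
	r.mu.Lock()
	defer r.mu.Unlock()
	r.byIP[key] = vm
	return nil
}

// vmForDump returns the VM index for the source IP reported by the
// post-dump script, or false if no VM with that address is known.
func (r *dumpRouter) vmForDump(srcIP string) (int, bool) {
	key, ok := normalizeIP(srcIP)
	if !ok {
		return 0, false
	}
	r.mu.Lock()
	defer r.mu.Unlock()
	vm, ok := r.byIP[key]
	return vm, ok
}

func main() {
	r := newDumpRouter()
	if err := r.register("10.128.0.5", 3); err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println(r.vmForDump("10.128.0.5"))
	fmt.Println(r.vmForDump("10.128.0.9"))
}
```

An entry has to be replaced when a VM is recreated with a recycled address, otherwise a dump could be attributed to the wrong instance.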
A thread on how to set up a KDUMP kernel: Help needed in getting kernel dump in QEMU VM https://groups.google.com/g/syzkaller/c/djTdBlMrxi0
@dvyukov Some further information from my side: if you install a full Debian system on the disk rather than the debootstrap one, it is easy to collect the kernel dump by configuring kdump as in my configuration. However, in this setup you lose the ability to use the "-kernel" option to boot an arbitrary kernel version. If I made mistakes or missed something, please let me know.
@markjdb Can you collect kernel dumps in the FreeBSD instance? If yes, which environment are you in: QEMU VM, GCE or others?
@mudongliang I am not quite sure what you are asking. In general it is pretty trivial to set up kernel dumps for FreeBSD, but our mechanism works differently from KDUMP. I've been able to collect kernel dumps in various virtualized environments, including QEMU and GCE, sure.
@markjdb thanks for your reply. I have already found a way to generate kernel dump in QEMU.
Regarding kdump collection for GCE.
One other option (in my view, the most secure and robust) is to create for each VM an additional persistent disk with a size equal to the RAM size of the VM. During a crash, the kexec kernel will just dd all its memory to the disk as raw bytes; no file system is needed in this case. Then, if the dump turns out to be necessary (e.g. the dashboard was asked and replied yes), syz-manager can download the content of the disk and minimize it via makedumpfile.
But there's a problem: it seems that downloading a persistent disk is not so straightforward in GCP. One has to convert it to an image first and then download the image. The conversion is possible only once every 10 minutes.
Maybe we could boot another VM with that disk to extract contents? Wonder if there is disk hot plug... then we could just attach it to the syz-manager machine.
However, sending the dump to the manager with netcat looks comparable in complexity. Manager could open a separate port for each VM, then we can match dumps to VMs based on the port number. Manager can also reject the dump by closing the connection.
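A rough sketch of the per-VM port idea: the manager opens one listener per VM, attributes whatever arrives on it to that VM, and rejects an unwanted dump by closing the connection without reading. Everything here is hypothetical (`dumpReceiver`, `listenForVM`, keeping dumps in memory, loopback addresses); `sendDump` stands in for netcat running in the kdump kernel.

```go
package main

import (
	"fmt"
	"io"
	"net"
	"sync"
)

// dumpReceiver collects dumps, one listener (and thus one port) per VM.
type dumpReceiver struct {
	mu    sync.Mutex
	dumps map[int][]byte // VM index -> dump bytes (in memory only for the sketch)
	wg    sync.WaitGroup
}

func newDumpReceiver() *dumpReceiver {
	return &dumpReceiver{dumps: make(map[int][]byte)}
}

// listenForVM opens a listener on an ephemeral port dedicated to one VM and
// returns its address. If accept is false, the dump is rejected by closing
// the connection without reading it.
func (r *dumpReceiver) listenForVM(vm int, accept bool) (string, error) {
	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return "", err
	}
	r.wg.Add(1)
	go func() {
		defer r.wg.Done()
		defer ln.Close()
		conn, err := ln.Accept()
		if err != nil {
			return
		}
		defer conn.Close()
		if !accept {
			return // reject: close without reading
		}
		data, err := io.ReadAll(conn)
		if err != nil {
			return
		}
		r.mu.Lock()
		r.dumps[vm] = data
		r.mu.Unlock()
	}()
	return ln.Addr().String(), nil
}

// sendDump plays the role of netcat in the kdump kernel.
func sendDump(addr string, data []byte) error {
	conn, err := net.Dial("tcp", addr)
	if err != nil {
		return err
	}
	defer conn.Close()
	_, err = conn.Write(data)
	return err
}

func main() {
	r := newDumpReceiver()
	addr0, err := r.listenForVM(0, true)
	if err != nil {
		fmt.Println("listen:", err)
		return
	}
	addr1, err := r.listenForVM(1, false)
	if err != nil {
		fmt.Println("listen:", err)
		return
	}
	if err := sendDump(addr0, []byte("core of VM 0")); err != nil {
		fmt.Println("send:", err)
	}
	_ = sendDump(addr1, []byte("core of VM 1")) // may fail: the manager rejects it
	r.wg.Wait()
	for vm, data := range r.dumps {
		fmt.Printf("VM %d: %d bytes\n", vm, len(data))
	}
}
```

A real version would stream to disk rather than memory and would need the accept/reject decision to come from the dashboard before or during the transfer.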