Fast snapshots randomly fail upon restoration in QEMU system mode

Open saruman9 opened this issue 1 month ago • 2 comments

Describe the bug

When using fast snapshots (syx snapshots) in QEMU system mode to restore the OS execution context (GNU/Linux), snapshot restoration frequently fails and corrupts the OS memory. The failure is nondeterministic but affects most restore attempts, which makes reliable state restoration difficult.

To Reproduce

Steps to reproduce the behavior:

  1. Set up LibAFL fuzzing with QEMU in system mode
  2. Configure fast snapshots
  3. Repeatedly create and restore snapshots during fuzzing
  4. Observe that in most cases, the restored OS state is corrupted

Expected behavior

Fast snapshots should consistently restore the OS execution context without memory corruption. The restore_fast_snapshot() function should reliably return the system to a valid state.

Additional context

  • Through experimentation, I found that calling check_fast_snapshot() after restore_fast_snapshot() can verify snapshot integrity
  • The workaround involves creating a loop that repeatedly takes and restores snapshots until check_fast_snapshot() reports no inconsistencies
  • Currently, accessing the nb_page_inconsistencies field from QemuSnapshotCheckResult is not possible as the field is private
  • A potential fix would involve making this field accessible (either public or via a getter method) to enable snapshot validation
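The workaround loop and the proposed getter can be sketched roughly as follows. This is a self-contained illustration, not LibAFL code: `SnapshotBackend`, `CheckResult`, `FlakyBackend`, and `snapshot_until_consistent` are hypothetical names that stand in for the create/restore/check calls on the QEMU handle, and the `max_attempts` cap is an added assumption so the loop cannot spin forever.

```rust
/// Local stand-in for the check result described above. The count is private,
/// so callers need an accessor. This is NOT the libafl_qemu type.
pub struct CheckResult {
    nb_page_inconsistencies: u64,
}

impl CheckResult {
    pub fn new(nb_page_inconsistencies: u64) -> Self {
        Self { nb_page_inconsistencies }
    }

    /// Hypothetical getter of the kind proposed in this issue.
    pub fn nb_page_inconsistencies(&self) -> u64 {
        self.nb_page_inconsistencies
    }
}

/// Hypothetical abstraction over the three snapshot operations.
pub trait SnapshotBackend {
    type Snapshot;
    fn create(&mut self) -> Self::Snapshot;
    fn restore(&mut self, snapshot: &Self::Snapshot);
    fn check(&mut self, snapshot: &Self::Snapshot) -> CheckResult;
}

/// Take and restore snapshots until one verifies cleanly.
/// Returns the good snapshot and the attempt number, or None after
/// `max_attempts` failed tries.
pub fn snapshot_until_consistent<B: SnapshotBackend>(
    backend: &mut B,
    max_attempts: usize,
) -> Option<(B::Snapshot, usize)> {
    for attempt in 1..=max_attempts {
        let snapshot = backend.create();
        backend.restore(&snapshot);
        if backend.check(&snapshot).nb_page_inconsistencies() == 0 {
            return Some((snapshot, attempt));
        }
    }
    None
}

/// Mock backend for demonstration: the first `bad_restores` restores
/// report page inconsistencies, later ones report none.
pub struct FlakyBackend {
    pub bad_restores: usize,
    pub restores: usize,
}

impl SnapshotBackend for FlakyBackend {
    type Snapshot = usize;

    fn create(&mut self) -> usize {
        self.restores
    }

    fn restore(&mut self, _snapshot: &usize) {
        self.restores += 1;
    }

    fn check(&mut self, _snapshot: &usize) -> CheckResult {
        let bad = if self.restores <= self.bad_restores { 3 } else { 0 };
        CheckResult::new(bad)
    }
}

fn main() {
    let mut backend = FlakyBackend { bad_restores: 2, restores: 0 };
    match snapshot_until_consistent(&mut backend, 5) {
        Some((_, attempt)) => println!("consistent snapshot after {attempt} attempt(s)"),
        None => println!("no consistent snapshot within the attempt limit"),
    }
}
```

In a real harness the trait methods would presumably wrap the corresponding calls on the QEMU handle, which is only possible once the inconsistency count is readable from outside the crate.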

QEMU launch parameters:

-cpu max \
-icount auto \
-m 8G \
-L /usr/share/seabios/ \
-L /usr/share/qemu/ \
-L /usr/lib/ipxe/qemu/ \
-uuid aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee \
-drive if=ide,media=disk,id=main_drive,file=./image.qcow2,format=qcow2 \
-netdev user,id=mgmt,net=10.0.2.0/24,restrict=on \
-device e1000,netdev=mgmt,id=mgmt-dev \
-netdev user,id=outside,net=10.0.3.0/24,restrict=on,hostfwd=tcp::8443-:443 \
-device e1000,netdev=outside,id=outside-dev \
-S \
-serial telnet::30007,server=on,wait=off \
-serial mon:telnet::5444,server=on,wait=off \
-nodefaults \
-vga none \
-nographic

Device list:

timer, slirp, slirp, cpu_common, cpu, kvm-tpr-opt, apic, 0000:00:00.0/I440FX, PCIHost, PCIBUS, fw_cfg, dma, dma, mc146818rtc, 0000:00:01.1/ide, i2c_bus, 0000:00:01.3/piix4_pm, 0000:00:01.0/PIIX3, i8259, i8259, ioapic, hpet, i8254, pcspk, serial, serial, fdc, ps2kbd, ps2mouse, pckbd, vmmouse, port92, smbus-eeprom, smbus-eeprom, smbus-eeprom, smbus-eeprom, smbus-eeprom, smbus-eeprom, smbus-eeprom, smbus-eeprom, 0000:00:02.0/e1000, 0000:00:03.0/e1000, acpi_build

saruman9 • Oct 31 '25 19:10

@domenukk if it's not done yet, can you assign me this issue?

Rohankaf • Nov 26 '25 13:11

We don't assign issues, but PRs are welcome.

domenukk • Nov 27 '25 00:11