[BUG]: Page cache usage causes post-write sync to fail

Open P33M opened this issue 3 months ago • 2 comments

What happened?

This bug is specific to the Linux variant of Imager. When writing an image to a storage device that is also a target for a device-specific synchronisation operation, the sync op can time out with various bad effects. A prerequisite is that the destination storage has slower write speed than the read/decompress operation speed (typical for most target SD cards).

Here are two reproducers:

  1. Pi 5 8GB running Pi OS booted from USB and writing an image to SD

Insert a blank class A1 SD card in the SD slot and use Imager in CLI mode to write it.

While writing, the buffers/page cache usage reported will climb to use the vast majority of the free RAM. At or near the 100% step in the progress bar, the sync op is issued, takes more than 2 minutes to complete, and this causes a splat in dmesg:

[  726.451032] INFO: task kworker/1:0:2147 blocked for more than 120 seconds.
[  726.451043]       Not tainted 6.12.47-v8-16k+ #635
[  726.451046] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  726.451048] task:kworker/1:0     state:D stack:0     pid:2147  tgid:2147  ppid:2      flags:0x00000008
[  726.451056] Workqueue: events_freezable mmc_rescan
[  726.451067] Call trace:
[  726.451069]  __switch_to+0xf0/0x160
[  726.451076]  __schedule+0x330/0xb68
[  726.451080]  schedule+0x3c/0x148
[  726.451084]  __mmc_claim_host+0xbc/0x1f0
[  726.451088]  mmc_get_card+0x3c/0x58
[  726.451093]  mmc_sd_detect+0x28/0xa0
[  726.451097]  mmc_rescan+0x94/0x330
[  726.451101]  process_one_work+0x15c/0x3c0
[  726.451107]  worker_thread+0x2e4/0x3f0
[  726.451111]  kthread+0x120/0x130
[  726.451115]  ret_from_fork+0x10/0x20
[  811.206493]  mmcblk0: p1 p2
[  811.282818]  mmcblk0: p1 p2

The process eventually succeeds (the final write to the partition table causes a re-enumeration).

  2. Writing to a Pi exposing storage via USB mass-storage gadget

A different variation of this is seen with a Pi 4/5 4GB running Pi OS, and writing to a Pi 4/5 exposing its SD card via mass-storage gadget. The dirty page counts on the gadget Pi climb to a significant fraction of total RAM. In this case, the sync op on the mass-storage interface times out and causes Linux to do a device reset, which is badly handled by the gadget.
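
One way to watch the dirty page build-up described in both reproducers is to sample the Dirty and Writeback fields of /proc/meminfo while the write runs. The following is a minimal Python sketch, not part of Imager; the function names and the figures in SAMPLE are illustrative assumptions, not measurements from this report.

```python
def parse_meminfo(text):
    """Return /proc/meminfo fields as a dict of integers (KiB for sized fields)."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # skip anything that is not "Name: value"
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def pending_writeback_kib(text):
    """Sum of pages waiting to be written (Dirty) and being written (Writeback)."""
    info = parse_meminfo(text)
    return info.get("Dirty", 0) + info.get("Writeback", 0)


# Illustrative values only (assumption), used when /proc/meminfo is unavailable.
SAMPLE = """\
MemTotal:        8245376 kB
MemFree:          412160 kB
Dirty:           5120000 kB
Writeback:         20480 kB
"""

if __name__ == "__main__":
    try:
        with open("/proc/meminfo") as f:
            text = f.read()
    except OSError:
        text = SAMPLE
    print(pending_writeback_kib(text), "KiB dirty or under writeback")
```

Running this in a loop on the writing host (or on the gadget Pi in the second reproducer) should show the total climbing while the image is written and draining only once the sync starts.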

In both cases, Imager should have some notion of writing synchronously to the underlying block device, or of checkpointing its writes to it. That would avoid the buffer bloat, which also makes the progress bar quite inaccurate.
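
As a rough illustration of what checkpointing writes could look like, here is a minimal Python sketch that calls os.fsync after every fixed number of bytes, so the amount of unflushed data stays bounded. It is not Imager's code; the function name, the chunk size and the 16 MiB default checkpoint interval are assumptions chosen for the example.

```python
import io
import os
import tempfile


def write_with_checkpoints(src, dst_fd, chunk_size=1 << 20, checkpoint_bytes=16 << 20):
    """Copy src to dst_fd, forcing a flush every checkpoint_bytes (assumed value).

    Returns (bytes_written, number_of_fsync_calls).
    """
    written = 0
    since_sync = 0
    syncs = 0
    while True:
        chunk = src.read(chunk_size)
        if not chunk:
            break
        view = memoryview(chunk)
        while view:  # os.write may write fewer bytes than requested
            n = os.write(dst_fd, view)
            view = view[n:]
        written += len(chunk)
        since_sync += len(chunk)
        if since_sync >= checkpoint_bytes:
            os.fsync(dst_fd)  # block until the pending data reaches the device
            since_sync = 0
            syncs += 1
    os.fsync(dst_fd)  # final flush covers whatever is left
    syncs += 1
    return written, syncs


if __name__ == "__main__":
    # Demo against a temporary file rather than a real block device.
    fd, path = tempfile.mkstemp()
    try:
        print(write_with_checkpoints(io.BytesIO(b"\0" * (40 << 20)), fd))
    finally:
        os.close(fd)
        os.remove(path)
```

Because each checkpoint blocks until the data is on the device, the bytes-written counter tracks real progress much more closely than it does when everything sits in the page cache.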

Version

1.9.6 (Default)

What host operating system were you using?

Debian and derivatives (eg Ubuntu)

Host OS Version

Raspberry Pi OS bookworm

Selected OS

Raspberry Pi OS bookworm

Which Raspberry Pi Device are you using?

Raspberry Pi 5, 500, and Compute Modules 5

What kind of storage device are you using?

Other

OS Customisation

  • [ ] Yes, I was using OS Customisation when the bug occurred.

Relevant log output


P33M, Sep 23 '25 10:09

Bug report accepted, scheduled for 2.0.

tdewey-rpi, Sep 23 '25 13:09

2.0 will introduce an adaptive pending-write window. Devices with more RAM that may just be suffering bus contention will get a 256MiB write window or 7 second hard limit, before an enforced sync. Devices with less RAM will get a 16MiB write window, or a 3 second hard limit, before an enforced sync.

This mechanism will be applied across Windows, macOS and Linux for consistency - though we've only observed this on Linux, it's certainly plausible on the other OSes.
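
The window policy described above (256 MiB or 7 seconds for devices with more RAM, 16 MiB or 3 seconds for devices with less) can be sketched roughly as follows. This is a hedged Python illustration, not the 2.0 implementation: the class and function names are hypothetical, and the RAM threshold that separates the two cases is not stated in this thread, so the 4 GiB value below is purely an assumption.

```python
import time
from dataclasses import dataclass

MIB = 1024 * 1024


@dataclass(frozen=True)
class WriteWindow:
    max_bytes: int      # pending bytes allowed before an enforced sync
    max_seconds: float  # hard time limit before an enforced sync


LARGE_WINDOW = WriteWindow(256 * MIB, 7.0)  # figures from the comment above
SMALL_WINDOW = WriteWindow(16 * MIB, 3.0)   # figures from the comment above

# ASSUMPTION: the thread does not say where "more RAM" begins.
HYPOTHETICAL_RAM_THRESHOLD = 4 * 1024 * MIB


def choose_window(total_ram_bytes, threshold=HYPOTHETICAL_RAM_THRESHOLD):
    """Pick the large window for hosts at or above the (assumed) threshold."""
    return LARGE_WINDOW if total_ram_bytes >= threshold else SMALL_WINDOW


class SyncTracker:
    """Tracks pending bytes and elapsed time since the last sync."""

    def __init__(self, window, clock=time.monotonic):
        self.window = window
        self._clock = clock
        self._pending = 0
        self._last_sync = clock()

    def record_write(self, nbytes):
        """Account for a write; return True if a sync should be enforced now."""
        self._pending += nbytes
        elapsed = self._clock() - self._last_sync
        return (self._pending >= self.window.max_bytes
                or elapsed >= self.window.max_seconds)

    def mark_synced(self):
        """Reset both limits after the caller has completed a sync."""
        self._pending = 0
        self._last_sync = self._clock()
```

A write loop would call record_write after each chunk, issue the platform's sync when it returns True, and then call mark_synced. The clock is injectable so the time limit can be exercised without waiting.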

tdewey-rpi, Sep 23 '25 14:09

This appears to be resolved as of 2.0 rc7 (tested not with the exact flash storage that failed last time, but with a device that is similarly slow to write).

P33M, Nov 05 '25 14:11

Thanks for the confirmation, @P33M. Closing as fixed, with 2.x-series releases.

tdewey-rpi, Nov 05 '25 16:11