pikvm
File cache corruption related to HDMI
Describe the bug I noticed host systems not booting from USB MSD gadgets because the USB device seems to return wrong data blocks. After some days of investigating we found the following behavior:
- Files held in the kernel's page/buffer cache get silently corrupted.
- The corruption seems to be related to rebooting the host.
- Big files like Ubuntu or grml ISOs are more prone to this, as they offer a larger target.
- filesystems are all mounted read only
- echo 3 > /proc/sys/vm/drop_caches fixes the corrupted files
- that implies the memory is being overwritten by something outside the page cache / VFS layer?
- if I disconnect HDMI from the PiKVM / host system, it won't happen
- Setup: Pi 4 B 2GB + PiKVM v3 HAT; setting the backpower jumper on or off makes no difference
- I see this on all 3 PiKVMs we use, with the latest image and also with the oldest one I found, from 11/2023
- corruption pattern in the files:
- error pattern is 4k aligned (0x1000)
- mostly a 256-byte block of 0x80 0xEA
- sometimes a 256-byte block of 0x80 0xxx
- so far, 1-5 such 256-byte blocks in a 1.4 GB file
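The pattern described above could be checked with a small comparison script. This is only a sketch: it assumes you saved a copy of the file while the checksum was wrong (the names `good_path` and `bad_path` are hypothetical) and still have a known-good copy to compare against. It walks both files in 256-byte steps and reports every differing block, whether its offset is 4 KiB aligned, and its first bytes.

```python
import sys

BLOCK = 256    # size of the corrupted runs reported in this issue
PAGE = 0x1000  # 4 KiB, the alignment reported in this issue


def find_corrupt_blocks(good_path, bad_path, block=BLOCK):
    """Return (offset, page_aligned, first_8_bytes_hex) for each differing block."""
    hits = []
    offset = 0
    with open(good_path, "rb") as good, open(bad_path, "rb") as bad:
        while True:
            a = good.read(block)
            b = bad.read(block)
            if not a and not b:
                break  # both files exhausted
            if a != b:
                hits.append((offset, offset % PAGE == 0, b[:8].hex()))
            offset += block
    return hits


if __name__ == "__main__" and len(sys.argv) == 3:
    for off, aligned, head in find_corrupt_blocks(sys.argv[1], sys.argv[2]):
        print(f"offset=0x{off:x} page_aligned={aligned} starts_with={head}")
```

If the observation holds, most reported blocks should sit at page-aligned offsets and start with `80ea`.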
To Reproduce Steps to reproduce the behavior, like:
- Upload a big file
- select the file in MSD as Flash/RO
- run watch -n1 sha256sum /var/lib/kvmd/msd/$file
- Connect HDMI to DuT / PiKVM
- reboot the DuT until the checksum changes
- disconnect HDMI
- echo 3 > /proc/sys/vm/drop_caches
- checksum is correct again
- reboot the DuT several times -> checksum doesn't change
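As an alternative to `watch`, the checksum polling in the steps above could be scripted so that each change is logged with a timestamp, which makes it easier to line up with DuT reboots. This is a rough sketch, not part of PiKVM; the helper names `sha256_of` and `digest_changes` are made up for illustration. It reads the file through the normal page cache, the same way `sha256sum` does.

```python
import hashlib
import sys
import time


def sha256_of(path, chunk=1 << 20):
    """Hash the file in 1 MiB pieces, reading through the page cache."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for piece in iter(lambda: f.read(chunk), b""):
            h.update(piece)
    return h.hexdigest()


def digest_changes(path, interval=1.0, polls=None):
    """Yield (timestamp, digest) whenever the digest differs from the last poll.

    The first poll always yields, giving the baseline digest.
    With polls=None the loop runs until interrupted.
    """
    last = None
    count = 0
    while polls is None or count < polls:
        digest = sha256_of(path)
        if digest != last:
            yield time.strftime("%Y-%m-%d %H:%M:%S"), digest
            last = digest
        count += 1
        time.sleep(interval)


if __name__ == "__main__" and len(sys.argv) == 2:
    for stamp, digest in digest_changes(sys.argv[1]):
        print(stamp, digest, flush=True)
```

Run it with the path of the image under `/var/lib/kvmd/msd/` as the only argument; a second output line after the baseline means the cached content changed.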
Expected behavior Checksum doesn't change.
Screenshots
[Screenshot](https://github.com/pikvm/pikvm/assets/7650149/4808bd6f-e54f-4705-9ad7-3acff9b36358)
Desktop (please complete the following information):
- OS: [Ubuntu]
- Browser [chrome]
- Version [122.0.6261.128]
PiKVM info:
- Raspberry Pi board version [RPi 4 B 2GB]
- PiKVM platform [v3-hdmi]
- Video capture type [CSI bridge]
- KVMD version: 3.333
- uStreamer version: 6.11
- Linux kernel: 6.6.21-4-rpi
Very weird.
- Do you have a Pi 4 with 4 GB RAM? I'm wondering whether this problem reproduces there, because different models map memory regions differently.
- Please check different resolutions besides 1080p, for example 720p, 1024x768. Does this problem always reproduce?
Sup?