Unable to get ramoops working
Describe the bug
No data are written to /sys/fs/pstore/ when a test kernel panic is triggered.
Steps to reproduce the behaviour
- Add dtoverlay=ramoops to /boot/config.txt and reboot
- echo 10 > /proc/sys/kernel/panic
- echo c > /proc/sysrq-trigger
The kernel panics, but after rebooting no data have been written to /sys/fs/pstore/:
# tree /var/lib/systemd/pstore/ /sys/fs/pstore
/var/lib/systemd/pstore/
/sys/fs/pstore
0 directories, 0 files
Device(s)
Raspberry Pi 4 Mod. B
System
- Arch Linux ARM armv7h
5.15.84-1-rpi-ARCH #1 SMP Mon Dec 19 13:37:50 MST 2022 armv7l GNU/Linux
% vcgencmd version
Dec 12 2022 11:56:56
Copyright (c) 2012 Broadcom
version ed6f6b8fcdc6476410b9cf75d141633461d34bdd (clean) (release) (start)
Logs
No response
Additional context
# cat /boot/config.txt
display_auto_detect=1
dtoverlay=ramoops
arm_boost=1
% zgrep PSTORE /proc/config.gz
CONFIG_PSTORE=y
CONFIG_PSTORE_DEFAULT_KMSG_BYTES=10240
CONFIG_PSTORE_DEFLATE_COMPRESS=y
# CONFIG_PSTORE_LZO_COMPRESS is not set
# CONFIG_PSTORE_LZ4_COMPRESS is not set
# CONFIG_PSTORE_LZ4HC_COMPRESS is not set
# CONFIG_PSTORE_842_COMPRESS is not set
# CONFIG_PSTORE_ZSTD_COMPRESS is not set
CONFIG_PSTORE_COMPRESS=y
CONFIG_PSTORE_DEFLATE_COMPRESS_DEFAULT=y
CONFIG_PSTORE_COMPRESS_DEFAULT="deflate"
CONFIG_PSTORE_CONSOLE=y
# CONFIG_PSTORE_PMSG is not set
CONFIG_PSTORE_RAM=y
# CONFIG_PSTORE_BLK is not set
% dmesg | grep -E 'pstore|ramoops'
[ +0.000371] pstore: Registered ramoops as persistent store backend
[ +0.000022] ramoops: using 0x10000@0xb000000, ecc: 0
[ +0.000528] pstore: Using crash dump compression: deflate
[ +0.003103] systemd[1]: Starting Load Kernel Module efi_pstore...
[ +0.001190] systemd[1]: modprobe@efi_pstore.service: Deactivated successfully.
[ +0.000439] systemd[1]: Finished Load Kernel Module efi_pstore.
[ +0.000178] systemd[1]: Platform Persistent Storage Archival was skipped because of an unmet condition check (ConditionDirectoryNotEmpty=/sys/fs/pstore).
# grep "" /sys/module/ramoops/parameters/*
/sys/module/ramoops/parameters/console_size:0
/sys/module/ramoops/parameters/dump_oops:-1
/sys/module/ramoops/parameters/ecc:0
/sys/module/ramoops/parameters/ftrace_size:0
/sys/module/ramoops/parameters/max_reason:2
/sys/module/ramoops/parameters/mem_address:184549376
/sys/module/ramoops/parameters/mem_size:65536
/sys/module/ramoops/parameters/mem_type:0
/sys/module/ramoops/parameters/pmsg_size:0
/sys/module/ramoops/parameters/record_size:16384
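For anyone comparing the two outputs above: sysfs reports mem_address and mem_size in decimal, while the ramoops dmesg line prints them in hex. A minimal sketch for converting between the two (the helper name to_hex is made up for illustration):

```shell
# Convert a decimal value from /sys/module/ramoops/parameters/ to hex,
# so it can be compared with the "ramoops: using SIZE@ADDRESS" dmesg line.
# to_hex is a hypothetical helper name, not part of any tool.
to_hex() {
    printf '0x%x\n' "$1"
}

to_hex 184549376   # mem_address -> 0xb000000
to_hex 65536       # mem_size    -> 0x10000
```

Both values match "using 0x10000@0xb000000", so the module parameters and the reserved region agree.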
Make sure to disable systemd-pstore.service if you want all logs from pstore. Other ramoops collection software may work better (I haven't tested that), but at least with systemd-pstore I always lost parts of or all ramoops logs. Getting the logs yourself from /sys/fs/pstore/ is easy, but please note that you have to archive them yourself.
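Archiving by hand, as mentioned above, could look roughly like the following sketch. The function name archive_pstore and the default destination /var/log/pstore-archive are assumptions chosen for illustration; the function only copies records and leaves the originals in place.

```shell
# Copy every record from the pstore directory into a timestamped folder.
# archive_pstore and /var/log/pstore-archive are hypothetical names.
# Prints the number of files copied; creates nothing if pstore is empty.
archive_pstore() {
    src=${1:-/sys/fs/pstore}
    dest_root=${2:-/var/log/pstore-archive}
    dest="$dest_root/$(date +%Y%m%d-%H%M%S)"
    copied=0
    for f in "$src"/*; do
        [ -f "$f" ] || continue          # also skips the unexpanded glob
        mkdir -p "$dest" || return 1
        cp -p "$f" "$dest/" || return 1  # -p keeps the record's timestamp
        copied=$((copied + 1))
    done
    echo "$copied"
}

# Example (run as root early after boot):
#   archive_pstore /sys/fs/pstore /var/log/pstore-archive
```

Running something like this once per boot, before anything else touches /sys/fs/pstore/, would keep the records without relying on systemd-pstore.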
Just tested your command sequence on my Pi 4B with armv8 kernel 5.15.84-v8+ (Raspberry Pi OS 64-bit) and it worked fine, apart from the usual flipped bits in the log, which can make relevant parts of the log misleading.
To get reliable non-corrupt logs you may have to enable ECC for ramoops. The current dtoverlay doesn't support that. I plan to send a patch once local testing is complete.
What's the output on your system for
dmesg|grep ramoops
Also copied setup instructions from https://github.com/raspberrypi/linux/issues/5063#issuecomment-1167966601
sudo grep "" /sys/module/ramoops/parameters/*
/sys/module/ramoops/parameters/console_size:16384
/sys/module/ramoops/parameters/dump_oops:-1
/sys/module/ramoops/parameters/ecc:0
/sys/module/ramoops/parameters/ftrace_size:0
/sys/module/ramoops/parameters/max_reason:2
/sys/module/ramoops/parameters/mem_address:184549376
/sys/module/ramoops/parameters/mem_size:65536
/sys/module/ramoops/parameters/mem_type:0
/sys/module/ramoops/parameters/pmsg_size:0
/sys/module/ramoops/parameters/record_size:16384
and
dmesg|grep ramoops
[ 0.042319] printk: console [ramoops-1] enabled
[ 0.042335] pstore: Registered ramoops as persistent store backend
[ 0.042347] ramoops: using 0x10000@0xb000000, ecc: 0
Also tried the config.txt line dtoverlay=ramoops,console-size=0x4000,ecc=1,dump_oops=1 to get ecc and dump_oops working, but that didn't do anything...
(This is on the 6.1.9-v8+ kernel.)
mount | grep pstore
pstore on /sys/fs/pstore type pstore (rw,nosuid,nodev,noexec,relatime)
The output of sudo ls /sys/fs/pstore/ is empty.
sudo modprobe configs
zcat /proc/config.gz | grep PSTORE
# CONFIG_EFI_VARS_PSTORE is not set
CONFIG_PSTORE=y
CONFIG_PSTORE_DEFAULT_KMSG_BYTES=10240
CONFIG_PSTORE_DEFLATE_COMPRESS=y
# CONFIG_PSTORE_LZO_COMPRESS is not set
# CONFIG_PSTORE_LZ4_COMPRESS is not set
# CONFIG_PSTORE_LZ4HC_COMPRESS is not set
# CONFIG_PSTORE_842_COMPRESS is not set
# CONFIG_PSTORE_ZSTD_COMPRESS is not set
CONFIG_PSTORE_COMPRESS=y
CONFIG_PSTORE_DEFLATE_COMPRESS_DEFAULT=y
CONFIG_PSTORE_COMPRESS_DEFAULT="deflate"
CONFIG_PSTORE_CONSOLE=y
# CONFIG_PSTORE_PMSG is not set
# CONFIG_PSTORE_FTRACE is not set
CONFIG_PSTORE_RAM=y
# CONFIG_PSTORE_BLK is not set
This also didn't help:
echo Y > /sys/module/printk/parameters/always_kmsg_dump
echo Y > /sys/module/kernel/parameters/crash_kexec_post_notifiers
Nor did setting kernel.printk = 7 3 4 1 3 in /etc/sysctl.d/98-rpi.conf ...
With the current kernel and device-tree, the ecc parameter is not functional. It will work neither in cmdline.txt nor in config.txt. That said, ramoops should still work.
I haven't tested kernel 6.1 yet, so there may be problems I'm not yet aware of. Will test after the weekend.
Even with kernel 6.1.9-v8+ from "rpi-update next", ramoops works for me. The only obvious remaining difference from your setup (there may be others) is that I'm running Raspberry Pi OS 64-bit and you're running Arch 32-bit. I have not tested whether that dtoverlay even works with 32-bit kernels.
It might make sense to also check with rpi-eeprom-update which SPI bootloader you're using. Please note that after a poweroff, /sys/fs/pstore/ will be empty. Modifying ecc settings manually might also interfere with ramoops.
Not running in 32-bit. (OP is... I'm running in 64-bit mode...)
uname -m
aarch64
Not sure what rpi-eeprom-update is supposed to show?
sudo rpi-eeprom-update -a
BOOTLOADER: up to date
CURRENT: Wed 11 Jan 2023 05:40:52 PM UTC (1673458852)
LATEST: Wed 04 Jan 2023 10:27:49 AM UTC (1672828069)
RELEASE: beta (/lib/firmware/raspberrypi/bootloader/beta)
Use raspi-config to change the release.
VL805_FW: Dedicated VL805 EEPROM
VL805: up to date
CURRENT: 000138c0
LATEST: 000138c0
How are you getting the kernel messages?
These are the current module settings:
sudo grep "" /sys/module/ramoops/parameters/*
/sys/module/ramoops/parameters/console_size:16384
/sys/module/ramoops/parameters/dump_oops:-1
/sys/module/ramoops/parameters/ecc:0
/sys/module/ramoops/parameters/ftrace_size:0
/sys/module/ramoops/parameters/max_reason:2
/sys/module/ramoops/parameters/mem_address:184549376
/sys/module/ramoops/parameters/mem_size:65536
/sys/module/ramoops/parameters/mem_type:0
/sys/module/ramoops/parameters/pmsg_size:0
/sys/module/ramoops/parameters/record_size:16384
I do a soft reboot with sudo reboot and I'm not seeing anything via sudo ls /sys/fs/pstore/.
@satmandu have you made sure that systemd-pstore never runs? It is enabled by default.
After a normal reboot (no crash), I have these files:
root@pi-test-1:~# ls -l /sys/fs/pstore/
total 0
-r--r--r-- 1 root root 16372 Aug 7 15:25 console-ramoops-0
The only line for ramoops in my config.txt is: dtoverlay=ramoops,console-size=0x4000
That works well for me.
@satmandu have you made sure that systemd-pstore never runs? It is enabled by default.
Thanks! That was my issue.
Just to reiterate the steps I needed to get this working with the rpi 6.1.x kernel in both rpi-os 64-bit and ubuntu:
systemctl disable systemd-pstore
# Ensure correct kernel.printk set in /etc/sysctl.d/98-rpi.conf
cat /etc/sysctl.d/98-rpi.conf
kernel.printk = 7 3 4 1 3
set config.txt to have dtoverlay=ramoops,console-size=0x4000
One could also optionally add the following to /etc/rc.local:
echo Y > /sys/module/printk/parameters/always_kmsg_dump
(Maybe at some point, you might be willing to ask the RPI folks to adjust the default kernel.printk setting? And maybe also we could try to get systemd's pstore daemon fixed too?)
Thanks so much for helping with this. @graysky2 I hope you can get it working on the 32-bit kernel...
Back then, I adjusted the kernel.printk setting because I read it somewhere on the internet. However, most kernel messages (until systemd takes over) are available even with the default kernel.printk settings. Will experiment some more because I'm going to use ramoops in my little (~600 devices) fleet in production soon.
most kernel messages (until systemd takes over) are available
The kernel command line (cmdline.txt) parameter ignore_loglevel prints all kernel messages to the console, even after systemd has started.
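To confirm that a parameter such as ignore_loglevel actually reached the running kernel, one could check /proc/cmdline. A small sketch (has_cmdline_param is a made-up helper name; the optional second argument lets it read another file such as cmdline.txt):

```shell
# Return success if the given parameter appears as a whole word on the
# kernel command line. has_cmdline_param is a hypothetical helper.
# $1 = parameter (e.g. ignore_loglevel), $2 = file to read (optional).
has_cmdline_param() {
    tr ' ' '\n' < "${2:-/proc/cmdline}" | grep -qxF -- "$1"
}

# Example:
#   has_cmdline_param ignore_loglevel && echo "ignore_loglevel is active"
```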
Hello, I am having trouble with this now on a raspberry pi 4b:
Linux mhs-por-dev-unset 6.6.20+rpt-rpi-v8 #1 SMP PREEMPT Debian 1:6.6.20-1+rpt1 (2024-03-07) aarch64 GNU/Linux
I have disabled systemd-pstore.
config.txt modifications:
enable_uart=1
dtoverlay=ramoops,console-size=0x4000
cmdline.txt:
console=serial0,115200 console=tty1 root=PARTUUID=bd05bf36-02 rootfstype=ext4 fsck.repair=yes rootwait
If I do
sysctl kernel.panic=10
echo c | sudo tee /proc/sysrq-trigger
I do see a panic on the UART, but there is also an error (ENOSPC?) and the pstore is empty after the device reboots.
[ 63.708118] sysrq: Trigger a crash
[ 63.711618] Kernel panic - not syncing: sysrq triggered crash
[ 63.717463] CPU: 3 PID: 866 Comm: tee Tainted: G C 6.6.20+rpt-rpi-v8 #1 Debian 1:6.6.20-1+rpt1
[ 63.727726] Hardware name: Raspberry Pi 4 Model B Rev 1.5 (DT)
[ 63.733658] Call trace:
[ 63.736147] dump_backtrace+0xa0/0x100
[ 63.739975] show_stack+0x20/0x38
[ 63.743352] dump_stack_lvl+0x48/0x60
[ 63.747084] dump_stack+0x18/0x28
[ 63.750460] panic+0x328/0x390
[ 63.753577] sysrq_handle_crash+0x24/0x30
[ 63.757663] __handle_sysrq+0xb8/0x1e8
[ 63.761482] write_sysrq_trigger+0x7c/0xb0
[ 63.765656] proc_reg_write+0xa4/0x100
[ 63.769480] vfs_write+0xcc/0x310
[ 63.772855] ksys_write+0x78/0x118
[ 63.776317] __arm64_sys_write+0x24/0x38
[ 63.780304] invoke_syscall+0x50/0x128
[ 63.784105] el0_svc_common.constprop.0+0x48/0xf0
[ 63.788875] do_el0_svc+0x24/0x38
[ 63.792234] el0_svc+0x40/0xe8
[ 63.795330] el0t_64_sync_handler+0x100/0x130
[ 63.799747] el0t_64_sync+0x190/0x198
[ 63.803458] SMP: stopping secondary CPUs
[ 63.807435] Kernel Offset: 0x1812a00000 from 0xffffffc080000000
[ 63.813438] PHYS_OFFSET: 0x0
[ 63.816354] CPU features: 0x0,80000201,3c020000,0000421b
[ 63.821740] Memory Limit: none
[ 63.827967] pstore: backend (ramoops) writing error (-28)
[ 63.833446] Rebooting in 10 seconds..
[ 0.000000] Booting Linux on physical CPU 0x0000000000 [0x410fd083]
If instead I do a normal sudo shutdown -r now I do see the console save in pstore: console-ramoops-0
pi@mhs-por-dev-unset:~ $ dmesg | grep pstore\\\|oops
[ 0.000000] OF: reserved mem: 0x000000000b000000..0x000000000b00ffff (64 KiB) map non-reusable ramoops@b000000
[ 0.038929] pstore: Using crash dump compression: deflate
[ 0.038953] printk: console [ramoops-1] enabled
[ 0.039434] pstore: Registered ramoops as persistent store backend
[ 0.039452] ramoops: using 0x10000@0xb000000, ecc: 0
pi@mhs-por-dev-unset:~ $ sudo grep -r . /sys/module/pstore/parameters/
/sys/module/pstore/parameters/update_ms:-1
/sys/module/pstore/parameters/kmsg_bytes:10240
/sys/module/pstore/parameters/backend:ramoops
/sys/module/pstore/parameters/compress:deflate
pi@mhs-por-dev-unset:~ $ sudo grep -r . /sys/module/ramoops/parameters/
/sys/module/ramoops/parameters/mem_address:184549376
/sys/module/ramoops/parameters/dump_oops:-1
/sys/module/ramoops/parameters/ecc:0
/sys/module/ramoops/parameters/max_reason:2
/sys/module/ramoops/parameters/record_size:16384
/sys/module/ramoops/parameters/pmsg_size:0
/sys/module/ramoops/parameters/mem_type:0
/sys/module/ramoops/parameters/mem_size:65536
/sys/module/ramoops/parameters/console_size:16384
/sys/module/ramoops/parameters/ftrace_size:0
More strangely, I did see a crash dump the first time I enabled this. At that point I had not disabled systemd-pstore and had only added dtoverlay=ramoops; I believe everything else was the same. I did see the ENOSPC error on the UART that time, but I did not take any more notes (because it worked).
Hi, I'm also having trouble with ramoops but only on Raspberry Pi 4B.
config.txt
enable_uart=1
dtoverlay=ramoops-pi4
confirmed that ramoops is enabled:
root@pi4btw:/sys/fs/pstore# dmesg | grep ramoops
[ 0.047397] pstore: Registered ramoops as persistent store backend
[ 0.047424] ramoops: using 0x10000@0xb000000, ecc: 0
I crashed each Pi with these commands:
echo 10 > /proc/sys/kernel/panic
echo c > /proc/sysrq-trigger
I did the same testing with a Pi 3B (with dtoverlay=ramoops), a Pi 4B and a Pi 5, and only the Pi 4B fails to write to /sys/fs/pstore/. On the other hand, all Pis succeed in writing non-panic logs to /sys/fs/pstore/console-ramoops-0 when dtoverlay=ramoops,console-size=0x4000 is set. Is this a hardware issue on the Pi 4? I find it odd that only the Pi 4 fails with the same software.
I tried this in the latest release of bookworm (kernel 6.6.47+rpt-rpi-v8) and bullseye (kernel 6.1.21-v8+).
By the way, the Pi 5 seems to fail to automatically load ramoops-pi4.dtbo when config.txt is set to dtoverlay=ramoops.
EDIT: finished writing the comment; I accidentally pressed the comment button before I was done.
Are you sure it never works? I sometimes get output, but on the flip side I find reboot logs equally unreliable.
With an instrumented kernel, this is the output after a few reboots:
[ 0.038638] ramoops: found existing invalid buffer, size 0, start 2097152 (43474244)
[ 0.038677] ramoops: no valid data in buffer (sig = 0x43470204)
[ 0.038707] ramoops: found existing empty buffer (43474244)
[ 0.038732] ramoops: no valid data in buffer (sig = 0x43464244)
[ 0.038675] ramoops: no valid data in buffer (sig = 0x41454264)
[ 0.038707] ramoops: no valid data in buffer (sig = 0x53470200)
[ 0.038737] ramoops: no valid data in buffer (sig = 0xc3474244)
[ 0.038762] ramoops: no valid data in buffer (sig = 0x53464244)
and after a forced panic:
[ 0.038640] ramoops: no valid data in buffer (sig = 0x41474244)
[ 0.038672] ramoops: no valid data in buffer (sig = 0x53470200)
[ 0.038702] ramoops: found existing invalid buffer, size 1073741824, start 0 (43474244)
[ 0.038733] ramoops: no valid data in buffer (sig = 0x53064244)
And for comparison, this is after a Pi 5 panic:
[ 0.013438] ramoops: found existing buffer, size 8602, start 8602 (43474244)
[ 0.013466] ramoops: found existing empty buffer (43474244)
[ 0.013471] ramoops: found existing empty buffer (43474244)
[ 0.013477] ramoops: found existing empty buffer (43474244)
The hex numbers in parentheses are the signatures, which should be 43474244 (DBGC). You'll see that many of the entries have one or more bit errors, which suggests that the RAM content is just not being maintained.
The thinking here is that the period of SDRAM controller calibration may stall refresh cycles for too long, but we're not sure why Pi 5 doesn't seem to suffer in the same way.
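To put numbers on the bit errors in the signatures quoted above, one can XOR each observed value with the expected 0x43474244 and count the set bits. A rough sketch (bit_errors is a made-up helper name, and it assumes a shell with 64-bit arithmetic):

```shell
# Count how many bits of an observed ramoops buffer signature differ
# from the expected value 0x43474244 ("DBGC").
# bit_errors is a hypothetical helper; assumes 64-bit shell arithmetic.
EXPECTED=0x43474244

bit_errors() {
    diff=$(( $1 ^ $EXPECTED ))
    count=0
    while [ "$diff" -ne 0 ]; do
        count=$(( count + (diff & 1) ))
        diff=$(( diff >> 1 ))
    done
    echo "$count"
}

bit_errors 0x43474244   # intact signature         -> 0
bit_errors 0x43464244   # from the reboot log      -> 1
bit_errors 0x43470204   # from the reboot log      -> 2
bit_errors 0x53470200   # from the reboot log      -> 4
```

Every corrupted signature listed above differs from the expected value by only one to four bits, which fits the picture of decaying RAM contents rather than the buffer being overwritten.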
By the way, the Pi 5 seems to fail to automatically load ramoops-pi4.dtbo when config.txt is set to dtoverlay=ramoops
Indeed - that will be fixed.
And now it is - see 9557336c4fc4ac4606ac9e78c239aa689c26e870.
Are you sure it never works? I sometimes get output, but on the flip side I find reboot logs equally unreliable.
Yes, at least from my testing with Pi 4s, regular reboots seem to leave the logs quite consistently, while panic logs are never there.
May I know how I can see the logs for ramoops like these?
[ 0.038640] ramoops: no valid data in buffer (sig = 0x41474244)
[ 0.038672] ramoops: no valid data in buffer (sig = 0x53470200)
[ 0.038702] ramoops: found existing invalid buffer, size 1073741824, start 0 (43474244)
[ 0.038733] ramoops: no valid data in buffer (sig = 0x53064244)
EDIT: added quotes
I'm on Raspberry Pi 4 Model B Rev 1.5 if that matters in any way.
Maybe I have not rebooted the device enough times to see the reboot logs become unreliable.
Running sudo rpi-update pulls/6391 will install a trial kernel with the added ramoops instrumentation, which will give us a better picture of whether RAM contents are decaying.
More test results:
- Rev 1.2 4B gives the patchy bit rot results reported above
- Rev 1.4 4B seems to preserve RAM contents across a reboot well enough to populate the pstore (and thence /var/lib/systemd/pstore)
- Rev 1.5 4B trashes RAM contents - all the signatures appear as 0xffffffff
- A second rev 1.2 4B preserves the RAM contents.
- Rev 1.3 4B preserves the RAM contents.
Ignoring the results from the first rev 1.2 (of dubious provenance - it had a gold sticker on), everything prior to the rev 1.5 seems to work with ramoops. Presumably something significant changed in the transition to the dual Dialog PMICs.
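For anyone wanting to check whether their own board falls into the affected group, here is a small sketch that classifies a model string. It encodes only the handful of test results listed above, so treat it as a hint rather than a definitive list; the function name panic_logs_at_risk is made up, and reading the string from /proc/device-tree/model is an assumption about the running system.

```shell
# Return success if the model string matches the one board revision that
# lost its RAM contents on a panic reboot in the tests above (4B rev 1.5).
# panic_logs_at_risk is a hypothetical helper based only on this thread.
panic_logs_at_risk() {
    case "$1" in
        "Raspberry Pi 4 Model B Rev 1.5"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Example (assumes the device tree exposes the model string here):
#   model=$(tr -d '\0' < /proc/device-tree/model)
#   panic_logs_at_risk "$model" && echo "panic logs may not survive"
```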
If that's a Pi from my drawer, the gold sticker on string tag means "Golden Version", so should be good.
I try to stay away from your drawers
Update: I'm seeing the SDRAM power rail drop on a crash reboot, but not on a normal reboot. This is because the panic bypasses the normal kernel shutdown, including returning the SD card voltage to 3.3V and power-cycling it. This can interfere with normal booting, so the firmware plays it safe and forces a global reset. We don't understand why the RAM contents survive on the older revisions but not the 1.5, but the 1.5 does use a different PMIC.
Fortunately, 4Bs since the 1.3 have the ability to power-cycle just the SD card, meaning that on those boards the global reset can be replaced by a card power-cycle and a watchdog reset (this is what the kernel would normally do). I'll put together a test EEPROM image with that functionality and see if ramoops then works.
Make sure to disable systemd-pstore.service if you want all logs from pstore. Other ramoops collection software may work better (I haven't tested that), but at least with systemd-pstore I always lost parts of or all ramoops logs. Getting the logs yourself from /sys/fs/pstore/ is easy.
By the way, if you enable too much ECC (128 bytes or more), at least some versions of the kernel will hang on boot when parsing the crash log. ecc=64 or less is safe and helpful, though.
Attached (rpi-eeprom-recovery.zip) is a trial build of a Pi 4 EEPROM image that uses a soft reboot to recover from incorrect SD card voltage on boards with an SD power switch. This should make ramoops completely reliable on your rev 1.5 Pi 4.
Extract the contents onto a blank SD card, and boot with it inserted.