rasdaemon
Rasdaemon is a RAS (Reliability, Availability and Serviceability) logging tool. It records memory errors using the EDAC tracing events. EDAC is a Linux kernel subsystem that handles detection of ECC...
- Introduction: Identify memory row faults among memory CE faults and isolate the physical memory pages where the row faults occur. This method can effectively prevent CE storms or memory UCE...
1. rasdaemon: generic fixes
   - Fix unused-variable build warnings when AMP RAS errors are not enabled
   - Update memory failure action page types
2. ras-mc-ctl: Add support for CXL...
It is not used and prevents ras-mc-ctl.service from starting on Fedora when SELinux is in Enforcing mode. Resolves: rhbz#1836861 Resolves: https://github.com/fedora-selinux/selinux-policy/issues/2054 Resolves: https://github.com/mchehab/rasdaemon/issues/79
The commit introducing block_rq_error tracepoint has been backported in RHEL 9.1, so improve the check for block_rq_error presence to use it.
The Intel Corporation DQ57TM motherboard, booted in UEFI mode, showed the right locations with `--guess-labels` but still needs the following `labels/intel` addition to show correct DIMM numbers. For slot2 and...
I have noticed this random sorting behavior of DIMM numbers and channel/riser locations for a while, and also verified that git master version [f9cb13b](https://github.com/mchehab/rasdaemon/commit/f9cb13b8643d375454df152269c3a974b6b91983) of 2024-02-05 shows the same behavior. Not sure...
For the Apple MacPro 1,1 (Mac-F4208DC8) and MacPro 2,1 (Mac-F4208DA9) these are the correct labels for the DIMM numbers 1-4 on each DIMM Riser A&B for a total of 8...
Enable error decoding support for the newly added extended error bit descriptions from MCA_CTL_SMU. b'0:11 can be decoded from existing array smca_smu2_mce_desc. Define a function to append the newly defined...
I find myself having to delete /var/lib/rasdaemon/ras-mc_event.db manually. `ras-mc-ctl --flush-errors` would be more convenient.
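Until something like the requested option exists, a manual cleanup can be scripted. The following is a minimal sketch, assuming the event database is a regular SQLite file (the path is the one quoted in the request); `flush_tables` is a hypothetical helper, not part of rasdaemon or `ras-mc-ctl`. It empties every user table instead of deleting the file, and makes no assumption about the table names.

```python
import sqlite3


def flush_tables(db_path):
    """Delete all rows from every user table in an SQLite file.

    Hypothetical helper, not part of rasdaemon. Returns a dict that maps
    each table name to the number of rows it held before the flush.
    """
    deleted = {}
    conn = sqlite3.connect(db_path)
    try:
        with conn:  # commits on success, rolls back if an error is raised
            tables = [row[0] for row in conn.execute(
                "SELECT name FROM sqlite_master "
                "WHERE type = 'table' AND name NOT LIKE 'sqlite_%'")]
            for table in tables:
                # Quote the identifier so unusual table names stay valid SQL.
                quoted = '"' + table.replace('"', '""') + '"'
                count = conn.execute(
                    f"SELECT COUNT(*) FROM {quoted}").fetchone()[0]
                conn.execute(f"DELETE FROM {quoted}")
                deleted[table] = count
    finally:
        conn.close()
    return deleted
```

Calling `flush_tables("/var/lib/rasdaemon/ras-mc_event.db")` would likely need root privileges, and it is probably safest to stop the rasdaemon service first so the daemon is not writing to the file during the flush.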