Failing disk in sarel
The management server, sarel, is reporting a SMART failure:
This message was generated by the smartd daemon running on:
host name: sarel
DNS domain: openstreetmap.org
The following warning/error was logged by the smartd daemon:
Device: /dev/sg1 [cciss_disk_02] [SCSI], SMART Failure: HARDWARE IMPENDING FAILURE GENERAL HARD DRIVE FAILURE
Device info:
[HP EG0146FAWHU HPDG], lu id: 0x5000c5001d8ab507, S/N: 3SD2XT4X00009032QGC9, 146 GB
For details see host's SYSLOG.
You can also use the smartctl utility for further investigation.
The original message about this issue was sent at Sat Nov 13 10:49:32 2021 UTC
Another message will be sent in 24 hours if the problem persists.
The disk has now been marked as failed by the RAID controller and, I think, has been substituted by a spare:
This is a RAID status update from cciss-vol-statusd. The cciss_vol_status
program reports that one of the RAIDs changed state:
/dev/sda: (Smart Array P410i) RAID 6 Volume 0 status: OK. At least one spare drive designated. At least one activated on-line spare drive is completely rebuilt on this logical drive. At least one spare drive activated.
Failed drives:
connector 1I box 1 bay 3 HP EG0146FAWHU 3SD2XT4X00009032QGC9 HPDG
Drives currently substituted for by spares:
connector 2I box 1 bay 5 HP EG0146FAWHU 6SD23XWK0000B123HTV5 HPDE
Total of 1 failed physical drives detected on this logical drive.
Report from /usr/local/bin/cciss-vol-statusd on sarel.openstreetmap.org
If I'm reading things right, the array is RAID 6 over four drives, with a fifth that was a spare until now, though ssacli still shows that fifth disk as a spare right now?
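To make the status report above easier to cross-check against what ssacli shows, here is a rough Python sketch that pulls the drive entries out of the report text. The function name `parse_report`, the dictionary keys, and the assumed field order (connector, box, bay, vendor, model, serial, firmware) are my own guesses based only on the two drive lines quoted above, not a documented cciss_vol_status format.

```python
import re

# Report text copied from the cciss-vol-statusd message quoted above.
REPORT = """\
/dev/sda: (Smart Array P410i) RAID 6 Volume 0 status: OK. At least one spare drive designated. At least one activated on-line spare drive is completely rebuilt on this logical drive. At least one spare drive activated.
Failed drives:
connector 1I box 1 bay 3 HP EG0146FAWHU 3SD2XT4X00009032QGC9 HPDG
Drives currently substituted for by spares:
connector 2I box 1 bay 5 HP EG0146FAWHU 6SD23XWK0000B123HTV5 HPDE
Total of 1 failed physical drives detected on this logical drive.
"""

# Assumed layout of a drive line, inferred from the two samples above.
DRIVE_RE = re.compile(
    r"connector (?P<connector>\S+) box (?P<box>\d+) bay (?P<bay>\d+)\s+"
    r"(?P<vendor>\S+)\s+(?P<model>\S+)\s+(?P<serial>\S+)\s+(?P<firmware>\S+)"
)

# Section headings as they appear in the report, mapped to hypothetical keys.
HEADINGS = {
    "Failed drives:": "failed",
    "Drives currently substituted for by spares:": "substituted_by_spares",
}


def parse_report(text):
    """Group drive lines under the report heading that precedes them."""
    sections = {key: [] for key in HEADINGS.values()}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if line in HEADINGS:
            current = HEADINGS[line]
            continue
        match = DRIVE_RE.match(line)
        if match and current:
            sections[current].append(match.groupdict())
        else:
            # Any other line ends the current drive list.
            current = None
    return sections


result = parse_report(REPORT)
for section, drives in result.items():
    for drive in drives:
        print(section, drive["connector"], "bay", drive["bay"], drive["serial"])
```

The serial numbers this prints can then be compared by hand with the physical drive listing from ssacli to see which bay the controller currently treats as the spare.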