Failing Disk in Errol

Open Firefishy opened this issue 3 years ago • 6 comments

Failing disk in errol needs replacing:

Device: /dev/sg2 [areca_disk#05_enc#01], Self-Test Log error count increased from 2 to 3

Device info:
WDC WD2003FYYS-02W0B0, S/N:WD-WMAY00506800, WWN:5-0014ee-0ad1ed740, FW:01.01D01, 2.00 TB
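
For anyone tracking these alerts over time, here is a minimal sketch of how the smartd notification above could be parsed. The function name and the returned field names are illustrative assumptions, not part of smartd, and the pattern is only fitted to the single message quoted in this issue.

```python
import re

# Pattern fitted to the smartd notification quoted in this issue.
# The group names are illustrative, not defined by smartd.
SMARTD_LINE = re.compile(
    r"Device: (?P<device>\S+) \[(?P<label>[^\]]+)\], "
    r"Self-Test Log error count increased from (?P<old>\d+) to (?P<new>\d+)"
)


def parse_selftest_alert(line):
    """Return device, label and old/new error counts, or None if no match."""
    match = SMARTD_LINE.search(line)
    if match is None:
        return None
    return {
        "device": match["device"],
        "label": match["label"],
        "old": int(match["old"]),
        "new": int(match["new"]),
    }


alert = parse_selftest_alert(
    "Device: /dev/sg2 [areca_disk#05_enc#01], "
    "Self-Test Log error count increased from 2 to 3"
)
print(alert)
```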

Firefishy avatar Jul 25 '22 00:07 Firefishy

See also #677 about replacing errol completely

grischard avatar Jul 29 '22 16:07 grischard

The controller battery has also failed in the system.

Firefishy avatar Aug 09 '22 14:08 Firefishy

https://gist.github.com/Firefishy/881aed7b6f9e9e4378f75c9256211643

Firefishy avatar Aug 13 '22 00:08 Firefishy

Tasks:

  • [x] 1. Order replacement RAID controller battery Areca arc-6120BA
  • [x] 2. One-off tar backup of all users to norbert.
  • [x] 3. Offline (single user mode) run xfs_repair
  • [x] 4. Manually trigger RAID controller RAID5 patrol read + repair to force remapping of bad sectors (across disks)
  • [ ] 5. Replace RAID controller battery during site visit.
  • [ ] 6. Replace failing disk (2TB spinning rust or entry-level 2TB SSD) during site visit.

Firefishy avatar Aug 13 '22 23:08 Firefishy

Option to manually trigger patrol + repair (via archttp64):

[Screenshot 2022-08-14 at 04 30 36]

Firefishy avatar Aug 14 '22 01:08 Firefishy

tar completed the backup of all users except ant, harrywood and mvexel, for whom limited backups were created because they have millions of tiny cache files, which would have significantly delayed completion.
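
A rough sketch of what a "limited" backup could look like, assuming it means skipping cache directories. The actual tar invocation is not recorded in this issue, and the rule of excluding any path component named `cache` is a hypothetical choice for illustration, as are the function names.

```python
import tarfile
from pathlib import Path


def skip_cache(member):
    """tarfile filter: drop any entry with a path component named 'cache'.

    The directory name 'cache' is an assumption for illustration only.
    """
    if "cache" in Path(member.name).parts:
        return None  # returning None excludes the entry and its children
    return member


def backup_home(home, archive):
    """Write a gzip-compressed tar of `home`, leaving out cache directories."""
    home = Path(home)
    with tarfile.open(archive, "w:gz") as tar:
        tar.add(home, arcname=home.name, filter=skip_cache)
```

Because the filter returns `None` for the cache directory itself, `tarfile` never descends into it, so the millions of small files are not even enumerated.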

Firefishy avatar Aug 14 '22 13:08 Firefishy

The disk is still causing minor issues, but the system crashes are fixed. The Areca controller was not compatible with the Ubuntu 22.04 kernel (5.15); after reverting to the Ubuntu 20.04 kernel (5.13), the crashes have stopped.

Firefishy avatar Sep 04 '22 17:09 Firefishy

Not going to replace the disk. The faffy server is now up and will replace errol shortly.

Firefishy avatar Sep 26 '22 13:09 Firefishy