
Spike-08 has potentially faulty PSUs

Open Firefishy opened this issue 1 year ago • 6 comments

The 500W PSUs in spike-08 appear to be faulty: under extreme load spikes the PSUs fail.

Bay  Present  Status        PDS  Hotplug  Model       Spare       Serial Number   Capacity   Firmware
1    OK       Good, In Use  No   Yes      720478-B21  754377-001  5DLUT0C8J7R4JT  500 Watts  1.02
2    OK       Good, In Use  No   Yes      720478-B21  754377-001  5DLUT0C8J7Q4T6  500 Watts  1.02

0139 Critical  18:50 07/19/2022  18:50 07/19/2022  0001
LOG: Server Critical Fault (Service Information: Runtime Fault, System Board, P12V Main/AUX Regulator 1 (04h))

New 865408-B21 PSUs are known-good replacements.

Disabling High Efficiency Mode reduces the occurrence of the issue.

Firefishy avatar Jul 19 '22 21:07 Firefishy

PSUs with an 8J serial and firmware 2.00 are known good, but I think a G10 is required to update the firmware. A G9 is unable to update the firmware.

From experience, PSUs with an 8J serial and firmware 1.02 are bad. Running prime95 will cause the P12V Main/AUX Regulator issue within a few minutes.
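
A minimal sketch of the rule of thumb above, for flagging suspect PSUs from an inventory listing. It assumes that "8J serial" means the serial number contains the characters `8J`, which is a guess from the two serials in this issue; the `Psu` class and `is_suspect` function are hypothetical names, not part of any HPE tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Psu:
    """One row of the PSU inventory (only the fields the check needs)."""
    bay: int
    serial: str
    firmware: str


def is_suspect(psu: Psu, serial_marker: str = "8J", bad_firmware: str = "1.02") -> bool:
    # Assumption: an "8J serial" is one containing the marker anywhere in the string.
    return serial_marker in psu.serial and psu.firmware == bad_firmware


# The two PSUs reported in this issue.
psus = [
    Psu(bay=1, serial="5DLUT0C8J7R4JT", firmware="1.02"),
    Psu(bay=2, serial="5DLUT0C8J7Q4T6", firmware="1.02"),
]

for psu in psus:
    verdict = "suspect" if is_suspect(psu) else "ok"
    print(f"bay {psu.bay}: {psu.serial} fw {psu.firmware} -> {verdict}")
```

With the inventory from this issue, both bays are flagged. The same serials on firmware 2.00 would pass the check.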

Firefishy avatar Jul 19 '22 21:07 Firefishy

Related: https://github.com/openstreetmap/operations/issues/688

Firefishy avatar Jul 20 '22 10:07 Firefishy

The closest HPE document on the issue: https://support.hpe.com/hpesc/public/docDisplay?docId=a00050474en_us

Note that, from experience, an 8J serial with firmware 1.02 is still not good. Firmware 2.00 is OK.

Firefishy avatar Jul 20 '22 10:07 Firefishy

I have set the PSUs to Balanced mode instead of High Efficiency mode, which reduces the likelihood of a PSU 12V trip.

Firefishy avatar Jul 25 '22 22:07 Firefishy

spike-08 appears to have tripped only twice due to the PSU 12V issue, so thankfully it is quite rare.

Firefishy avatar Jul 27 '22 12:07 Firefishy

Replacement PSUs ordered.

Firefishy avatar Aug 30 '22 11:08 Firefishy

Swapped out.

Firefishy avatar Oct 18 '22 19:10 Firefishy