rasdaemon: ras-mc-ctl --error-count prints its rows in random order

Open walterav1984 opened this issue 1 year ago • 0 comments

I have noticed this random ordering of DIMM numbers and channel/riser locations for a while, and verified that git master (commit f9cb13b of 2024-02-05) shows the same behavior. I am not sure whether it's a bug or a feature, but it happens on all tested machines (DQ57TM and MacPro 1,1, 2,1 and 3,1) running the repository version 0.68 from Debian 12 / Ubuntu 23.10 as well as f9cb13b.

It happens regardless of whether labels are used and/or registered.

$ sudo ras-mc-ctl --error-count #1
Label   	CE	UE
DIMM2_RA	0	0
DIMM2_RB	0	0
DIMM1_RB	0	0
DIMM1_RA	0	0
DIMM4_RA	0	0
DIMM4_RB	0	0
DIMM3_RB	0	0
DIMM3_RA	0	0
$ sudo ras-mc-ctl --error-count #2
Label   	CE	UE
DIMM1_RB	0	0
DIMM4_RA	0	0
DIMM1_RA	0	0
DIMM4_RB	0	0
DIMM3_RA	0	0
DIMM2_RB	0	0
DIMM3_RB	0	0
DIMM2_RA	0	0
$ sudo ras-mc-ctl --error-count #3
Label   	CE	UE
DIMM3_RA	0	0
DIMM1_RB	0	0
DIMM2_RB	0	0
DIMM4_RA	0	0
DIMM1_RA	0	0
DIMM3_RB	0	0
DIMM4_RB	0	0
DIMM2_RA	0	0
$ sudo ras-mc-ctl --error-count #4
Label   	CE	UE
DIMM2_RA	0	0
DIMM4_RB	0	0
DIMM3_RA	0	0
DIMM1_RB	0	0
DIMM3_RB	0	0
DIMM1_RA	0	0
DIMM2_RB	0	0
DIMM4_RA	0	0

$ sudo ras-mc-ctl --error-count | sort
DIMM1_RA	0	0
DIMM1_RB	0	0
DIMM2_RA	0	0
DIMM2_RB	0	0
DIMM3_RA	0	0
DIMM3_RB	0	0
DIMM4_RA	0	0
DIMM4_RB	0	0
Label   	CE	UE
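
As the plain `| sort` above shows, sorting the whole output pushes the `Label CE UE` header to the bottom. A possible workaround, sketched here with a hypothetical helper named `sort_keep_header` (not part of rasdaemon), is to pass the first line through untouched and sort only the remaining rows:

```shell
# Hypothetical helper, not part of rasdaemon: keep the first input line
# (the header) in place and sort every following line.
sort_keep_header() {
    # Read the header line as-is; do nothing if the input is empty.
    IFS= read -r header || return 0
    printf '%s\n' "$header"
    # Sort the remaining rows bytewise so the order does not depend on locale.
    LC_ALL=C sort
}

# Intended usage (assumes ras-mc-ctl is installed):
#   sudo ras-mc-ctl --error-count | sort_keep_header
```

This only stabilizes the displayed order; it does not change what ras-mc-ctl itself emits.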

$ sudo ras-mc-ctl --error-count #1
Label                      	CE	UE
mc#0branch#0channel#1slot#0	0	0
mc#0branch#1channel#0slot#0	0	0
mc#0branch#0channel#0slot#1	0	0
mc#0branch#1channel#1slot#1	0	0
mc#0branch#1channel#1slot#0	0	0
mc#0branch#0channel#0slot#0	0	0
mc#0branch#1channel#0slot#1	0	0
mc#0branch#0channel#1slot#1	0	0

$ sudo ras-mc-ctl --error-count #2
Label                      	CE	UE
mc#0branch#1channel#0slot#1	0	0
mc#0branch#0channel#0slot#0	0	0
mc#0branch#0channel#0slot#1	0	0
mc#0branch#1channel#0slot#0	0	0
mc#0branch#1channel#1slot#1	0	0
mc#0branch#0channel#1slot#0	0	0
mc#0branch#0channel#1slot#1	0	0
mc#0branch#1channel#1slot#0	0	0

$ sudo ras-mc-ctl --error-count #3
Label                      	CE	UE
mc#0branch#0channel#1slot#1	0	0
mc#0branch#1channel#0slot#0	0	0
mc#0branch#0channel#0slot#1	0	0
mc#0branch#1channel#1slot#0	0	0
mc#0branch#1channel#1slot#1	0	0
mc#0branch#0channel#0slot#0	0	0
mc#0branch#1channel#0slot#1	0	0
mc#0branch#0channel#1slot#0	0	0

$ sudo ras-mc-ctl --error-count #4
Label                      	CE	UE
mc#0branch#0channel#0slot#0	0	0
mc#0branch#1channel#0slot#0	0	0
mc#0branch#1channel#1slot#0	0	0
mc#0branch#0channel#1slot#0	0	0
mc#0branch#1channel#0slot#1	0	0
mc#0branch#0channel#0slot#1	0	0
mc#0branch#0channel#1slot#1	0	0
mc#0branch#1channel#1slot#1	0	0

By comparison, --guess-labels and --print-labels each use their own distinct but fixed ordering:

$ sudo ras-mc-ctl --guess-labels
memory stick 'DIMM 1' is located at 'DIMM Riser A'
memory stick 'DIMM 2' is located at 'DIMM Riser A'
memory stick 'DIMM 1' is located at 'DIMM Riser B'
memory stick 'DIMM 2' is located at 'DIMM Riser B'
memory stick 'DIMM 3' is located at 'DIMM Riser A'
memory stick 'DIMM 4' is located at 'DIMM Riser A'
memory stick 'DIMM 3' is located at 'DIMM Riser B'
memory stick 'DIMM 4' is located at 'DIMM Riser B'

$ sudo ras-mc-ctl --print-labels #edited labels but not registered yet
LOCATION                            CONFIGURED LABEL     SYSFS CONTENTS      
mc0 branch 0 channel 0 slot 0       DIMM1_RA             mc#0branch#0channel#0slot#0
mc0 branch 0 channel 0 slot 1       DIMM3_RA             mc#0branch#0channel#0slot#1
mc0 branch 0 channel 1 slot 0       DIMM2_RA             mc#0branch#0channel#1slot#0
mc0 branch 0 channel 1 slot 1       DIMM4_RA             mc#0branch#0channel#1slot#1
mc0 branch 1 channel 0 slot 0       DIMM1_RB             mc#0branch#1channel#0slot#0
mc0 branch 1 channel 0 slot 1       DIMM3_RB             mc#0branch#1channel#0slot#1
mc0 branch 1 channel 1 slot 0       DIMM2_RB             mc#0branch#1channel#1slot#0
mc0 branch 1 channel 1 slot 1       DIMM4_RB             mc#0branch#1channel#1slot#1

$ sudo ras-mc-ctl --register-labels
$ sudo ras-mc-ctl --print-labels
LOCATION                            CONFIGURED LABEL     SYSFS CONTENTS      
mc0 branch 0 channel 0 slot 0       DIMM1_RA             DIMM1_RA            
mc0 branch 0 channel 0 slot 1       DIMM3_RA             DIMM3_RA            
mc0 branch 0 channel 1 slot 0       DIMM2_RA             DIMM2_RA            
mc0 branch 0 channel 1 slot 1       DIMM4_RA             DIMM4_RA            
mc0 branch 1 channel 0 slot 0       DIMM1_RB             DIMM1_RB            
mc0 branch 1 channel 0 slot 1       DIMM3_RB             DIMM3_RB            
mc0 branch 1 channel 1 slot 0       DIMM2_RB             DIMM2_RB            
mc0 branch 1 channel 1 slot 1       DIMM4_RB             DIMM4_RB

walterav1984, Feb 21 '24 11:02