AMD 7950x3D Memory controller showing dual-rank DDR5 DIMMs as single-rank with half the capacity

Open cloud11665 opened this issue 8 months ago • 13 comments

output of corefreq-cli -k -n -B -n -M

Linux:                                                                          
|- Release                                                    [6.8.0-57-generic]
|- Version         [#59-Ubuntu SMP PREEMPT_DYNAMIC Sat Mar 15 17:40:59 UTC 2025]
|- Machine                                                              [x86_64]
Memory:                                                                         
|- Total RAM                                                         64956696 KB
|- Shared RAM                                                            6084 KB
|- Free RAM                                                          62851540 KB
|- Buffer RAM                                                          342592 KB
|- Total High                                                               0 KB
|- Free High                                                                0 KB
Clock Source                                                  <             tsc>
CPU-Freq driver                                               [  amd-pstate-epp]
Governor                                                      [         Missing]
CPU-Idle driver                                               [       acpi_idle]
|- Idle Limit                                                 [              C3]
   |- State        POLL      C1      C2      C3                                 
   |-           CPUIDLE ACPI FF ACPI IO ACPI IO                                 
   |- Power          -1       0       0       0                                 
   |- Latency         0       1      18     350                                 
   |- Residency       0       2      36     700                                 

[ 0] American Megatrends Inc.                                                   
[ 1] 2904                                                                       
[ 2] 03/04/2025                                                                 
[ 3] ASUS                                                                       
[ 4] System Product Name                                                        
[ 5] System Version                                                             
[ 6] S---e---e---l---m---                                                       
[ 7] SKU                                                                        
[ 8] To be filled by O.E.M.                                                     
[ 9] ASUSTeK COMPUTER INC.                                                      
[10] ProArt X670E-CREATOR WIFI                                                  
[11] Rev 1.xx                                                                   
[12] 2---1---9---7--                                                            
[13] Number Of Devices:4\Maximum Capacity:134217728 kilobytes                   
[14]                                                                            
[15] DIMM 1\P0 CHANNEL A                                                        
[16]                                                                            
[17] DIMM 1\P0 CHANNEL B                                                        
[18]                                                                            
[19] Kingston                                                                   
[20]                                                                            
[21] Kingston                                                                   
[22]                                                                            
[23] KF560C36-32                                                                
[24]                                                                            
[25] KF560C36-32                                                                

                              Zen UMC  [14E0]                              
Controller #0                                                Dual Channel  
 Bus Rate  2800 MHz       Bus Speed 2800 MHz           DDR5 Speed 5600 MT/s
                                                                           
 Cha   CL  RCDr RCDw  RP  RAS   RC  RRDs RRDl FAW  WTRs WTRl  WR  clRR clWW
  #0   36   38   38   38   96  134    4    6   16    4   20   48    5   15 
  #1   36   38   38   38   96  134    4    6   16    4   20   48    5   15 
      CWL  RTP RdWr WrRd scWW sdWW ddWW scRR sdRR ddRR drRR drWW drWR drRRD
  #0   34   22   18    6    1    9    7    1    7    8    0    0    0    0 
  #1   34   22   18    6    1    9    7    1    7    8    0    0    0    0 
      REFI RFC1 RFC2 RFCsb RCPB RPPB BGS:Alt  Ban  Page  CKE  CMD  GDM  ECC
  #0 23354  312  192  407   0    0    ON OFF  R0W0   0    0   1T    ON   0 
  #1 23354  312  192  407   0    0    ON OFF  R0W0   0    0   1T    ON   0 
      MRD:PDA   MOD:PDA  WRMPR STAG PDM RDDATA WRD  WRL  RDL  XS   XP CPDED
  #0   40  32    40  32    24    7 0:F:0   24   6   22   34  852   21   14 
  #1   40  32    40  32    24    7 0:F:0   24   6   22   36  852   21   14 
                                                                           
 DIMM Geometry for channel #0                                              
      Slot Bank Rank     Rows   Columns    Memory Size (MB)                
       #0                                                                  
       #1    32    1     65536      1024          16384         KF560C36-32
 DIMM Geometry for channel #1                                              
      Slot Bank Rank     Rows   Columns    Memory Size (MB)                
       #0                                                                  
       #1    32    1     65536      1024          16384         KF560C36-32

Dumped memory controller config (taken from #511 )

root@pendual:~/Tips/C# # channel 0
./zencli smu 0x50100
[0x00050100] READ(smu) = 0x80000701 (2147485441)
   60   56   52   48   44   40   36   32   28   24   20   16   12   08   04   00
 0000 0000 0000 0000 0000 0000 0000 0000 1000 0000 0000 0000 0000 0111 0000 0001
root@pendual:~/Tips/C# # channel 1
./zencli smu 0x150100
[0x00150100] READ(smu) = 0x80000701 (2147485441)
   60   56   52   48   44   40   36   32   28   24   20   16   12   08   04   00
 0000 0000 0000 0000 0000 0000 0000 0000 1000 0000 0000 0000 0000 0111 0000 0001
root@pendual:~/Tips/C# # channel 2
./zencli smu 0x250100
[0x00250100] READ(smu) = 0xffffffff (4294967295)
   60   56   52   48   44   40   36   32   28   24   20   16   12   08   04   00
 0000 0000 0000 0000 0000 0000 0000 0000 1111 1111 1111 1111 1111 1111 1111 1111
root@pendual:~/Tips/C# # channel 3
./zencli smu 0x350100
[0x00350100] READ(smu) = 0xffffffff (4294967295)
   60   56   52   48   44   40   36   32   28   24   20   16   12   08   04   00
 0000 0000 0000 0000 0000 0000 0000 0000 1111 1111 1111 1111 1111 1111 1111 1111
root@pendual:~/Tips/C# 

another oddity is that lshw -C memory shows the clock as 505MHz

  *-memory
       description: System Memory
       physical id: 13
       slot: System board or motherboard
       size: 64GiB
     *-bank:0
          description: [empty]
          product: Unknown
          vendor: Unknown
          physical id: 0
          serial: Unknown
          slot: DIMM 0
     *-bank:1
          description: DIMM Synchronous Unbuffered (Unregistered) 4800 MHz (0.2 ns)
          product: KF560C36-32
          vendor: Kingston
          physical id: 1
          serial: 3D09639D
          slot: DIMM 1
          size: 32GiB
          width: 64 bits
          clock: 505MHz (2.0ns)
     *-bank:2
          description: [empty]
          product: Unknown
          vendor: Unknown
          physical id: 2
          serial: Unknown
          slot: DIMM 0
     *-bank:3
          description: DIMM Synchronous Unbuffered (Unregistered) 4800 MHz (0.2 ns)
          product: KF560C36-32
          vendor: Kingston
          physical id: 3
          serial: 2F096366
          slot: DIMM 1
          size: 32GiB
          width: 64 bits
          clock: 505MHz (2.0ns)
`dmidecode --type 17` showing correct rank of DIMMs
# dmidecode 3.5
Getting SMBIOS data from sysfs.
SMBIOS 3.6.0 present.
# SMBIOS implementations newer than version 3.5.0 are not
# fully supported by this version of dmidecode.

Handle 0x0016, DMI type 17, 92 bytes
Memory Device
	Array Handle: 0x0013
	Error Information Handle: 0x0015
	Total Width: Unknown
	Data Width: Unknown
	Size: No Module Installed
	Form Factor: Unknown
	Set: None
	Locator: DIMM 0
	Bank Locator: P0 CHANNEL A
	Type: Unknown
	Type Detail: Unknown

Handle 0x0018, DMI type 17, 92 bytes
Memory Device
	Array Handle: 0x0013
	Error Information Handle: 0x0017
	Total Width: 64 bits
	Data Width: 64 bits
	Size: 32 GB
	Form Factor: DIMM
	Set: None
	Locator: DIMM 1
	Bank Locator: P0 CHANNEL A
	Type: DDR5
	Type Detail: Synchronous Unbuffered (Unregistered)
	Speed: 4800 MT/s
	Manufacturer: Kingston
	Serial Number: 3D09639D
	Asset Tag: Not Specified
	Part Number: KF560C36-32
	Rank: 2
	Configured Memory Speed: 5600 MT/s
	Minimum Voltage: 1.1 V
	Maximum Voltage: 1.1 V
	Configured Voltage: 1.1 V
	Memory Technology: DRAM
	Memory Operating Mode Capability: Volatile memory
	Firmware Version:                   
	Module Manufacturer ID: Bank 2, Hex 0x98
	Module Product ID: Unknown
	Memory Subsystem Controller Manufacturer ID: Unknown
	Memory Subsystem Controller Product ID: Unknown
	Non-Volatile Size: None
	Volatile Size: 32 GB
	Cache Size: None
	Logical Size: None

Handle 0x001B, DMI type 17, 92 bytes
Memory Device
	Array Handle: 0x0013
	Error Information Handle: 0x001A
	Total Width: Unknown
	Data Width: Unknown
	Size: No Module Installed
	Form Factor: Unknown
	Set: None
	Locator: DIMM 0
	Bank Locator: P0 CHANNEL B
	Type: Unknown
	Type Detail: Unknown

Handle 0x001D, DMI type 17, 92 bytes
Memory Device
	Array Handle: 0x0013
	Error Information Handle: 0x001C
	Total Width: 64 bits
	Data Width: 64 bits
	Size: 32 GB
	Form Factor: DIMM
	Set: None
	Locator: DIMM 1
	Bank Locator: P0 CHANNEL B
	Type: DDR5
	Type Detail: Synchronous Unbuffered (Unregistered)
	Speed: 4800 MT/s
	Manufacturer: Kingston
	Serial Number: 2F096366
	Asset Tag: Not Specified
	Part Number: KF560C36-32
	Rank: 2
	Configured Memory Speed: 5600 MT/s
	Minimum Voltage: 1.1 V
	Maximum Voltage: 1.1 V
	Configured Voltage: 1.1 V
	Memory Technology: DRAM
	Memory Operating Mode Capability: Volatile memory
	Firmware Version:                   
	Module Manufacturer ID: Bank 2, Hex 0x98
	Module Product ID: Unknown
	Memory Subsystem Controller Manufacturer ID: Unknown
	Memory Subsystem Controller Product ID: Unknown
	Non-Volatile Size: None
	Volatile Size: 32 GB
	Cache Size: None
	Logical Size: None
One more thing that caught my attention was the low memory bandwidth (29895 MiB/s)
root@pendual:~# sysbench memory --memory-block-size=1G --memory-total-size=10G run
sysbench 1.0.20 (using system LuaJIT 2.1.0-beta3)

Running the test with following options:
Number of threads: 1
Initializing random number generator from current time


Running memory speed test with the following options:
  block size: 1048576KiB
  total size: 10240MiB
  operation: write
  scope: global

Initializing worker threads...

Threads started!

Total operations: 10 (   29.20 per second)

10240.00 MiB transferred (29895.94 MiB/sec)


General statistics:
    total time:                          0.3418s
    total number of events:              10

Latency (ms):
         min:                                   34.14
         avg:                                   34.16
         max:                                   34.19
         95th percentile:                       34.33
         sum:                                  341.62

Threads fairness:
    events (avg/stddev):           10.0000/0.00
    execution time (avg/stddev):   0.3416/0.00
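
For context, here is a rough sketch of how the measured figure compares with the theoretical peak of the reported configuration. It assumes 8 bytes (64 data bits) per transfer per channel, which matches the "Data Width: 64 bits" field in the dmidecode output; the function name is made up for this illustration. A single-threaded sysbench write test is not expected to reach the peak, so this only gives a sense of scale.

```python
# Theoretical peak bandwidth for the configuration reported in this issue.
# Assumption: each DDR5 channel moves 8 bytes (64 data bits) per transfer.

MIB = 1024 * 1024


def peak_bandwidth_mib_s(mt_per_s: int, channels: int, bytes_per_transfer: int = 8) -> float:
    """Return the theoretical peak bandwidth in MiB/s."""
    return mt_per_s * 1_000_000 * bytes_per_transfer * channels / MIB


peak = peak_bandwidth_mib_s(5600, channels=2)  # DDR5-5600, dual channel
measured = 29895.94                            # MiB/s, from the sysbench run above

print(f"theoretical peak: {peak:.0f} MiB/s")
print(f"measured share:   {measured / peak:.0%}")
```

With these inputs the single-threaded write lands at roughly 35% of the theoretical peak.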

cloud11665 avatar Apr 14 '25 23:04 cloud11665

Thank you for your report. Until we get specifications and explanations from the manufacturer to decode the UMC registers, this issue is going to last. For Zen gen 1 to 3, CoreFreq gives the DDR4 rank count because reverse-engineering a reserved bit proved statistically useful. That was for DDR4. I consider myself fortunate that the timings are decoded correctly. Reverse-engineering DDR5 is a matter of trying different DIMM topologies (banks, ranks, columns, rows) and, each time, as you did with the SMU, dumping registers and guessing bits. As you are showing, some hexadecimal values like 0xffffffff, 0x12345678 and so on mean nothing here. Zero, however, can be a meaningful value. This is what I would do if I had your hardware and a bunch of DIMMs.
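
The "dump registers and guess bits" approach can be sketched as follows. This is only an illustration, not CoreFreq code: the helper names are hypothetical, 0x80000701 is the value read at 0x50100 in this issue, and the second value is made up to stand in for a dump taken with a different DIMM topology.

```python
# Hypothetical helpers for comparing two dumps of the same 32-bit register.

INVALID = {0xFFFFFFFF}  # all-ones reads carry no information; zero may be valid


def changed_bits(before: int, after: int) -> list[int]:
    """Return the bit positions (0 = LSB) that differ between two reads."""
    diff = (before ^ after) & 0xFFFFFFFF
    return [bit for bit in range(32) if (diff >> bit) & 1]


def nibbles(value: int) -> str:
    """Format a 32-bit value as eight 4-bit groups, MSB first."""
    bits = f"{value & 0xFFFFFFFF:032b}"
    return " ".join(bits[i:i + 4] for i in range(0, 32, 4))


reported = 0x80000701        # read at 0x50100 in this issue
other_topology = 0x80000703  # made-up value, for illustration only

for value in (reported, other_topology, 0xFFFFFFFF):
    note = "ignored" if value in INVALID else "usable"
    print(f"0x{value:08x}  {nibbles(value)}  {note}")

print("bits that changed:", changed_bits(reported, other_topology))
```

Repeating the comparison across several known topologies narrows down which bits track ranks, banks, rows, or columns.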

Thank you for the other tools' results you are posting; I can't comment on what they are doing.
BIOS and BMC screens are a better reference because, like CoreFreq, they aim to show the UMC configuration rather than static SPD data.

Do you have the PPR specification for your processor architecture at hand?

(EDIT: English and link fixed)

cyring avatar Apr 14 '25 23:04 cyring

I'll provide bios screenshots once I boot into it. I'm currently trying to boot 128GiB, but it's hanging on memory training.

Regarding other tools, I think that they are getting it correct by accident as I do not see a codepath for ddr5 in lshw

I do not have any AMD documentation, but I'm willing to learn and help.

cloud11665 avatar Apr 14 '25 23:04 cloud11665

> I'll provide bios screenshots once I boot into it. I'm currently trying to boot 128GiB, but it's hanging on memory training.
>
> Regarding other tools, I think that they are getting it correct by accident as I do not see a codepath for ddr5 in lshw

Are you sure those are SMU/UMC registers? Or is it SPD data retrieved over I2C? I wonder whether that tool will still match once you tweak your 128 GB of DDR5 in the BIOS.

> I do not have any AMD documentation, but I'm willing to learn and help.

Doc Hub: search for the PPR keyword and select your processor family/model. If there is none, another document from the same generation will do.

cyring avatar Apr 15 '25 00:04 cyring

re lshw: it reads the SPD eeprom

cloud11665 avatar Apr 15 '25 00:04 cloud11665

Something went wrong during POST and now the mainboard is not booting even with one stick. I was running 2x32GB at 5600 36-38-38-134 1.4V (the kit is rated for 6000MT/s 36-36-38 1.35V on EXPO1, but doesn't boot with these settings). I then added the other 2 sticks, but it was stuck on memory training for about an hour, so I killed it, and after that it didn't POST even with a single stick. I'll re-flash the BIOS and start from there.

Are there any particular settings in the bios you'd want me to check if they are reflected in the memory controller view?

cloud11665 avatar Apr 15 '25 12:04 cloud11665

> Something went wrong during POST and now the mainboard is not booting even with one stick. I was running 2x32GB at 5600 36-38-38-134 1.4V (the kit is rated for 6000MT/s 36-36-38 1.35V on EXPO1, but doesn't boot with these settings). I then added the other 2 sticks, but it was stuck on memory training for about an hour, so I killed it, and after that it didn't POST even with a single stick. I'll re-flash the BIOS and start from there.
>
> Are there any particular settings in the bios you'd want me to check if they are reflected in the memory controller view?

No hurry. Try stabilizing your DRAM first ;)

FYI, most of my DIMM problems came from insufficient DIMM voltage.

cyring avatar Apr 15 '25 12:04 cyring

okay, after reflashing the BIOS and clearing CMOS, here are the default DRAM timings as reported in BIOS:

[Images: three BIOS screenshots of the DRAM timings]

transcribed:
-- primary --
Tcl: 40
Trcd: 39
Trp: 39
Tras: 77
-- secondary --
Trc: 116
Twr: 72
Refresh Interval: 9347
Trfc1: 708
Trfc2: 384
Trfcsb: 312
Trtp: 18
TrrdL: 12
TrrdS: 8
Tfaw: 32
TwtrL: 24
TwtrS: 6
TrdrdScl: 5
TrdrdSd: 9
Trdrddd: 9
TwrwrScl: 17
TwrwrSc: 1
TwrwrSd: 7
TwrwrDd: 7
Twrrd: 7
Trdwr: 16

root@pendual:~/CoreFreq/build# corefreq-cli -M
                              Zen UMC  [14E0]                              
Controller #0                                               Single Channel 
 Bus Rate  2400 MHz       Bus Speed 2395 MHz           DDR5 Speed 4790 MT/s
                                                                           
 Cha   CL  RCDr RCDw  RP  RAS   RC  RRDs RRDl FAW  WTRs WTRl  WR  clRR clWW
  #0   40   39   39   39   77  116    8   12   32    6   24   72    5   17 
      CWL  RTP RdWr WrRd scWW sdWW ddWW scRR sdRR ddRR drRR drWW drWR drRRD
  #0   38   18   16    7    1    7    7    1    9    9    0    0    0    0 
      REFI RFC1 RFC2 RFCsb RCPB RPPB BGS:Alt  Ban  Page  CKE  CMD  GDM  ECC
  #0  9347  312  192  312   0    0    ON OFF  R0W0   0    0   1T    ON   0 
      MRD:PDA   MOD:PDA  WRMPR STAG PDM RDDATA WRD  WRL  RDL  XS   XP CPDED
  #0   34  32    34  32    24    7 0:F:1   28   6   26   34  732   18   12 
                                                                           
 DIMM Geometry for channel #0                                              
      Slot Bank Rank     Rows   Columns    Memory Size (MB)                
       #0                                                                  
       #1    32    1     65536      1024          16384         KF560C36-32

cloud11665 avatar Apr 15 '25 14:04 cloud11665

> okay, after reflashing the BIOS and clearing CMOS, here are the default DRAM timings as reported in BIOS:

Thank you. Can you confirm that you have two DIMMs installed but CoreFreq reports only one? Is no other UMC controller listed?

At first look, tRFC needs a fix: perhaps the value should be doubled. I don't know the register rules.
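
To make the tRFC observation concrete, here is a small sketch that compares the BIOS transcription with the single-DIMM `corefreq-cli -M` output posted above. All numbers are copied from this thread; the dictionary keys loosely follow the CoreFreq column headers (RCDr and RCDw are merged into "RCD"), and the function name is invented for this example.

```python
# BIOS screen transcription vs. `corefreq-cli -M` (single-DIMM run in this thread).
bios = {"CL": 40, "RCD": 39, "RP": 39, "RAS": 77, "RC": 116, "WR": 72,
        "REFI": 9347, "RFC1": 708, "RFC2": 384, "RFCsb": 312, "RTP": 18,
        "RRDl": 12, "RRDs": 8, "FAW": 32, "WTRl": 24, "WTRs": 6}
corefreq = {"CL": 40, "RCD": 39, "RP": 39, "RAS": 77, "RC": 116, "WR": 72,
            "REFI": 9347, "RFC1": 312, "RFC2": 192, "RFCsb": 312, "RTP": 18,
            "RRDl": 12, "RRDs": 8, "FAW": 32, "WTRl": 24, "WTRs": 6}


def mismatches(reference: dict, decoded: dict) -> dict:
    """Map each differing timing to (reference, decoded, reference/decoded)."""
    return {
        name: (ref, decoded[name], ref / decoded[name])
        for name, ref in reference.items()
        if name in decoded and decoded[name] != ref
    }


for name, (ref, dec, ratio) in mismatches(bios, corefreq).items():
    print(f"{name}: BIOS={ref} CoreFreq={dec} ratio={ratio:.2f}")
```

Only RFC1 and RFC2 differ. RFC2 is off by exactly a factor of two, while RFC1 is not (708 / 312 is about 2.27), so a plain doubling would not explain both fields.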

cyring avatar Apr 15 '25 15:04 cyring

It was tested only with a single DIMM, my bad.

here is output with 2 DIMMs

                              Zen UMC  [14E0]                              
Controller #0                                                Dual Channel  
 Bus Rate  1800 MHz       Bus Speed 1800 MHz           DDR5 Speed 3600 MT/s
                                                                           
 Cha   CL  RCDr RCDw  RP  RAS   RC  RRDs RRDl FAW  WTRs WTRl  WR  clRR clWW
  #0   30   29   29   29   58   87    8    9   32    5   18   54    2   11 
  #1   30   29   29   29   58   87    8    9   32    5   18   54    2   11 
      CWL  RTP RdWr WrRd scWW sdWW ddWW scRR sdRR ddRR drRR drWW drWR drRRD
  #0   28   14   15    6    1    7    7    1    8    8    0    0    0    0 
  #1   28   14   15    6    1    7    7    1    8    8    0    0    0    0 
      REFI RFC1 RFC2 RFCsb RCPB RPPB BGS:Alt  Ban  Page  CKE  CMD  GDM  ECC
  #0  7006  312  192  234   0    0    ON OFF  R0W0   0    0   1T    ON   0 
  #1  7006  312  192  234   0    0    ON OFF  R0W0   0    0   1T    ON   0 
      MRD:PDA   MOD:PDA  WRMPR STAG PDM RDDATA WRD  WRL  RDL  XS   XP CPDED
  #0   26  32    26  32    24    7 0:F:0   18   6   16   32  548   14    9 
  #1   26  32    26  32    24    7 0:F:0   18   6   16   32  548   14    9 
                                                                           
 DIMM Geometry for channel #0                                              
      Slot Bank Rank     Rows   Columns    Memory Size (MB)                
       #0                                                                  
       #1    32    1     65536      1024          16384         KF560C36-32
 DIMM Geometry for channel #1                                              
      Slot Bank Rank     Rows   Columns    Memory Size (MB)                
       #0                                                                  
       #1    32    1     65536      1024          16384         KF560C36-32

also I am finally starting to get good numbers wrt. bandwidth! 16384.00 MiB transferred (36872.15 MiB/sec)

cloud11665 avatar Apr 15 '25 16:04 cloud11665

> It was tested only with a single DIMM, my bad.

Not that bad: it has shown CoreFreq can detect a single DIMM.

cyring avatar Apr 15 '25 17:04 cyring

Hello,

Can you please pull the latest commit of CoreFreq and post the Memory Controller output? This should be version 2.0.5.

cyring avatar May 22 '25 06:05 cyring

                              Zen UMC  [14E0]                              
Controller #0                                                Dual Channel  
 Bus Rate  2800 MHz       Bus Speed 2800 MHz           DDR5 Speed 5600 MT/s
                                                                           
 Cha   CL  RCDr RCDw  RP  RAS   RC  RRDs RRDl FAW  WTRs WTRl  WR  clRR clWW
  #0   36   38   38   38   80  118    8   14   32    7   28   84    7   21 
  #1   36   38   38   38   80  118    8   14   32    7   28   84    7   21 
      CWL  RTP RdWr WrRd scWW sdWW ddWW scRR sdRR ddRR drRR drWW drWR drRRD
  #0   34   21   19    8    1    9    9    1    9    9    0    0    0    0 
  #1   34   21   20    8    1    9    9    1    9    9    0    0    0    0 
      REFI RFC1 RFC2 RFCsb RCPB RPPB BGS:Alt  Ban  Page  CKE  CMD  GDM  ECC
  #0 10892  312  192  390   0    0    ON OFF  R0W0   0    0   1T    ON   0 
  #1 10892  312  192  390   0    0    ON OFF  R0W0   0    0   1T    ON   0 
      MRD:PDA   MOD:PDA  WRMPR STAG PDM RDDATA WRD  WRL  RDL  XS   XP CPDED
  #0   40  32    40  32    24    7 0:F:0   24   6   22   34  852   21   14 
  #1   40  32    40  32    24    7 0:F:0   24   6   22   36  852   21   14 
                                                                           
 DIMM Geometry for channel #0                                              
      Slot Bank Rank     Rows   Columns    Memory Size (MB)                
       #0                                                                  
       #1    32    2     65536      1024          32768         KF560C36-32
 DIMM Geometry for channel #1                                              
      Slot Bank Rank     Rows   Columns    Memory Size (MB)                
       #0                                                                  
       #1    32    2     65536      1024          32768         KF560C36-32
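
The geometry columns are consistent with a simple product, which also shows why a miscounted rank halves the reported capacity. The sketch below is an assumption about how the numbers relate, not CoreFreq's actual code: it takes 8 bytes per column access (the 64-bit data width reported by dmidecode), and the function name is hypothetical.

```python
def dimm_size_mib(banks: int, ranks: int, rows: int, columns: int,
                  bytes_per_column: int = 8) -> int:
    """Capacity in MiB from the DIMM geometry (assumes a 64-bit data bus)."""
    return banks * ranks * rows * columns * bytes_per_column // (1024 * 1024)


# Geometry as first reported (rank miscounted as 1), then as reported by 2.0.5.
print(dimm_size_mib(banks=32, ranks=1, rows=65536, columns=1024))  # 16384
print(dimm_size_mib(banks=32, ranks=2, rows=65536, columns=1024))  # 32768
```

Both results match the "Memory Size (MB)" column in the outputs posted before and after the fix.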

working perfectly!

cloud11665 avatar May 26 '25 20:05 cloud11665

> working perfectly!

Great!

  • I found the KF560C36-32 datasheet, and it appears we have a match with XMP Profile #2: DDR5-5600 CL36-38-38 https://www.kingston.com/datasheets/KF560C36BBEK2-32.pdf

  • It also took me a while to understand how to decode the rank count from the enabled chip selects. Unfortunately, a regression was recently encountered on the 9950X: output at https://github.com/cyring/CoreFreq/discussions/544#discussioncomment-13255995

  • I noticed there is no page for your 7950x3D in the Wiki/CPU support. I would appreciate it if you could post various screenshots in different load cases and the CLI output of corefreq-cli -s -n -m -n -k -n -M -n -C 1; or you can create your own public page, and I will add its link to the Wiki.

Thank you

cyring avatar May 26 '25 22:05 cyring