pqos does not work on AMD Zen2/Rome parts after v23.08

Open iximeow opened this issue 1 year ago • 1 comments

building and running pqos on a Rome CPU (EPYC 7662 in particular) results in:

DEBUG: Detected core 123, socket 0, NUMAnode 14, L2 ID 59, L3 ID 14, APICID 119
DEBUG: Detected core 124, socket 0, NUMAnode 15, L2 ID 60, L3 ID 15, APICID 121
DEBUG: Detected core 125, socket 0, NUMAnode 15, L2 ID 61, L3 ID 15, APICID 123
DEBUG: Detected core 126, socket 0, NUMAnode 15, L2 ID 62, L3 ID 15, APICID 125
DEBUG: Detected core 127, socket 0, NUMAnode 15, L2 ID 63, L3 ID 15, APICID 127
ERROR: RDMSR failed for reg[0xca0] on lcore 0
ERROR: Error reading SNC information!
ERROR: Error encounter in monitoring discovery!
ERROR: discover_capabilities() error 1
Error initializing PQoS library!

for v23.08 i can build and run pqos without issue; v23.11 and later (including current master) yield the above. this seems to have been the case since SNC support was added: these cores do not support the PQOS_MSR_SNC_CFG MSR, so the attempt to discover SNC support by reading MSR 0xCA0 via msr_read in hw_cap_mon_snc_state fails (linux returns EIO for the pread64).

that error causes an early return with PQOS_RETVAL_ERROR, which the caller hw_cap_mon_discover reports as "Error reading SNC information!", and the failure propagates upward until pqos exits.

i'd offer a patch, but i'm not immediately sure how to recover in the face of missing SNC support. for now i'm proceeding with v23.08 to work around this.
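
one possible direction for recovery, sketched below under stated assumptions: treat an EIO from the MSR read as "SNC not supported" and let discovery continue, while still propagating any other error. this is not the library's code. `probe_snc`, `struct snc_probe`, `msr_reader_fn` and the `fake_reader_*` stand-ins are hypothetical names invented for illustration; only the register number 0xCA0 and the EIO behaviour come from the report above. the sketch deliberately does not interpret the register bits.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Register number taken from the error log above. */
#define SNC_CFG_MSR 0xCA0u

/* Hypothetical result type: whether the MSR exists, plus its raw value. */
struct snc_probe {
        int supported; /* 1 if the MSR could be read, 0 otherwise */
        uint64_t raw;  /* raw register contents, valid if supported */
};

/* Hypothetical reader callback: returns 0 on success, -errno on failure. */
typedef int (*msr_reader_fn)(unsigned lcore, uint32_t reg, uint64_t *value);

/*
 * Probe the SNC config MSR without failing discovery when it is absent.
 * -EIO is mapped to "not supported"; every other error is returned as-is.
 */
int probe_snc(msr_reader_fn read_msr, unsigned lcore, struct snc_probe *out)
{
        uint64_t value = 0;
        int ret;

        assert(read_msr != NULL && out != NULL);
        out->supported = 0;
        out->raw = 0;

        ret = read_msr(lcore, SNC_CFG_MSR, &value);
        if (ret == -EIO)
                return 0; /* MSR not implemented: carry on without SNC */
        if (ret != 0)
                return ret; /* e.g. permission problems stay fatal */

        out->supported = 1;
        out->raw = value;
        return 0;
}

/* Stand-in readers used only to exercise the sketch. */
int fake_reader_eio(unsigned lcore, uint32_t reg, uint64_t *value)
{
        (void)lcore; (void)reg; (void)value;
        return -EIO; /* mimics the failing read on the Rome part */
}

int fake_reader_ok(unsigned lcore, uint32_t reg, uint64_t *value)
{
        (void)lcore; (void)reg;
        *value = 0x1;
        return 0;
}

int fake_reader_eacces(unsigned lcore, uint32_t reg, uint64_t *value)
{
        (void)lcore; (void)reg; (void)value;
        return -EACCES;
}
```

the trade-off is that EIO can also signal a genuine read fault, so a real fix might prefer to check the CPU vendor or a CPUID feature bit before touching the register at all; that choice is left open here.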

iximeow, May 10 '24 21:05

Hi @iximeow, please provide a patch. We will look into it. Thanks.

rkanagar, Jun 13 '24 10:06