BenchmarkDotNet
Unable to read some hardware counters on ZEN2 CPU
I'm having trouble reading 'CacheMisses' and 'InstructionsRetired' whenever I run the dotnet/performance benchmark suite. If I run the following command from within the src\benchmarks\micro directory:
dotnet run -c Release -f netcoreapp5.0 --filter '*Bilinear*' --counters CacheMisses InstructionsRetired
I get the following error message:
// Validating benchmarks:
The counter CacheMisses is not available. Please make sure you are Windows 8+ without Hyper-V
The counter InstructionRetired is not available. Please make sure you are Windows 8+ without Hyper-V
I am able to get some counters to read by enabling IBS in my BIOS (technically wasn't called IBS, I had to disable SVM to get it to work). So the following command does work for me:
dotnet run -c Release -f netcoreapp5.0 --filter '*Bilinear*' --counters BranchMispredictions
If I profile a C# application outside of BenchmarkDotNet using AMD uProf, I can get cache miss and instructions retired statistics. I am also able to read all of the counters on my Skylake machine using the BenchmarkDotNet CLI without any issues.
Would appreciate any help that can be provided!
Hi @alexcovington
BenchmarkDotNet is using TraceEvent, which internally uses ETW to get hardware counter information. I am afraid that there is an AMD-specific bug somewhere.
Could you please run the following command and share your output here?
tracelog.exe -profilesources Help
tracelog.exe might not be present in your $PATH, so you can use the Visual Studio Command Prompt to run this command.
Hi @adamsitnik, sorry for the wait on this. With COVID, I don't have physical access to this machine all the time, so I wasn't able to work on this till now.
Here's the output from tracelog running in admin Developer Command Prompt for VS 2019:
C:\Windows\System32>tracelog -profilesources Help
Id Name Interval Min Max
--------------------------------------------------------------
0 Timer 10000 1221 1000000
2 TotalIssues 65536 4096 2147483647
6 BranchInstructions 65536 4096 2147483647
8 DcacheMisses 65536 4096 2147483647
9 IcacheMisses 65536 4096 2147483647
11 BranchMispredictions 65536 4096 2147483647
13 FpInstructions 65536 4096 2147483647
20 IcacheIssues 65536 4096 2147483647
21 DcacheAccesses 65536 4096 2147483647
25 FPDispatchedFPUOps 65536 4096 2147483647
26 FPDispatchedFPUOpsAddExcludeJunk 65536 4096 2147483647
27 FPDispatchedFPUOpsMulExcludeJunk 65536 4096 2147483647
28 FPDispatchedFPUOpsStoreExcludeJunk 65536 4096 2147483647
29 FPDispatchedFPUOpsAddJunk 65536 4096 2147483647
30 FPDispatchedFPUOpsMulJunk 65536 4096 2147483647
31 FPDispatchedFPUOpsStoreJunk 65536 4096 2147483647
32 FPCyclesNoFPUOpsRetired 65536 4096 2147483647
33 FPDispathedFPUOpsWithFastFlag 65536 4096 2147483647
34 LSSegmentRegisterLoad 65536 4096 2147483647
35 LSSegmentRegisterLoadES 65536 4096 2147483647
36 LSSegmentRegisterLoadCS 65536 4096 2147483647
37 LSSegmentRegisterLoadSS 65536 4096 2147483647
38 LSSegmentRegisterLoadDS 65536 4096 2147483647
39 LSSegmentRegisterLoadFS 65536 4096 2147483647
40 LSSegmentRegisterLoadGS 65536 4096 2147483647
41 LSSegmentRegisterLoadHS 65536 4096 2147483647
42 LSResyncBySelfModifyingCode 65536 4096 2147483647
43 LSResyncBySnoop 65536 4096 2147483647
44 LSBuffer2Full 65536 4096 2147483647
45 LSLockedOperation 65536 4096 2147483647
46 LSLateCancelOperation 65536 4096 2147483647
47 LSRetiredCFLUSH 65536 4096 2147483647
48 LSRetiredCPUID 65536 4096 2147483647
49 DCAccess 65536 4096 2147483647
50 DCMiss 65536 4096 2147483647
51 DCRefillFromL2 65536 4096 2147483647
52 DCRefillFromL2Invalid 65536 4096 2147483647
53 DCRefillFromL2Shared 65536 4096 2147483647
54 DCRefillFromL2Exclusive 65536 4096 2147483647
55 DCRefillFromL2Owner 65536 4096 2147483647
56 DCRefillFromL2Modified 65536 4096 2147483647
57 DCRefillFromSystem 65536 4096 2147483647
58 DCRefillFromSystemInvalid 65536 4096 2147483647
59 DCRefillFromSystemShared 65536 4096 2147483647
60 DCRefillFromSystemExclusive 65536 4096 2147483647
61 DCRefillFromSystemOwner 65536 4096 2147483647
62 DCRefillFromSystemModified 65536 4096 2147483647
63 DCRefillCopyBack 65536 4096 2147483647
64 DCRefillCopyBackInvalid 65536 4096 2147483647
65 DCRefillCopyBackShared 65536 4096 2147483647
66 DCRefillCopyBackExclusive 65536 4096 2147483647
67 DCRefillCopyBackOwner 65536 4096 2147483647
68 DCRefillCopyBackModified 65536 4096 2147483647
69 DCL1DTLBMissL2DTLBHit 65536 4096 2147483647
70 DCL1DTLBMissL2DTLBMiss 65536 4096 2147483647
71 DCMisalignedDataReference 65536 4096 2147483647
72 DCLateCancelOfAnAccess 65536 4096 2147483647
73 DCEarlyCancelOfAnAccess 65536 4096 2147483647
74 DCOneBitECCError 65536 4096 2147483647
75 DCOneBitECCErrorScrubberError 65536 4096 2147483647
76 DCOneBitECCErrorPiggybackScrubberError 65536 4096 2147483647
77 DCDispatchedPrefetchInstructions 65536 4096 2147483647
78 DCDispatchedPrefetchInstructionsLoad 65536 4096 2147483647
79 DCDispatchedPrefetchInstructionsStore 65536 4096 2147483647
80 DCDispatchedPrefetchInstructionsNTA 65536 4096 2147483647
190 BUCleanToDirty 65536 4096 2147483647
191 BUSharedToDirty 65536 4096 2147483647
81 BUInternalL2Request 65536 4096 2147483647
82 BUInternalL2RequestICFill 65536 4096 2147483647
83 BUInternalL2RequestDCFill 65536 4096 2147483647
84 BUInternalL2RequestTLBReload 65536 4096 2147483647
85 BUInternalL2RequestTagSnoopRequest 65536 4096 2147483647
86 BUInternalL2RequestCancelledRequest 65536 4096 2147483647
87 BUFillRequestMissedInL2 65536 4096 2147483647
88 BUFillRequestMissedInL2ICFill 65536 4096 2147483647
89 BUFillRequestMissedInL2DCFill 65536 4096 2147483647
90 BUFillRequestMissedInL2TLBLoad 65536 4096 2147483647
91 BUFillIntoL2 65536 4096 2147483647
92 BUFillIntoL2DirtyL2Victim 65536 4096 2147483647
93 BUFillIntoL2VictimFromL1 65536 4096 2147483647
94 ICFetch 65536 4096 2147483647
95 ICMiss 65536 4096 2147483647
96 ICRefillFromL2 65536 4096 2147483647
97 ICRefillFromSystem 65536 4096 2147483647
98 ICL1TLBMissL2TLBHit 65536 4096 2147483647
99 ICL1TLBMissL2TLBMiss 65536 4096 2147483647
100 ICResyncBySnoop 65536 4096 2147483647
101 ICInstructionFetchStall 65536 4096 2147483647
102 ICReturnStackHit 65536 4096 2147483647
103 ICReturnStackOverflow 65536 4096 2147483647
104 FRRetiredx86Instructions 65536 4096 2147483647
105 FRRetireduops 65536 4096 2147483647
106 FRRetiredBranches 65536 4096 2147483647
107 FRRetiredBranchesMispredicted 65536 4096 2147483647
108 FRRetiredTakenBranches 65536 4096 2147483647
109 FRRetiredTakenBranchesMispredicted 65536 4096 2147483647
110 FRRetiredFarControlTransfers 65536 4096 2147483647
111 FRRetiredResyncsNonControlTransferBranches 65536 4096 2147483647
112 FRRetiredNearReturns 65536 4096 2147483647
113 FRRetiredNearReturnsMispredicted 65536 4096 2147483647
114 FRRetiredTakenBranchesMispredictedByAddressMiscompare 65536 4096 2147483647
115 FRRetiredFPUInstructions 65536 4096 2147483647
116 FRRetiredFPUInstructionsx87 65536 4096 2147483647
117 FRRetiredFPUInstructionsMMXAnd3DNow 65536 4096 2147483647
118 FRRetiredFPUInstructionsPackedSSEAndSSE2 65536 4096 2147483647
119 FRRetiredFPUInstructionsScalarSSEAndSSE2 65536 4096 2147483647
120 FRRetiredFastpathDoubleOpInstructions 65536 4096 2147483647
121 FRRetiredFastpathDoubleOpInstructionsLowOpInPosition0 65536 4096 2147483647
122 FRRetiredFastpathDoubleOpInstructionsLowOpInPosition1 65536 4096 2147483647
123 FRRetiredFastpathDoubleOpInstructionsLowOpInPosition2 65536 4096 2147483647
124 FRInterruptsMaskedCycles 65536 4096 2147483647
125 FRInterruptsMaskedWhilePendingCycles 65536 4096 2147483647
126 FRTakenHardwareInterrupts 65536 4096 2147483647
127 FRNothingToDispatch 65536 4096 2147483647
128 FRDispatchStalls 65536 4096 2147483647
129 FRDispatchStallsFromBranchAbortToRetire 65536 4096 2147483647
130 FRDispatchStallsForSerialization 65536 4096 2147483647
131 FRDispatchStallsForSegmentLoad 65536 4096 2147483647
132 FRDispatchStallsWhenReorderBufferFull 65536 4096 2147483647
133 FRDispatchStallsWhenReservationStationsFull 65536 4096 2147483647
134 FRDispatchStallsWhenFPUFull 65536 4096 2147483647
135 FRDispatchStallsWhenLSFull 65536 4096 2147483647
136 FRDispatchStallsWhenWaitingForAllQuiet 65536 4096 2147483647
137 FRDispatchStallsWhenFarControlOrResyncBranchPending 65536 4096 2147483647
138 FRFPUExceptions 65536 4096 2147483647
139 FRFPUExceptionsx87ReclassMicroFaults 65536 4096 2147483647
140 FRFPUExceptionsSSERetypeMicroFaults 65536 4096 2147483647
141 FRFPUExceptionsSSEReclassMicroFaults 65536 4096 2147483647
142 FRFPUExceptionsSSEAndx87MicroTraps 65536 4096 2147483647
143 FRNumberOfBreakPointsForDR0 65536 4096 2147483647
144 FRNumberOfBreakPointsForDR1 65536 4096 2147483647
145 FRNumberOfBreakPointsForDR2 65536 4096 2147483647
146 FRNumberOfBreakPointsForDR3 65536 4096 2147483647
147 NBMemoryControllerPageAccessEvent 65536 4096 2147483647
148 NBMemoryControllerPageAccessEventPageHit 65536 4096 2147483647
149 NBMemoryControllerPageAccessEventPageMiss 65536 4096 2147483647
150 NBMemoryControllerPageAccessEventPageConflict 65536 4096 2147483647
151 NBMemoryControllerPageTableOverflow 65536 4096 2147483647
152 NBMemoryControllerDRAMCommandSlotsMissed 65536 4096 2147483647
153 NBMemoryControllerTurnAround 65536 4096 2147483647
154 NBMemoryControllerTurnAroundDIMM 65536 4096 2147483647
155 NBMemoryControllerTurnAroundReadToWrite 65536 4096 2147483647
156 NBMemoryControllerTurnAroundWriteToRead 65536 4096 2147483647
157 NBMemoryControllerBypassCounter 65536 4096 2147483647
158 NBMemoryControllerBypassCounterHighPriority 65536 4096 2147483647
159 NBMemoryControllerBypassCounterLowPriority 65536 4096 2147483647
160 NBMemoryControllerBypassCounterDRAMControllerInterface 65536 4096 2147483647
161 NBMemoryControllerBypassCounterDRAMControllerQueue 65536 4096 2147483647
162 NBSizedCommands 65536 4096 2147483647
163 NBSizedCommandsNonPostWrSzByte 65536 4096 2147483647
164 NBSizedCommandsNonPostWrSzDword 65536 4096 2147483647
165 NBSizedCommandsWrSzByte 65536 4096 2147483647
166 NBSizedCommandsWrSzDword 65536 4096 2147483647
167 NBSizedCommandsRdSzByte 65536 4096 2147483647
168 NBSizedCommandsRdSzDword 65536 4096 2147483647
169 NBSizedCommandsRdModWr 65536 4096 2147483647
170 NBProbeResult 65536 4096 2147483647
171 NBProbeResultMiss 65536 4096 2147483647
172 NBProbeResultHit 65536 4096 2147483647
173 NBProbeResultHitDirtyWithoutMemoryCancel 65536 4096 2147483647
174 NBProbeResultHitDirtyWithMemoryCancel 65536 4096 2147483647
175 NBHyperTransportBus0Bandwidth 65536 4096 2147483647
176 NBHyperTransportBus0BandwidthCommandSent 65536 4096 2147483647
177 NBHyperTransportBus0BandwidthDataSent 65536 4096 2147483647
178 NBHyperTransportBus0BandwidthBufferReleaseSent 65536 4096 2147483647
179 NBHyperTransportBus0BandwidthNopSent 65536 4096 2147483647
180 NBHyperTransportBus1Bandwidth 65536 4096 2147483647
181 NBHyperTransportBus1BandwidthCommandSent 65536 4096 2147483647
182 NBHyperTransportBus1BandwidthDataSent 65536 4096 2147483647
183 NBHyperTransportBus1BandwidthBufferReleaseSent 65536 4096 2147483647
184 NBHyperTransportBus1BandwidthNopSent 65536 4096 2147483647
185 NBHyperTransportBus2Bandwidth 65536 4096 2147483647
186 NBHyperTransportBus2BandwidthCommandSent 65536 4096 2147483647
187 NBHyperTransportBus2BandwidthDataSent 65536 4096 2147483647
188 NBHyperTransportBus2BandwidthBufferReleaseSent 65536 4096 2147483647
189 NBHyperTransportBus2BandwidthNopSent 65536 4096 2147483647
With a Ryzen 7 1800X, I can report the same issue. It doesn't recognize the CacheMisses perf counter because it's simply not listed as such; it's separated into DcacheMisses and IcacheMisses, and apparently that's exclusive to some AMD processors.
Having looked at the code itself, the error message does not account for the possibility that the perf counter is simply not supported by the CPU.
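To make the point above concrete, here is a minimal sketch that parses `tracelog -profilesources Help` output and reports which requested counter names are not listed. It assumes a requested name has to appear verbatim in the Name column, which is how the comment above reads the failure; the helper names (`parse_profile_sources`, `missing_counters`) are made up for illustration and are not part of BenchmarkDotNet or TraceEvent. The sample rows are copied from the output posted earlier in this thread.

```python
import re

# A few rows copied from the tracelog output posted earlier in this thread.
SAMPLE_OUTPUT = """\
Id Name Interval Min Max
--------------------------------------------------------------
0 Timer 10000 1221 1000000
2 TotalIssues 65536 4096 2147483647
6 BranchInstructions 65536 4096 2147483647
8 DcacheMisses 65536 4096 2147483647
9 IcacheMisses 65536 4096 2147483647
11 BranchMispredictions 65536 4096 2147483647
"""

# A data row is: numeric id, name without spaces, then three integers.
ROW = re.compile(r"^\s*(\d+)\s+(\S+)\s+(\d+)\s+(\d+)\s+(\d+)\s*$")


def parse_profile_sources(text):
    """Return {name: (id, interval, min, max)} for every data row."""
    sources = {}
    for line in text.splitlines():
        match = ROW.match(line)
        if match:  # the header and separator lines do not match
            id_, name, interval, lo, hi = match.groups()
            sources[name] = (int(id_), int(interval), int(lo), int(hi))
    return sources


def missing_counters(text, requested):
    """Return the requested names that are absent from the Name column."""
    available = parse_profile_sources(text)
    return [name for name in requested if name not in available]


if __name__ == "__main__":
    print(missing_counters(SAMPLE_OUTPUT, ["CacheMisses", "BranchMispredictions"]))
```

On the rows above, `CacheMisses` is reported as missing while `BranchMispredictions` is found, which matches what was observed on this machine.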
I've recently ordered a PC with an AMD CPU (ThreadRipper :D) and I am supposed to get it before the 31st of March. When I do, I am going to make sure that Hardware Counters work as expected on AMD.
A bit related to this PR/issue https://github.com/dotnet/BenchmarkDotNet/pull/1438#issuecomment-620573164
The problem seems to be in Windows: ETW is not correctly reporting events for AMD CPUs (reported here and to the Feedback Hub here).
After some experimenting with ETW directly, I can see that DCRefillFromL2 is generating some numbers, so it could be used for L1 cache misses, while DCRefillFromSystem might be for L2 cache misses. I'm not 100% sure of that; I'm still trying to figure out how to stabilize the numbers, which fluctuate quite a bit from run to run.
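One possible reason for the fluctuation is sampling granularity. The sketch below assumes that each ETW sample for a profile source stands for roughly `Interval` hardware events (65536 by default, per the Interval column in the tracelog output earlier in this thread), so a short run can only be resolved in steps of one interval. The sample counts are hypothetical and the function names are made up for illustration; taking the median over several runs is just one simple way to damp outliers, not something the thread confirms as a fix.

```python
from statistics import median

INTERVAL = 65536  # default Interval from the tracelog table in this thread


def estimated_events(sample_count, interval=INTERVAL):
    """Each sample is assumed to stand for roughly `interval` hardware events."""
    return sample_count * interval


def summarize(sample_counts, interval=INTERVAL):
    """Return (median, min, max) of the estimated event counts across runs."""
    estimates = [estimated_events(n, interval) for n in sample_counts]
    return median(estimates), min(estimates), max(estimates)


if __name__ == "__main__":
    # Hypothetical sample counts from five runs of the same workload.
    print(summarize([12, 15, 11, 30, 13]))
```

With so few samples per run, a difference of one or two samples already shifts the estimate by tens of thousands of events, so a longer workload or a smaller interval (down to the Min column value) may give steadier numbers.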
Has there been any progress on this issue? On my Ryzen 5 5600X machine I got the same error:
The counter CacheMisses is not available. Please make sure you are Windows 8+ without Hyper-V
The counter BranchMispredictions is not available. Please make sure you are Windows 8+ without Hyper-V
@alexcovington @Rekkonnect @xoofx @dn9090 do you still experience this problem with the latest version of BenchmarkDotNet?
I believe that this issue should be resolved since BenchmarkDotNet v0.13.2 thanks to #2030. I just checked AMD Ryzen 9 7950X, everything works fine: the hardware counters are properly reported.
@AndreyAkinshin Thanks for checking in.
I don't have access to the system I was using when I originally reported the issue, but I was able to successfully read CacheMisses and InstructionRetired on Zen 3 and Zen 4 systems using BDN v0.13.5.
This seems to be resolved now, I'll go ahead and close the issue. Thanks everyone for the fix :).
I haven't checked since then, but if it has been fixed in Windows, it should be fixed for BDN. Thanks!
I am having trouble getting hardware counters working fully on an AMD Zen 3 5950X on Windows 10 using the latest BDN, 0.13.12. SVM is disabled in the BIOS. Hyper-V is disabled. PerfView lists the counters, and the tracelog output is attached.
Only BranchMispredictions appears to be working. The others are reported as not available. Is that just how it is, or is there something I can do to get these?
// * The counter CacheMisses is not available. Please make sure you are Windows 8+ without Hyper-V
// * The counter TotalCycles is not available. Please make sure you are Windows 8+ without Hyper-V
// * The counter InstructionRetired is not available. Please make sure you are Windows 8+ without Hyper-V
// * The counter LLCMisses is not available. Please make sure you are Windows 8+ without Hyper-V
// * The counter BranchInstructionRetired is not available. Please make sure you are Windows 8+ without Hyper-V
// * The counter BranchMispredictsRetired is not available. Please make sure you are Windows 8+ without Hyper-V