BenchmarkDotNet icon indicating copy to clipboard operation
BenchmarkDotNet copied to clipboard

Add support for using non standard HW counter

Open xoofx opened this issue 4 years ago • 9 comments

Hello BenchmarkDotNet folks!

This PR is for discussion. I should have opened an issue maybe, but as I'm using this fix to actually see the counters in my project, better just open this PR directly, feel free to close it, kill it if you don't like it! 😅

So my issue is that perf counters are very frustrating to work with (nothing to blame for BenchmarkdDotNet, you are not responsible) because depending on your processors, you might not be able to have any relevant in BenchmarkDotNet because the counters exposed are limited and are not matching all the counters that can be exposed by Windows.

For example, on my machine (Windows 10) and CPU (AMD Ryzen 3900X), the HardwareCounter.CacheMisses is not available. I might be able to use instead DCL1DTLBMissL2DTLBHitand DCL1DTLBMissL2DTLBMiss (although I'm not entirely sure myself it is relatively equivalent, anyway) but the API HardwareCounter is based on an enum instead of allowing me to pass a string.

So this PR is changing slightly the API to allow to pass custom HW counter names like this:

    [HardwareCounters("TotalIssues/Insts", "DCL1DTLBMissL2DTLBHit/L1Miss-L2Hit", "DCL1DTLBMissL2DTLBMiss/L1Miss-L2Miss")]
   public void Managed() { 
      // ... code to benchmark
   }

The / is allowing to have a display name.

It will display the results like this for the config above:

|  Method |     Mean |     Error |    StdDev | Gen 0 | Gen 1 | Gen 2 | Allocated | L1Miss-L2Hit/Op |  Insts/Op | L1Miss-L2Miss/Op |
|-------- |---------:|----------:|----------:|------:|------:|------:|----------:|----------------:|----------:|-----------------:|
| Managed | 3.159 ms | 0.2923 ms | 0.0452 ms |     - |     - |     - |         - |           1,237 | 3,758,266 |            8,190 |

// * Hints *
Outliers
  BenchRunWorldProcessor.Managed: IterationCount=5, LaunchCount=1, WarmupCount=3 -> 1 outlier  was  removed (3.47 ms)

// * Legends *
  Mean             : Arithmetic mean of all measurements
  Error            : Half of 99.9% confidence interval
  StdDev           : Standard deviation of all measurements
  Gen 0            : GC Generation 0 collects per 1000 operations
  Gen 1            : GC Generation 1 collects per 1000 operations
  Gen 2            : GC Generation 2 collects per 1000 operations
  Allocated        : Allocated memory per single operation (managed only, inclusive, 1KB = 1024B)
  L1Miss-L2Hit/Op  : Hardware counter 'DCL1DTLBMissL2DTLBHit' per single operation
  Insts/Op         : Hardware counter 'TotalIssues' per single operation
  L1Miss-L2Miss/Op : Hardware counter 'DCL1DTLBMissL2DTLBMiss' per single operation
  1 ms             : 1 Millisecond (0.001 sec)

I tried to keep the API similar so you can still use HardwareCounter.TotalIssues and others, but as per attribute constraints, we have to use strings instead of structs (previously an enum)

Side note, I have been running my own small benchmark with HW counters and I'm seeing very different result in the counters printed by BenchmarkDotNet, and also from the one running with AMDuprof.

xoofx avatar Apr 28 '20 10:04 xoofx

Sorry, I don't know how but I killed your comment @filipnavara . so responding here:

Yeah, we could have a HardwareCounterInfo on the side (that would be the HardwareCounter of this PR) and keep the old HardwareCounter enum for the attributes.

xoofx avatar Apr 28 '20 11:04 xoofx

Sorry, I don't know how but I killed your comment @filipnavara . so responding here:

Yeah, we could have a HardwareCounterInfo on the side (that would be the HardwareCounter of this PR) and keep the old HardwareCounter enum for the attributes.

Sorry, I killed the comment myself. I meant to press "Start Review" instead of "Add Comment" and by the time I reached the changed definition of HardwareCounter I realized why you did it. It may still make sense to preserve backwards compatibility but I am not sure how much this attribute was used in practice.

filipnavara avatar Apr 28 '20 11:04 filipnavara

So I have slightly modified the PR to keep the previous enum HardwareCounter and to introduce HardwareCounterInfo. HardwareCounterInfo is used in all the places where HardwareCounter was previously used, except that you can still use the enum HardwareCounter directly with HardwareCounterAttribute

xoofx avatar Apr 28 '20 12:04 xoofx

But unfortunately, it seems that I can't get cache misses perf counters for AMD Ryzen with Windows perf counters... 😞 DcacheAccessesis working, but not DcacheMisses or DCMiss (they don't log anything on my machine) while I can see them with AMDuprof

xoofx avatar Apr 28 '20 12:04 xoofx

Side note, I have been running my own small benchmark with HW counters and I'm seeing very different result in the counters printed by BenchmarkDotNet, and also from the one running with AMDuprof.

Note that the counters in AMDuprof seem to be divided by a constant 4 compare to the one I'm getting with Windows perf counters. In BenchmarkDotNet, I haven't checked why they are off and by how much factor.

xoofx avatar Apr 28 '20 12:04 xoofx

Copying what I wrote on Twitter ...

I confirmed with the Windows team there is a bug here somewhere. They're asking if you can use "Feedback Hub" to file this bug. AMD support exists but some of the event selects are outdated or incorrect, and your report will help prioritize which ones to get to fixed first.

mjsabby avatar May 08 '20 11:05 mjsabby

Closing as the problem is in Windows (e.g for the cache misses)

xoofx avatar Apr 18 '21 19:04 xoofx

Reopening as we might still require to be able to collect specific counters. The predefined ones are not able to collect special counters that could be relevant for a specific CPU/machine.

xoofx avatar Apr 19 '21 10:04 xoofx

Looks mostly good to me. Just a couple comments, and needs merge conflicts resolved.

Oh, it's quite a old PR, will have to dig into it because I completely forgot the details. 😅 I'll try to have a look next week.

xoofx avatar Jul 21 '23 07:07 xoofx