Branch Miss Analysis

Open Turnerj opened this issue 5 years ago • 2 comments

Similar to what damageboy does in their blog post on sorting, it would be useful to track branch misses and related counters with the same tool: perf, a profiler on Linux.

This is how damageboy called it:

$ perf record -F max -e instructions,branches,branch-misses \
    ./Example --type-list DoublePumpOverlined \
              --size-list 100000 --max-loops 1000 --no-check
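For a quick summary rather than a full profile, `perf stat` can report the same counters in CSV form (`-x,`), and the miss rate is then a simple division. The helper below is only a sketch: the function name `branch_miss_rate` is made up for illustration, and it assumes the CSV layout where the first field is the counter value and the third is the event name.

```shell
#!/bin/sh
# Hypothetical helper: reads `perf stat -x,` CSV lines on stdin and prints
# branch-misses as a percentage of branches.
branch_miss_rate() {
    awk -F, '
        # "+ 0" forces a numeric value, so "<not supported>" becomes 0
        $3 ~ /^branch-misses/ { misses   = $1 + 0 }
        $3 ~ /^branches/      { branches = $1 + 0 }
        END {
            if (branches > 0)
                printf "%.2f%%\n", 100 * misses / branches
            else
                print "no branch count found"
        }'
}

# Example use (perf stat writes its report to stderr, hence the redirect order):
#   perf stat -x, -e branches,branch-misses \
#       ./Example --type-list DoublePumpOverlined \
#                 --size-list 100000 --max-loops 1000 --no-check \
#       2>&1 >/dev/null | branch_miss_rate
```

If the environment does not expose the counters (which may be the case under some virtualization setups), perf reports them as not supported and the helper prints the fallback message instead of a percentage.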

Does perf work well in WSL? What exactly do I need the example program to do to be compatible with it?

This could be a useful way to check that our branching logic is well optimised and whether there are any gains to be had by optimising it further.

Jun 25 '20 08:06 Turnerj

While it may not be as accurate as perf on Linux, this is achievable on Windows using ETW and the hardware counter support in BenchmarkDotNet (BDN).

See: https://adamsitnik.com/Hardware-Counters-Diagnoser/

AddHardwareCounters(HardwareCounter.BranchMispredictions, HardwareCounter.CacheMisses);

Bartosz Adamczewski on Twitter gave some good advice for using it:

1. Works only on Windows.
2. You have to run as admin.
3. Accuracy is limited, so disregard close calls and focus on order-of-magnitude differences.
4. No Hyper-V virtualization support (I think this might already work).

Example of it working in the Basic Benchmark:

image

I may leave this open until I can work out an easier way of running the benchmark as an admin directly from VS (only when needed, rather than running as an admin all the time, in case that interferes with anything else).

Jul 11 '20 07:07 Turnerj

Perf counters and virtualization are a special pain in this world.

Between what your specific CPU supports under virtualization, what WSL2 (I assume v1 is out of scope) or other VMs support...

It's a bit of a mess, and I, for one, have mostly refrained from using it, simply dropping to Linux and using its perf tool with the proper environment vars.
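The comment does not name the environment variables. For a .NET program, one plausible reading is the runtime's perf map switch, which lets perf resolve JIT-compiled frames to method names. Treat the snippet below as an assumption rather than the setup actually used:

```shell
# Assumption: the variable meant here is the .NET runtime's perf map switch.
# When set, the runtime writes /tmp/perf-<pid>.map, which perf reads to
# symbolise JIT-compiled code.
export COMPlus_PerfMapEnabled=1

# Then run perf as in the original post, for example:
#   perf record -F max -e instructions,branches,branch-misses ./Example ...
```

Newer .NET versions also read the `DOTNET_`-prefixed form of this variable.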

What I mostly appreciate about perf is that it's very friendly for exploration, and there's a large body of knowledge about what every counter does, compared to the limited amount of information and coverage provided by ETW.

Specifically for the counter you mention, I don't think there is a problem with ETW.

Jul 11 '20 11:07 damageboy