uarch-bench
A benchmark for low-level CPU micro-architectural features
Currently every file that defines benchmarks needs to declare a benchmark registration method, which is called explicitly in `make_benches` in `main.cpp`, which looks like: ``` template GroupList make_benches() { GroupList...
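To make the explicit-registration pattern described above concrete, here is a minimal sketch. The types and the `register_default` / `register_memory` functions are hypothetical stand-ins invented for illustration, not the project's actual definitions, and the template parameter on `make_benches` is omitted.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-ins: the project's real group type is richer than this.
struct BenchmarkGroup {
    std::string name;
};
using GroupList = std::vector<BenchmarkGroup>;

// Assumed shape: each benchmark file declares one registration function.
void register_default(GroupList& list) { list.push_back({"default"}); }
void register_memory(GroupList& list)  { list.push_back({"memory"}); }

// The central function has to call every registration function by name,
// so adding a benchmark file means adding another line here.
GroupList make_benches() {
    GroupList groups;
    register_default(groups);
    register_memory(groups);
    assert(groups.size() == 2);  // one group per registration call above
    return groups;
}
```

The coupling is visible in `make_benches`: it must know about every benchmark file at compile time.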
Rather than compiling in benchmarks, it would be cool to allow benchmarks to be dynamically loaded from a shared object, allowing decoupling of the benchmark application and default benchmarks from...
If you include on the command line `--timer=libpfc --extra-events=FRONTEND_RETIRED.L1I_MISS`, you'll see this output: ``` Event 'FRONTEND_RETIRED.L1I_MISS' resolved to 'skl::FRONTEND_RETIRED:L1I_MISS:k=1:u=1:e=0:i=0:c=0:t=0:intx=0:intxcp=0:fe_thres=0, short name: 'FRONTE' with code 0x5301c6 Event 'FRONTEND_RETIRED.L1I_MISS' resolved to 'skl::FRONTEND_RETIRED:L1I_MISS:k=1:u=1:e=0:i=0:c=0:t=0:intx=0:intxcp=0:fe_thres=0,...
If you take a look at the core region of the innermost method in a benchmark in the libpfc case, you find a `rep stos` instruction inside the timed region...
Seems like this would be pretty difficult, but I'd love to have something like this working on other architectures, especially ARM.
You can currently run `uarch-bench` as non-root, but the experience isn't great (for example, you have to run the `uarch-bench` binary directly rather than use the wrapper script). We should...
Create benchmarks that investigate various aspects of store-forwarding behavior, such as:
- store->load forwarding latency
- misaligned load scenarios
- store->load throughput
- determine how far a load has to...
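As a rough illustration of the first item (store->load forwarding latency), the sketch below builds a dependent chain in which each load reads the value just stored, and each store depends on the previous load. It is an assumption-laden approximation, not the project's method: `store_forward_chain` and `ForwardResult` are hypothetical names, the timing is wall-clock rather than cycle-accurate, and the measured time also includes one add and the loop overhead per iteration.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>

struct ForwardResult {
    std::uint64_t final_value;  // equals iters if every store/load pair ran
    double ns_per_iter;         // wall-clock nanoseconds per store+load round trip
};

// Hypothetical helper: times a chain of dependent store->load pairs.
ForwardResult store_forward_chain(std::uint64_t iters) {
    assert(iters > 0);
    volatile std::uint64_t slot = 0;  // volatile: the compiler must emit each access
    std::uint64_t v = 0;
    const auto start = std::chrono::steady_clock::now();
    for (std::uint64_t i = 0; i < iters; ++i) {
        slot = v;      // store
        v = slot + 1;  // load of the just-stored value feeds the next store
    }
    const auto stop = std::chrono::steady_clock::now();
    const double ns =
        std::chrono::duration<double, std::nano>(stop - start).count();
    return {v, ns / static_cast<double>(iters)};
}
```

Calling `store_forward_chain(100000000)` and multiplying `ns_per_iter` by the core frequency gives a rough cycles-per-round-trip figure. A precise benchmark would more likely control the exact instruction sequence and use a cycle counter.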
Add a benchmark to test for 4K aliasing, with a load reading from a distinct earlier store location which is a multiple of 4K away. See [here for example](https://software.intel.com/en-us/forums/intel-vtune-amplifier-xe/topic/606846).
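A minimal sketch of what such a test could look like, under stated assumptions: `store_then_load` and `AliasResult` are hypothetical names, timing is wall-clock, and whether any slowdown is visible depends on the microarchitecture and on the code the compiler generates. Each iteration stores to the start of a buffer and then loads from a different address `offset_bytes` further on.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <vector>

struct AliasResult {
    std::uint64_t sum;   // equals iters, since the loaded byte is always 1
    double ns_per_iter;  // wall-clock nanoseconds per store+load pair
};

// Hypothetical helper: store to buf[0], then load from buf[offset_bytes].
AliasResult store_then_load(std::size_t offset_bytes, std::uint64_t iters) {
    assert(offset_bytes > 0 && iters > 0);
    std::vector<unsigned char> buf(offset_bytes + 1, 1);
    volatile unsigned char* store_addr = buf.data();
    volatile unsigned char* load_addr = buf.data() + offset_bytes;
    std::uint64_t sum = 0;
    const auto start = std::chrono::steady_clock::now();
    for (std::uint64_t i = 0; i < iters; ++i) {
        *store_addr = static_cast<unsigned char>(i);  // earlier store
        sum += *load_addr;  // later load from a distinct location
    }
    const auto stop = std::chrono::steady_clock::now();
    const double ns =
        std::chrono::duration<double, std::nano>(stop - start).count();
    return {sum, ns / static_cast<double>(iters)};
}
```

Comparing `store_then_load(4096, n)` against a control such as `store_then_load(4096 + 64, n)` would show whether the 4K-apart case is slower on a given machine.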
In the flavor of: http://blog.stuffedcow.net/2014/01/x86-memory-disambiguation/
Ideally the user runs uarch-bench with all frequency scaling behaviors disabled, but this is not always possible. In the case that scaling is occurring, we still want to provide a...