
Bimodal benchmark results

Status: Open • epage opened this issue 7 months ago • 2 comments

I find that my benchmark results cluster around two distinct values, which makes before/after comparisons difficult. My suspicion is that this is due to P/E cores (performance and efficiency cores).

One idea would be to pin the benchmark to one type of core or to specific cores. I know of at least gdt-cpus for doing this.

epage, May 22 '25 14:05

Should we also consider setting up, or warning about, other aspects of the environment to get more consistent results? For example, see https://easyperf.net/blog/2019/08/02/Perf-measurement-environment-on-Linux#upd-09-aug-2019 which also points to https://github.com/parttimenerd/temci and https://github.com/travisdowns/uarch-bench/blob/master/uarch-bench.sh

epage, May 22 '25 14:05

Divan used to set CPU affinity on initial start, but that naïve approach ran into the issue of multi-threaded benchmarks not running on separate cores. That issue can probably be solved by pinning the thread to a CPU core before recording samples and unpinning it afterwards. However, this would still affect benchmarks that spawn their own threads, such as the Blake3Par example.
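
A minimal sketch of the pin-then-unpin idea, not Divan's actual implementation. It assumes 64-bit Linux with glibc or musl, declares `sched_getaffinity`/`sched_setaffinity` by hand to stay free of dependencies, and models `cpu_set_t` as a 1024-bit mask. The names `with_pinned_thread` and `CpuSet` are hypothetical. On other platforms the function just runs the closure unpinned.

```rust
use std::io;

/// Assumed layout of `cpu_set_t`: 1024 bits, CPU n is bit n % 64 of word n / 64.
type CpuSet = [u64; 16];

#[cfg(target_os = "linux")]
unsafe extern "C" {
    // A pid of 0 means "the calling thread" for both calls.
    fn sched_getaffinity(pid: i32, cpusetsize: usize, mask: *mut CpuSet) -> i32;
    fn sched_setaffinity(pid: i32, cpusetsize: usize, mask: *const CpuSet) -> i32;
}

/// Pins the calling thread to the first CPU it is allowed to run on,
/// runs `f`, then restores the original affinity mask.
/// Note: the mask is not restored if `f` panics.
#[cfg(target_os = "linux")]
fn with_pinned_thread<T>(f: impl FnOnce() -> T) -> io::Result<T> {
    let size = std::mem::size_of::<CpuSet>();

    // Remember the current mask so it can be restored afterwards.
    let mut original: CpuSet = [0; 16];
    if unsafe { sched_getaffinity(0, size, &mut original) } != 0 {
        return Err(io::Error::last_os_error());
    }

    // Build a mask containing only the lowest-numbered allowed CPU.
    let word = original
        .iter()
        .position(|w| *w != 0)
        .ok_or_else(|| io::Error::new(io::ErrorKind::Other, "empty affinity mask"))?;
    let mut pinned: CpuSet = [0; 16];
    pinned[word] = 1u64 << original[word].trailing_zeros();

    if unsafe { sched_setaffinity(0, size, &pinned) } != 0 {
        return Err(io::Error::last_os_error());
    }

    let out = f();

    // Unpin: put the original mask back.
    if unsafe { sched_setaffinity(0, size, &original) } != 0 {
        return Err(io::Error::last_os_error());
    }
    Ok(out)
}

/// Fallback for platforms this sketch does not cover: run unpinned.
#[cfg(not(target_os = "linux"))]
fn with_pinned_thread<T>(f: impl FnOnce() -> T) -> io::Result<T> {
    Ok(f())
}
```

On Linux, threads created inside `f` inherit the pinned mask, which is one way this approach could still disturb benchmarks that spawn their own threads. Picking the lowest-numbered allowed CPU is arbitrary here; it does not distinguish P cores from E cores.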

Another approach is to detect whether the thread has changed CPU cores during the benchmark, and then re-run if it did. The x86 rdtscp instruction provides a core ID that could probably be used for this. However, this doesn't address core changes between samples.
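
A rough sketch of the detect-and-re-run idea, under a few assumptions: it relies on the OS writing a per-CPU value into the `IA32_TSC_AUX` register that `rdtscp` returns (Linux does this), it treats any change in that value as a migration, and it cannot see a thread that migrates away and back within one sample. The names `current_core_tag` and `measure_on_one_core` are hypothetical and not part of Divan.

```rust
use std::time::{Duration, Instant};

/// Reads the auxiliary value returned by `rdtscp`, or `None` if the
/// instruction is unavailable.
#[cfg(target_arch = "x86_64")]
fn current_core_tag() -> Option<u32> {
    use std::arch::x86_64::{__cpuid, __rdtscp};

    // CPUID leaf 0x8000_0001, EDX bit 27 reports RDTSCP support.
    #[allow(unused_unsafe)]
    let ext = unsafe { __cpuid(0x8000_0001) };
    if (ext.edx & (1 << 27)) == 0 {
        return None;
    }

    let mut aux = 0u32;
    // The timestamp itself is ignored; only the aux value is needed here.
    let _tsc = unsafe { __rdtscp(&mut aux) };
    Some(aux)
}

/// Without `rdtscp`, migrations cannot be detected by this sketch.
#[cfg(not(target_arch = "x86_64"))]
fn current_core_tag() -> Option<u32> {
    None
}

/// Times `f`, discarding and repeating the sample when the core tag
/// differs before and after the call. Returns `None` if every attempt
/// (the first plus `max_retries` retries) saw a migration.
fn measure_on_one_core<T>(
    max_retries: usize,
    mut f: impl FnMut() -> T,
) -> Option<(T, Duration)> {
    for _ in 0..=max_retries {
        let before = current_core_tag();
        let start = Instant::now();
        let out = f();
        let elapsed = start.elapsed();
        let after = current_core_tag();

        if before == after {
            return Some((out, elapsed));
        }
    }
    None
}
```

For example, `measure_on_one_core(3, || work())` would return the first of up to four samples that started and ended with the same tag. Each sample is checked in isolation, so consecutive samples can still land on different cores.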

I'll gladly take suggestions since this is something I've wanted to solve since early on. I imagine this issue will become more prevalent as P/E core architectures become more popular. I also wonder whether the discrepancies between P/E cores extend to the stability of the CPU's timer precision.

nvzqz, May 22 '25 15:05