Philosophers: Inverse scalability "problem"

Open shipilev opened this issue 2 months ago • 1 comments

We have been studying the performance of philosophers on large machines, and realized that the number of CPUs on the machine determines the number of philosophers in the benchmark.

This means that machines with different numbers of CPUs run different workloads, which makes cross-hardware comparisons misleading. AFAICS, this is not what benchmarks usually do: in most benchmarks, the global amount of work stays the same as the available hardware parallelism grows, so the result shows either an improvement due to parallelism or a degradation due to contention. In philosophers, adding hardware parallelism just makes the benchmark slower, because the global amount of work is larger, on top of the usual contention effects.
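To make the coupling concrete, here is a minimal sketch of how a CPU-derived philosopher count inflates the total work. It is not the benchmark's actual code: the class name, the one-philosopher-per-CPU rule, and `MEALS_PER_PHILOSOPHER` are assumptions for illustration only.

```java
public class CpuCoupledWorkload {
    // Hypothetical per-philosopher meal count; not taken from the benchmark.
    static final int MEALS_PER_PHILOSOPHER = 1_000;

    // ASSUMPTION: one philosopher per available CPU.
    static int philosophersFor(int availableCpus) {
        return availableCpus;
    }

    // Total work grows linearly with the CPU count under that assumption.
    static long totalMeals(int availableCpus) {
        return (long) philosophersFor(availableCpus) * MEALS_PER_PHILOSOPHER;
    }

    public static void main(String[] args) {
        // availableProcessors() reflects -XX:ActiveProcessorCount when it is set.
        int cpus = Runtime.getRuntime().availableProcessors();
        System.out.printf("cpus=%d philosophers=%d totalMeals=%d%n",
                cpus, philosophersFor(cpus), totalMeals(cpus));
    }
}
```

Under this model, a 64-CPU machine is asked to do 64 times the work of a 1-CPU machine before any contention is taken into account.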

An easy way to demonstrate this is to override the CPU count with -XX:ActiveProcessorCount=# on a large 64-core machine:

$ shipilev-jdk21u-dev/build/linux-aarch64-server-release/images/jdk/bin/java -Xmx4g -Xms4g -XX:+AlwaysPreTouch -XX:+UnlockDiagnosticVMOptions -XX:ActiveProcessorCount=... -jar renaissance-jmh-0.15.0.jar Philosophers -f 5 -wi 5 -i 5

ActiveProcessorCount=1:    230.081 ± 12.516 ms/op
ActiveProcessorCount=2:   1570.336 ± 75.888 ms/op
ActiveProcessorCount=4:   1893.643 ± 85.768 ms/op
ActiveProcessorCount=8:   2466.867 ± 114.564 ms/op
ActiveProcessorCount=16:  3374.587 ± 182.243 ms/op
ActiveProcessorCount=32:  5097.616 ± 330.096 ms/op
ActiveProcessorCount=64: 10788.201 ± 1470.015 ms/op
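For reading these numbers, a small sketch that normalizes them may help. It computes the slowdown relative to the 1-CPU run and, as a rough guide only, the time per philosopher. The per-philosopher figure assumes that the philosopher count equals ActiveProcessorCount, which is my reading of the behavior and not something the numbers themselves prove. The class and method names are made up for this example.

```java
public class NormalizedResults {
    // Measurements copied from the runs above (ms/op, error bars omitted).
    static final int[] CPUS = {1, 2, 4, 8, 16, 32, 64};
    static final double[] MS_PER_OP = {
        230.081, 1570.336, 1893.643, 2466.867, 3374.587, 5097.616, 10788.201
    };

    // How many times slower run i is than the ActiveProcessorCount=1 run.
    static double slowdownVsOneCpu(int i) {
        return MS_PER_OP[i] / MS_PER_OP[0];
    }

    // ASSUMPTION: number of philosophers == ActiveProcessorCount.
    static double msPerPhilosopher(int i) {
        return MS_PER_OP[i] / CPUS[i];
    }

    public static void main(String[] args) {
        for (int i = 0; i < CPUS.length; i++) {
            System.out.printf("cpus=%d slowdown=%.2fx ms/philosopher=%.1f%n",
                    CPUS[i], slowdownVsOneCpu(i), msPerPhilosopher(i));
        }
    }
}
```

The 64-CPU run comes out roughly 47 times slower than the 1-CPU run, even though the operation is nominally the same.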

(The benchmark also thrashes heavily when all CPUs are busy, but I think that is just the way it works.)

I don't have a good solution for this, except maybe setting the number of philosophers at some fixed value.
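As a rough illustration of what a fixed-size variant could look like, here is a self-contained sketch in which the philosopher count is a constant and never consults the CPU count. It is not the benchmark's implementation: it uses plain threads with `ReentrantLock` forks and ordered lock acquisition, and the constants `PHILOSOPHERS` and `MEALS_PER_PHILOSOPHER` are hypothetical values picked for the example.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.locks.ReentrantLock;

public class FixedPhilosophers {
    // Hypothetical fixed sizes; deliberately not derived from availableProcessors().
    static final int PHILOSOPHERS = 16;
    static final int MEALS_PER_PHILOSOPHER = 1_000;

    // Runs the table to completion and returns the total number of meals eaten.
    static long run(int philosophers, int mealsEach) {
        ReentrantLock[] forks = new ReentrantLock[philosophers];
        for (int i = 0; i < philosophers; i++) {
            forks[i] = new ReentrantLock();
        }
        AtomicLong mealsEaten = new AtomicLong();
        Thread[] threads = new Thread[philosophers];
        for (int i = 0; i < philosophers; i++) {
            int left = i;
            int right = (i + 1) % philosophers;
            // Always take the lower-numbered fork first so no cycle of waiters can form.
            ReentrantLock first = forks[Math.min(left, right)];
            ReentrantLock second = forks[Math.max(left, right)];
            threads[i] = new Thread(() -> {
                for (int m = 0; m < mealsEach; m++) {
                    first.lock();
                    try {
                        second.lock();
                        try {
                            mealsEaten.incrementAndGet(); // "eat"
                        } finally {
                            second.unlock();
                        }
                    } finally {
                        first.unlock();
                    }
                }
            });
            threads[i].start();
        }
        try {
            for (Thread t : threads) {
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for philosophers", e);
        }
        return mealsEaten.get();
    }

    public static void main(String[] args) {
        System.out.println("meals eaten: " + run(PHILOSOPHERS, MEALS_PER_PHILOSOPHER));
    }
}
```

With the count fixed, the total work is the same on every machine, so differences between machines would come from parallelism and contention only.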

shipilev · Apr 17 '24 19:04