Investigate `small_vector` performance issue
`pika::detail::small_vector` seems to be significantly slower than `boost::container::small_vector`. It's unclear whether this is "just" a bug in the implementation or something more inherent in the implementation's use of standard library features.
We should:
- [ ] find a performance test that reproduces the regression (most likely something involving `future::then`, since `small_vector` is used for storing continuations)
- [ ] profile/debug/whatever to find out if `pika::detail::small_vector` is fixable
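
As a possible starting point for the first item, here is a minimal sketch of a container-level microbenchmark. It is not an existing pika test. `bench_continuations`, `bench_result`, and `continuation` are hypothetical names, and `std::function` is only a stand-in for whatever callable type pika actually stores. The pattern it assumes (construct a container, push a handful of continuations, invoke them, destroy the container) is a guess at what the `future::then` path does per shared state.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <utility>
#include <vector>

// Stand-in for a stored continuation (assumption: pika uses its own callable type).
using continuation = std::function<void(std::uint64_t&)>;

struct bench_result
{
    std::chrono::nanoseconds elapsed;
    std::uint64_t checksum;    // number of continuations actually invoked
};

// Container is any vector-like type holding `continuation` that supports
// push_back and range-based for.
template <typename Container>
bench_result bench_continuations(std::size_t iterations, std::size_t per_future)
{
    std::uint64_t checksum = 0;
    auto const start = std::chrono::steady_clock::now();
    for (std::size_t i = 0; i < iterations; ++i)
    {
        Container conts;    // constructed fresh each iteration
        for (std::size_t j = 0; j < per_future; ++j)
        {
            conts.push_back([](std::uint64_t& c) { ++c; });
        }
        for (auto& f : conts)
        {
            f(checksum);
        }
    }
    auto const stop = std::chrono::steady_clock::now();

    // Every pushed continuation must have run exactly once.
    assert(checksum == iterations * per_future);
    return {std::chrono::duration_cast<std::chrono::nanoseconds>(stop - start), checksum};
}
```

Calling `bench_continuations<std::vector<continuation>>(1000000, 3)` gives a baseline. The comparison of interest would instantiate it with `boost::container::small_vector<continuation, N>` and with `pika::detail::small_vector` (assuming it takes similar template parameters), with `per_future` both below and above the inline capacity `N`.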
If we can't find a suitable regression test within pika, the following DLA-Future test shows a clear performance drop (on the Piz Daint mc partition): `srun -n4 -c36 miniapp/miniapp_triangular_solver --m 20480 --n 20480 --mb 128 --nb 128 --grid-rows 2 --grid-cols 2 --nruns 5 --pika:use-process-mask`. The performance is about 1150 GFlop/s with Boost's `small_vector` and about 800 GFlop/s with pika's.