Investigate use of `std::atomic_flag` instead of `std::binary_semaphore`

Open DeveloperPaul123 opened this issue 2 years ago • 2 comments

`std::atomic_flag` is the only atomic type the standard guarantees to be lock-free. It would be interesting to see whether using it has any positive impact on performance over `std::binary_semaphore`.

DeveloperPaul123 avatar Jul 03 '23 16:07 DeveloperPaul123

I did a quick benchmark on quick-bench, and `std::binary_semaphore` seems to perform best for ping/pong signaling, which I think matches how we use it in the thread pool (as a signaling mechanism).

https://quick-bench.com/q/JkZjpTgsjQkSiyI20IRcZtEFNso

DeveloperPaul123 avatar Jul 03 '23 17:07 DeveloperPaul123

I wrote a quick benchmark and ran it locally on Windows, but I am getting inconsistent results. I think I will need to run a more rigorous test with pyperf on a Linux machine to get better numbers.

| Run | Thread signaling | relative | ms/op | op/s | err% | total |
|---|---|---|---|---|---|---|
| 1 | std::atomic_flag | 100.0% | 451.75 | 2.21 | 1.8% | 53.95 |
| 1 | std::binary_semaphore | 99.2% | 455.40 | 2.20 | 2.0% | 54.42 |
| 2 | std::atomic_flag | 100.0% | 444.76 | 2.25 | 2.1% | 53.20 |
| 2 | std::binary_semaphore | 98.4% | 452.16 | 2.21 | 3.0% | 53.78 |
| 3 | std::atomic_flag | 100.0% | 485.05 | 2.06 | 0.3% | 57.51 |
| 3 | std::binary_semaphore | 103.0% | 470.99 | 2.12 | 0.8% | 57.44 |
| 4 | std::atomic_flag | 100.0% | 457.13 | 2.19 | 1.9% | 55.43 |
| 4 | std::binary_semaphore | 96.9% | 471.90 | 2.12 | 3.8% | 56.81 |
| 5 | std::atomic_flag | 100.0% | 481.77 | 2.08 | 2.8% | 56.86 |
| 5 | std::binary_semaphore | 105.2% | 457.91 | 2.18 | 4.1% | 54.98 |
| 6 | std::atomic_flag | 100.0% | 465.74 | 2.15 | 0.4% | 55.85 |
| 6 | std::binary_semaphore | 101.3% | 459.84 | 2.17 | 0.6% | 55.00 |
| 7 | std::atomic_flag | 100.0% | 453.31 | 2.21 | 0.9% | 54.72 |
| 7 | std::binary_semaphore | 95.0% | 477.07 | 2.10 | 2.4% | 56.92 |
| 8 | std::atomic_flag | 100.0% | 477.71 | 2.09 | 0.8% | 57.59 |
| 8 | std::binary_semaphore | 101.6% | 470.13 | 2.13 | 0.6% | 56.57 |
| 9 | std::atomic_flag | 100.0% | 477.80 | 2.09 | 1.1% | 57.12 |
| 9 | std::binary_semaphore | 100.4% | 475.79 | 2.10 | 1.8% | 56.92 |
| 10 | std::atomic_flag | 100.0% | 479.51 | 2.09 | 1.2% | 58.10 |
| 10 | std::binary_semaphore | 102.9% | 465.78 | 2.15 | 4.3% | 55.52 |

DeveloperPaul123 avatar Jul 06 '23 13:07 DeveloperPaul123

Here are 10 runs on a Linux system tuned with `pyperf system tune`:

| Run | Thread signaling | relative | ms/op | op/s | err% | total |
|---|---|---|---|---|---|---|
| 1 | std::atomic_flag | 100.0% | 406.13 | 2.46 | 1.1% | 48.78 |
| 1 | std::binary_semaphore | 93.8% | 432.78 | 2.31 | 1.9% | 52.17 |
| 2 | std::atomic_flag | 100.0% | 409.78 | 2.44 | 0.7% | 49.23 |
| 2 | std::binary_semaphore | 106.2% | 385.80 | 2.59 | 0.7% | 46.45 |
| 3 | std::atomic_flag | 100.0% | 402.90 | 2.48 | 0.7% | 48.30 |
| 3 | std::binary_semaphore | 101.1% | 398.52 | 2.51 | 0.8% | 47.77 |
| 4 | std::atomic_flag | 100.0% | 397.33 | 2.52 | 0.4% | 47.64 |
| 4 | std::binary_semaphore | 104.8% | 379.24 | 2.64 | 0.7% | 45.69 |
| 5 | std::atomic_flag | 100.0% | 372.06 | 2.69 | 0.9% | 44.75 |
| 5 | std::binary_semaphore | 88.7% | 419.39 | 2.38 | 2.2% | 50.05 |
| 6 | std::atomic_flag | 100.0% | 420.61 | 2.38 | 0.9% | 50.24 |
| 6 | std::binary_semaphore | 112.4% | 374.31 | 2.67 | 0.8% | 45.01 |
| 7 | std::atomic_flag | 100.0% | 394.11 | 2.54 | 0.9% | 47.54 |
| 7 | std::binary_semaphore | 97.8% | 403.07 | 2.48 | 0.6% | 48.64 |
| 8 | std::atomic_flag | 100.0% | 406.72 | 2.46 | 0.7% | 48.58 |
| 8 | std::binary_semaphore | 105.2% | 386.67 | 2.59 | 1.1% | 46.27 |
| 9 | std::atomic_flag | 100.0% | 409.09 | 2.44 | 0.3% | 49.27 |
| 9 | std::binary_semaphore | 107.2% | 381.71 | 2.62 | 1.0% | 45.74 |
| 10 | std::atomic_flag | 100.0% | 394.07 | 2.54 | 0.7% | 47.26 |
| 10 | std::binary_semaphore | 90.8% | 434.10 | 2.30 | 0.6% | 52.22 |

jtd-formlabs avatar Jul 05 '24 17:07 jtd-formlabs

@jtd-formlabs Thanks for doing that. It seems like they're essentially the same. At most we'd be saving tens of milliseconds, so it seems like it's not worth it to me.

DeveloperPaul123 avatar Jul 05 '24 17:07 DeveloperPaul123