yatp icon indicating copy to clipboard operation
yatp copied to clipboard

steal other workers only before going to sleep

Open sticnarf opened this issue 4 years ago • 8 comments

This PR is to mitigate the regression reported in https://github.com/tikv/tikv/issues/7122 by changing the strategy of pop.

It is found that CPU is mostly wasted in stealing other workers. The situation is especially terrible when there are lots of threads.

It makes little sense to steal other workers every time we call pop. We only need to make sure that the working threads never go to sleep while leaving other workers too busy. So we actually only need to steal other workers before going to sleep.

sticnarf avatar Mar 18 '20 10:03 sticnarf

How does it performs?

BusyJay avatar Mar 18 '20 10:03 BusyJay

How does it performs?

I don't have a machine for fully restoring the case in that issue. But the CPU usage is only about half of before (on a 20C40T machine with 32 threads in the pool)

image (ignore the middle one, that's unrelated)

sticnarf avatar Mar 18 '20 10:03 sticnarf

Cool, what does criterion say?

BusyJay avatar Mar 18 '20 10:03 BusyJay

Cool, what does criterion say?

Criterion shows 5% to 20% regression while CPU time is reduced more than 50%.

I have no idea what the regression comes from. I don't see imbalance among threads through perf...

sticnarf avatar Mar 18 '20 15:03 sticnarf

I find the criterion microbenchmark is strongly associated with sched_yield which is called in spinning. If we don't yield, we'll get dramatic (more than 50% in spawn_many) improvement...

Although that's not the real case, I doubt if we really should yield. It is noticed many time that sched_yield can lead to unexpected scheduling and can be harmful. (For example https://www.realworldtech.com/forum/?threadid=189711&curpostid=189752, the context is about spin locks though.)

sticnarf avatar Mar 19 '20 03:03 sticnarf

Not yield will lead to frequently sleep and awake. You should not just check spawn_many, but also other bench results.

BusyJay avatar Mar 19 '20 04:03 BusyJay

Not yield will lead to frequently sleep and awake. You should not just check spawn_many, but also other bench results.

All microbenchmarks show improvements except yield_many without yielding:

cargo bench -- 'yatp::future/1000$'
   Compiling yatp v0.0.1 (/home/yilin/Code/yatp)
    Finished bench [optimized] target(s) in 12.44s
     Running target/release/deps/yatp-166dd3367783e1e0

running 0 tests

test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured; 28 filtered out

     Running target/release/deps/chained_spawn-79665de118b0b186
Gnuplot not found, using plotters backend
chained_spawn/yatp::future/1000                                                                            
                        time:   [257.41 us 260.71 us 263.80 us]
                        change: [-43.176% -42.466% -41.781%] (p = 0.00 < 0.05)
                        Performance has improved.

     Running target/release/deps/ping_pong-20995795e71841e4
Gnuplot not found, using plotters backend
ping_pong/yatp::future/1000                                                                            
                        time:   [543.38 us 546.98 us 550.80 us]
                        change: [-9.8604% -8.8364% -7.3901%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 6 outliers among 100 measurements (6.00%)
  5 (5.00%) high mild
  1 (1.00%) high severe

     Running target/release/deps/spawn_many-e1d556ffceaa12ce
Gnuplot not found, using plotters backend
spawn_many/yatp::future/1000                                                                            
                        time:   [429.74 us 437.24 us 444.88 us]
                        change: [-76.937% -75.944% -74.811%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 2 outliers among 100 measurements (2.00%)
  1 (1.00%) high mild
  1 (1.00%) high severe

     Running target/release/deps/yield_many-05b96844c629a1a1
Gnuplot not found, using plotters backend
Benchmarking yield_many/yatp::future/1000: Warming up for 3.0000 s
Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 15.0s or reduce sample count to 40.
yield_many/yatp::future/1000                                                                             
                        time:   [2.9260 ms 2.9420 ms 2.9597 ms]
                        change: [-0.5663% +0.5283% +1.6560%] (p = 0.38 > 0.05)
                        No change in performance detected.
Found 8 outliers among 100 measurements (8.00%)
  1 (1.00%) low mild
  5 (5.00%) high mild
  2 (2.00%) high severe

Yes, not yielding can lead to more frequent sleep and wake. But it's not that clear to me how it compares to some random scheduling among the OS.

sticnarf avatar Mar 19 '20 05:03 sticnarf

Very strange that I run benchmark on your code only ping_pong instead of spawn_many shows a significant regression, about 300%.

BusyJay avatar May 12 '20 16:05 BusyJay