async-executor

Added local queue scheduling and "next_task" optimization

Open nullchinchilla opened this issue 4 years ago • 0 comments

Two major changes significantly improve performance:

  • When Executor::run() is called, handles to the local queue and the ticker are cached in TLS. This lets tasks schedule onto a thread-local queue rather than always onto the global queue.
  • Within the local queue, we implement a next_task optimization (see https://tokio.rs/blog/2019-10-scheduler) to greatly reduce context-switch costs in message-passing patterns. To prevent starvation, the same task is never put into next_task twice.
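For readers who want the shape of the two changes without opening the diff, here is a minimal, std-only sketch. It is not the code in this PR: `TaskId`, `LocalQueue`, `LOCAL`, the executor `id` field, and the `enter`/`schedule`/`next` methods are hypothetical names chosen for illustration, tasks are reduced to plain integers, and the starvation rule is modeled as "a task that was the last occupant of the slot goes to the back of the queue", which is only one possible reading of the rule above.

```rust
use std::cell::RefCell;
use std::collections::VecDeque;
use std::sync::Mutex;

// Hypothetical stand-in for a runnable task.
type TaskId = u32;

// Per-thread queue with a single-entry `next_task` slot.
#[derive(Default)]
struct LocalQueue {
    next_task: Option<TaskId>,
    last_in_slot: Option<TaskId>, // assumption: used to detect repeat occupants
    queue: VecDeque<TaskId>,
}

impl LocalQueue {
    fn push(&mut self, task: TaskId) {
        // A task that just used the slot goes to the back, so it cannot starve others.
        if self.last_in_slot == Some(task) {
            self.queue.push_back(task);
            return;
        }
        // Otherwise the task takes the slot; any displaced task joins the queue.
        if let Some(displaced) = self.next_task.replace(task) {
            self.queue.push_back(displaced);
        }
        self.last_in_slot = Some(task);
    }

    fn pop(&mut self) -> Option<TaskId> {
        // The slot is always checked before the regular queue.
        let slot = self.next_task.take();
        slot.or_else(|| self.queue.pop_front())
    }
}

thread_local! {
    // (executor id, local queue) cached while that executor runs on this thread.
    static LOCAL: RefCell<Option<(usize, LocalQueue)>> = RefCell::new(None);
}

struct Executor {
    id: usize,
    global: Mutex<VecDeque<TaskId>>,
}

impl Executor {
    fn new(id: usize) -> Self {
        Executor { id, global: Mutex::new(VecDeque::new()) }
    }

    // Stand-in for what happens on entry to `run()`: install the TLS cache.
    fn enter(&self) {
        LOCAL.with(|l| *l.borrow_mut() = Some((self.id, LocalQueue::default())));
    }

    fn schedule(&self, task: TaskId) {
        let pushed_locally = LOCAL.with(|l| {
            let mut cache = l.borrow_mut();
            if let Some((id, q)) = cache.as_mut() {
                if *id == self.id {
                    q.push(task);
                    return true;
                }
            }
            false
        });
        // No cache, or a cache owned by another executor: fall back to the global queue.
        if !pushed_locally {
            self.global.lock().unwrap().push_back(task);
        }
    }

    fn next(&self) -> Option<TaskId> {
        let local = LOCAL.with(|l| {
            let mut cache = l.borrow_mut();
            match cache.as_mut() {
                Some((id, q)) if *id == self.id => q.pop(),
                _ => None,
            }
        });
        local.or_else(|| self.global.lock().unwrap().pop_front())
    }
}
```

In this sketch, a task woken by the currently running task lands in the slot and runs next, which is what makes ping-pong message passing cheap. A task that keeps rescheduling itself is pushed to the back of the local queue instead of re-entering the slot.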

I tested this through unit tests and a production deployment in https://github.com/geph-official/geph4, whose QUIC-like sosistab protocol is structured in an actor-like fashion that heavily stresses the scheduler. I see significant improvements in real-world throughput (up to 30%, on a server whose CPU usage is dominated by cryptography) and massive improvements in microbenchmarks (up to 10x faster in the yield_now benchmark and similar context-switch benchmarks). I see no downsides: the code should gracefully fall back to pushing to the global queue if, for example, nesting Executors invalidates the TLS cache.

I also added criterion benchmarks.

nullchinchilla • Mar 20 '21 01:03