
Optimize the scheduler

Open · Anton3 opened this issue 1 year ago • 1 comment

When an engine::impl::TaskContext (roughly a stackful coroutine) is ready to start or continue running, it is pushed (scheduled) by engine::TaskProcessor into engine::TaskQueue, from which worker threads will then take it.
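For orientation, here is a toy model of that flow. This is not userver's actual code: the names (ToyTaskContext, ToyTaskProcessor, Schedule, WorkerLoop) are made up, and a mutex-plus-condition-variable queue stands in for the real lock-free machinery.

```cpp
#include <condition_variable>
#include <deque>
#include <functional>
#include <mutex>

// Stand-in for engine::impl::TaskContext: "resuming" the coroutine is
// modeled as calling a plain function.
struct ToyTaskContext {
  std::function<void()> resume;
};

class ToyTaskProcessor {
 public:
  // Called when a task becomes runnable (the push/schedule side).
  void Schedule(ToyTaskContext* ctx) {
    {
      std::lock_guard lock(mutex_);
      queue_.push_back(ctx);
    }
    cv_.notify_one();  // wake one worker thread
  }

  // Each worker thread runs this loop (the pop/run side).
  void WorkerLoop() {
    for (;;) {
      std::unique_lock lock(mutex_);
      cv_.wait(lock, [this] { return !queue_.empty(); });
      ToyTaskContext* ctx = queue_.front();
      queue_.pop_front();
      lock.unlock();
      ctx->resume();  // run the coroutine until it yields or finishes
    }
  }

 private:
  std::mutex mutex_;
  std::condition_variable cv_;
  std::deque<ToyTaskContext*> queue_;
};
```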

We currently use moodycamel::ConcurrentQueue together with moodycamel::LightweightSemaphore for the task queue. This setup is far from optimal, and we want to swap it for a more performant solution.

moodycamel::ConcurrentQueue is not an issue. It scales well and is not a bottleneck.

moodycamel::LightweightSemaphore IS an issue:

  1. It uses a single atomic for all threads, which becomes a bottleneck
  2. It wakes up a thread for each arriving task
  3. It uses spinning in each thread (when the queue is empty, the thread spins for 10000 iterations before going to sleep on an OS semaphore), which is not a sin in itself, but coupled with (2) it causes problems as discussed below; a simplified sketch of this spin-then-sleep pattern follows the list
  4. We typically want to keep CPU usage at ~50% to have headroom for load spikes. However, there are always as many main-task-processor threads as there are CPU threads, so up to ~50% of the CPU can be consumed solely by spinning: threads keep being woken up and spin again and again, fighting for resources with non-main-task-processor threads that need to do actual work
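To make issues (1)-(3) concrete, below is a minimal sketch of the spin-then-sleep pattern that a LightweightSemaphore-style wait boils down to. This is a simplification, not moodycamel's actual code; the class name, constants, and memory orders are illustrative only.

```cpp
#include <atomic>
#include <semaphore>  // C++20; stands in for the OS semaphore

class SketchSemaphore {
 public:
  // Called once per pushed task.
  void Signal() {
    if (count_.fetch_add(1, std::memory_order_release) < 0) {
      os_sem_.release();  // someone is sleeping: wake a thread for this one task
    }
  }

  // Called by a worker thread that has nothing to do.
  void Wait() {
    // Spin first, hoping a task arrives soon (moodycamel spins ~10000 times).
    for (int i = 0; i < 10000; ++i) {
      if (TryDecrement()) return;
    }
    // Give up and sleep on the OS semaphore.
    if (count_.fetch_sub(1, std::memory_order_acquire) <= 0) {
      os_sem_.acquire();
    }
  }

 private:
  bool TryDecrement() {
    auto old = count_.load(std::memory_order_relaxed);
    return old > 0 && count_.compare_exchange_weak(old, old - 1,
                                                   std::memory_order_acquire);
  }

  std::atomic<int> count_{0};            // the single shared atomic (issue 1)
  std::counting_semaphore<> os_sem_{0};  // OS-level sleep/wake (issues 2 and 3)
};
```

Every producer and every consumer hits the same count_, and each Signal() may wake a sleeping thread for a single task, which is exactly the combination of issues (1), (2), and (3) above.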

We need to replace the current engine::TaskQueue with a work-stealing queue that meets the following requirements:

  1. It should handle blocking while waiting for a task without resorting to a single synchronization point (like moodycamel::LightweightSemaphore does)
  2. Optional spinning should be kept; done right, it can reduce thread context switches at the cost of some extra CPU work
  3. New tasks should first be offered to spinning threads, waking up sleeping threads only as a last resort. For example, we could walk the array of threads and first check whether the task can be scheduled onto threads with lower indices (a sketch of this wake-up policy follows the list)
  4. The current engine::TaskQueue should be kept for now. The user will specify the scheduler per TaskProcessor in the static config. The code could use e.g. a std::variant
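A minimal sketch of the wake-up policy from point 3, assuming one slot per worker thread. All names here (WorkerSlot, OnNewTask, TryHandOff, and so on) are hypothetical and only illustrate the idea of preferring spinning threads, in index order, over sleeping ones.

```cpp
#include <atomic>
#include <cstddef>
#include <vector>

// Hypothetical per-worker state; a real implementation would also hold the
// worker's local (work-stealing) queue.
struct WorkerSlot {
  enum class State { kRunning, kSpinning, kSleeping };
  std::atomic<State> state{State::kRunning};
  std::atomic<void*> handoff{nullptr};  // a task handed directly to this worker
};

class WakePolicySketch {
 public:
  explicit WakePolicySketch(std::size_t workers) : slots_(workers) {}

  // Point 3: walk the workers in index order, prefer a spinning thread,
  // and wake a sleeping thread only if no one is spinning.
  void OnNewTask(void* task) {
    for (auto& slot : slots_) {
      if (TryHandOff(slot, WorkerSlot::State::kSpinning, task)) return;
    }
    for (auto& slot : slots_) {
      if (TryHandOff(slot, WorkerSlot::State::kSleeping, task)) {
        WakeUp(slot);  // OS-level wakeup only as a last resort
        return;
      }
    }
    PushToSharedQueue(task);  // everyone is busy: leave it to be stolen later
  }

 private:
  bool TryHandOff(WorkerSlot& slot, WorkerSlot::State expected, void* task) {
    auto state = expected;
    if (!slot.state.compare_exchange_strong(state, WorkerSlot::State::kRunning)) {
      return false;  // the slot was not in the state we hoped for
    }
    slot.handoff.store(task, std::memory_order_release);
    return true;
  }

  void WakeUp(WorkerSlot& /*slot*/) { /* futex/semaphore post, omitted */ }
  void PushToSharedQueue(void* /*task*/) { /* global overflow queue, omitted */ }

  std::vector<WorkerSlot> slots_;
};
```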

We expect 2 kinds of easily benchmarkable benefits from the new scheduler:

  1. Scheduling in general should be optimized, see the engine_task_yield_multiple_threads benchmark. Currently, on my PC, it can take around ~1200 ns per engine::Yield. The new scheduler is expected to show at most ~400 ns, independent of the number of TaskProcessor threads, up to the number of CPU threads/cores (a rough benchmark sketch follows this list)
  
  2. Spinning will not get in the way of other workloads. This could be checked by running N_PROC / 2 tasks, each calling engine::Yield in a loop, on each of 2 TaskProcessors, each of which has N_PROC threads
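Below is a rough sketch in the spirit of the first benchmark, assuming the usual userver benchmark setup (google-benchmark plus engine::RunStandalone). The header paths and APIs are written from memory and may need adjusting; the second check would additionally require a second TaskProcessor, which needs a custom task-processor pools configuration and is omitted here.

```cpp
#include <atomic>
#include <cstddef>
#include <vector>

#include <benchmark/benchmark.h>

#include <userver/engine/async.hpp>
#include <userver/engine/run_standalone.hpp>
#include <userver/engine/sleep.hpp>
#include <userver/engine/task/task_with_result.hpp>

// Measures the cost of one engine::Yield while other coroutines also yield
// in a loop, so every iteration round-trips through the task queue.
void yield_under_contention(benchmark::State& state) {
  const auto worker_threads = static_cast<std::size_t>(state.range(0));
  engine::RunStandalone(worker_threads, [&] {
    std::atomic<bool> keep_running{true};
    std::vector<engine::TaskWithResult<void>> background;
    // Keep all but one worker thread busy with yielding tasks.
    for (std::size_t i = 1; i < worker_threads; ++i) {
      background.push_back(engine::AsyncNoSpan([&keep_running] {
        while (keep_running.load()) engine::Yield();
      }));
    }
    for (auto _ : state) engine::Yield();  // the measured yields
    keep_running = false;
    for (auto& task : background) task.Get();
  });
}
BENCHMARK(yield_under_contention)->Arg(4)->Arg(8);
```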

FAQ

What benefits will this project bring to the contributor?

You'll get a nice line for your résumé:

  • The project will demonstrate your competence with the C++ language, the C++ memory model (atomics), concurrency, and asynchronous systems
  • The project has some room for research: different approaches, algorithms, papers, and libraries could be compared

How much time will the project require?

Something on the order of 1-2 person-months of work.

Can new external dependencies (libraries) be used?

Only tiny specialized libraries like moodycamel::ConcurrentQueue.

libcds, for example, is not an option.

Can TaskContext be used intrusively for a node-based concurrent queue?

Yes, but it will not be compatible with all reclamation schemes. A TaskContext can be popped from the queue, executed, and immediately pushed into the same queue again. If the queue still needs the TaskContext after it has been popped, this is not possible, because the intrusive hook is still occupied from the node's previous stay in the queue. One queue that does not hold onto popped nodes is IntrusiveMpscQueue (as it is called in userver). The constraint is illustrated below.
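For illustration, a minimal sketch of this constraint; the type and field names below are hypothetical, not userver's actual ones.

```cpp
#include <atomic>

// Hypothetical intrusive node: the coroutine itself carries the queue link.
struct TaskContextSketch {
  std::atomic<TaskContextSketch*> queue_next{nullptr};  // the single intrusive hook
  // ... coroutine stack, state, etc. ...
};

// Timeline that must be supported:
//   worker A: ctx = queue.Pop();   // the queue might still reference the node
//   worker A: Run(ctx);            // the task yields again almost immediately
//   worker A: queue.Push(ctx);     // reuses the same queue_next hook
//
// If the queue (for example, one relying on hazard pointers or another
// deferred-reclamation scheme) can still touch ctx->queue_next after Pop(),
// the immediate Push() would corrupt it. A queue like userver's
// IntrusiveMpscQueue releases the node completely on Pop(), so reusing the
// hook right away is safe.
```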

Anton3 · Oct 09 '23 10:10

@egor-bystepdev and I will take on this task

evlampiy-lavrentiev · Nov 12 '23 19:11