cppcoro icon indicating copy to clipboard operation
cppcoro copied to clipboard

spurious wake-ups in static_thread_pool cause deadlock

Open saurik opened this issue 5 years ago • 3 comments

In static_thread_pool.cpp, there is a comment about how "the auto_reset_event used to wake up this thread may already be in the 'set' state" and a defense that this "won't really hurt".

https://github.com/lewissbaker/cppcoro/blob/18e02bd0cd09162aa71956d85377454d3a33e8f5/lib/static_thread_pool.cpp#L469-L473

However, what this means is that m_sleepingThreadCount won't get updated correctly: for a thread pool with a single thread (an easy test), m_sleepingThreadCount should be 1 as we enter localState.sleep_until_woken and 0 when we exit it, but if the auto_reset_event used to wake up the thread is already in the set state, we will exit the wait with m_sleepingThreadCount still equal to 1.

m_isSleeping is also true, which is awkward, but that will resolve itself (as in, will become the right value) when we next go to sleep... but that will also bump m_sleepingThreadCount up to 2 :(. When this thread next gets woken up, m_isSleeping will get set to false; but the decrement of m_sleepingThreadCount will only bring it back to 1: so we are now falsely counting a sleeping thread.

At this point, if another thread attempts to schedule into this thread pool while this thread is running, wake_one_thread will see m_sleepingThreadCount as 1. It will decrement that to 0, and think it is in charge of waking up a thread. However, there are no threads to wake up, as the m_isSleeping of all threads is false. This thread will then spin loop inside of wake_one_thread until we go back to sleep.

This is at least sort of OK? That spin loop might be going on for arbitrarily long (which sucks), but at least this will decrement m_sleepingThreadCount back to 0, and the situation will resolve itself... BUT, if the thread attempting to schedule itself to this pool is the current thread (which is normally an idempotent operation), it will livelock in that spin loop: the current thread will never sleep.

Put simply, if the auto_reset_event used to wake up a thread is already in the 'set' state, leaving it in that state actually hurts quite a bit ;P, and is leading to lots of wasted CPU in spinloops and even livelocks. I can add checks in my code "don't re-schedule to the current thread" to work around the latter, but that won't help with the former: I'd argue the tracking variables should always just be correct.

(I have no clue what an obvious fix is for this issue, and so will not attempt to propose one... the solution I would have come up with would have almost certainly been much slower than your library, as I have not yet attempted to learn how to do lock-free concurrency ;P.)

saurik avatar May 28 '19 04:05 saurik

I can add checks in my code "don't re-schedule to the current thread" to work around the latter...

OK, this actually doesn't help :(. If a different thread attempts to schedule itself and ends up stuck spinning for the thread-that-isn't-asleep-but-is-claiming-to-be-asleep to go to sleep (which was a state it wasn't supposed to have ever entered: the operation in question wasn't supposed to block, making what I'm about to describe safe), and while it is spinning something in the running thread from the thread pool in turn blocks back on it (which again, should have been safe, as the operations the other thread is doing should not have blocked: if m_sleepingThreadCount were 0, it would return immediately, and it should only be >1 if there is a sleeping thread for it to wake up) then the thread pool is now in a dead lock and the other thread that wanted to schedule something is stuck in the livelock.

saurik avatar May 28 '19 04:05 saurik

Thanks for the detailed analysis.

So to summarise the problem occurs in the following situation:

  • Thread, T1, runs out of work and calls notify_intent_to_sleep() which sets m_isSleeping = true and then increments m_sleepingThreadCount.
  • Then some other thread, T2, enqueues some work to the queue and tries to wake up T1, decrementing m_sleepingThreadCount, setting T1's mi_isSleeping flag to false and setting the auto-reset event to 'set'.
  • Then thread T1 picks up the queued work in call to tryGetRemote().
  • Then thread T1 calls try_clear_intent_to_sleep() which fails to clear the 'is-sleeping' flag since it's already cleared by T2.
  • Then T1 continues, leaving the auto-reset event in the 'set' state.
  • Later T1 runs out of work again and calls notify_intent_to_sleep() which sets m_isSleeping = true and increments m_sleepingThreadCount.
  • It then finds no more work to do and calls sleep_until_woken().
  • However, as the auto-reset event was already in the 'set' state this completes immediately and T1 goes around the loop again, finds there is still nothing to do.
  • T1 calls notify_intent_to_sleep() again which increments m_sleepingThreadCount and sets m_isSleeping to true before then calling sleep_until_woken() which, now that the auto-reset event is not set will actually put the thread to sleep.
  • However, now m_sleepingThreadCount has now been incremented twice, which will cause problems for the next thread to try to schedule some work to the pool as there is only one thread that has m_isSleeping set to true. This could lead to unnecessary spinning or possibly livelock in wake_one_thread().

I can see two possible solutions here: The first is to conditionally increment the m_sleepingThreadCount only if m_isSleeping flag was previously false. i.e. change the m_isSleeping.store(true, std::memory_order_relaxed) to m_isSleeping.exchange(true, std::memory_order_relaxed) and only increment m_sleepingThreadCount if the previous value was false.

This has the downside of incurring an extra atomic-exchange operation to set the m_isSleeping flag so might have a small impact on performance. It's only called when the thread is going to sleep anyway so might not be noticeable.

This also has the effect of having an active thread look like it's asleep. This can mean that another idle thread might not be woken up next time an item is queued as the thread that enqueued a task might "wake up" the thread that is already awake.

The second is to have the try_clear_intent_to_sleep() function detect the case where some other thread has set its m_isSleeping flag to true and return false to indicate that we couldn't clear the flag and in this case actually wait on the auto-reset event to make sure that its state is set back to 'not set' before continuing. eg. change the calling code to:

if (!try_clear_intent_to_sleep(threadIndex))
{
  // Some other thread acquired responsibility for waking us up.
  // Wait until they finish doing that so that we can make sure the
  // auto-reset event is reset back to 'not set' _after_ the other thread
  // has called 'set'.
  localState.sleep_until_woken();
}
goto normal_processing;

This would have the downside that it might introduce an extra context switch and OS call in the (hopefully rare) case where another thread that is waking up this thread is context-switched out between setting m_isSleeping to false and signalling the auto-reset event. The current thread will then go to sleep until the other thread is rescheduled by the OS and signals the event. If this race is indeed rare then this may also not be an issue.

lewissbaker avatar May 28 '19 18:05 lewissbaker

So, I ended up not having to care about this bug, because I simply ended up writing my own scheduler (and in fact later ended up writing my own task, and kind of have my usage of cppcoro only around now to provide me some support for generators)... but I have friends who want to be able to start using all of these fun features of C++ co_await, and I'm thereby at a loss as to what to tell them... they all know my stories about this static_thread_pool issue--which, as far as I can tell, was never worked on--and so cppcoro can't really be trusted under any kind of heavy load :(. Is there a different library I should be referring people to in 2021? (It is extra-disappointing, as I just hoping to use multi_producer_sequencer for something I was writing... but I see that that for some reason cares about the scheduler, and all the examples use static_thread_pool--which I know doesn't work--and so I'm guessing I'll need to write my own mechanism.)

saurik avatar Apr 02 '21 05:04 saurik