Threadpool
Deadlock due to lack of cross thread signalling
Consider a series of enqueues to a thread pool with two threads: func1(), stalled_func(), func2(), then, after waiting a second, func3(). Execution of func3() will unstall stalled_func().
Since enqueues are round-robin, func1() and func2() are enqueued on Thread 1, and stalled_func() and func3() on Thread 2.
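To make the assignment concrete, here is a small sketch of round-robin distribution. The helper name assign_round_robin is hypothetical and only models the enqueue order described above; it is not the pool's actual code.

```python
from itertools import cycle


def assign_round_robin(tasks, num_threads):
    """Distribute task names over per-thread queues in enqueue order."""
    queues = [[] for _ in range(num_threads)]
    workers = cycle(range(num_threads))  # 0, 1, 0, 1, ...
    for task in tasks:
        queues[next(workers)].append(task)
    return queues


# Same enqueue order as in the scenario above.
queues = assign_round_robin(["func1", "stalled_func", "func2", "func3"], 2)
for index, tasks in enumerate(queues, start=1):
    print(f"Thread {index}: {tasks}")
```

The task that unstalls stalled_func() lands behind it on the same queue, which is what sets up the deadlock.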
After func2(), Thread 1 goes to sleep in sem.acquire_many() waiting to be signalled, since _in_flight == 0. After a second, when func3() is enqueued, it is pushed to Thread 2's queue and Thread 2's semaphore is signalled, but Thread 2 cannot execute func3() since it is in the middle of executing stalled_func(). Thread 1 will continue to wait without stealing Thread 2's pending work.
This will cause the thread-pool to deadlock.
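A rough reproduction of the scenario in Python, as a sketch under stated assumptions rather than the pool's real implementation. RoundRobinPool and reproduce are hypothetical names; the blocking queue.Queue.get() stands in for the semaphore wait, and stalled_func() uses a bounded wait so that the demo terminates instead of hanging forever.

```python
import queue
import threading
import time


class RoundRobinPool:
    """Toy pool: one private queue per worker, round-robin enqueue, no stealing."""

    def __init__(self, num_threads):
        self._queues = [queue.Queue() for _ in range(num_threads)]
        self._next = 0
        for q in self._queues:
            threading.Thread(target=self._run, args=(q,), daemon=True).start()

    def enqueue(self, fn):
        self._queues[self._next].put(fn)
        self._next = (self._next + 1) % len(self._queues)

    def _run(self, q):
        while True:
            fn = q.get()  # blocks while idle; never looks at other queues
            fn()


def reproduce(stall_timeout=0.5):
    """Return True if func3 unstalled stalled_func, False if the stall timed out."""
    unstall = threading.Event()
    finished = threading.Event()
    outcome = {}

    def func1():
        pass

    def func2():
        pass

    def stalled_func():
        # Assumption: a bounded wait replaces the indefinite stall.
        outcome["unstalled"] = unstall.wait(stall_timeout)
        finished.set()

    def func3():
        unstall.set()

    pool = RoundRobinPool(2)
    pool.enqueue(func1)         # Thread 1
    pool.enqueue(stalled_func)  # Thread 2
    pool.enqueue(func2)         # Thread 1
    time.sleep(0.1)             # stands in for "wait a second"
    pool.enqueue(func3)         # Thread 2, queued behind stalled_func
    finished.wait()
    return outcome["unstalled"]


print("stalled_func unstalled by func3:", reproduce())
```

In this sketch func3() only runs after stalled_func() gives up, because the idle first worker never picks up the second worker's pending task. Without the timeout, the stall would never end.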
If this behaviour ("Execution of func3() will unstall stalled_func()") is not supported, then that's fair. It's a reasonable limitation for a thread pool. This behaviour arises when executing a DAG of dynamically connected tasks (e.g. reading assets from disk), where a task node can continue execution only once all its parent nodes have finished execution.
I'm opening an issue for posterity if/when anyone wants to handle this case too.