
Deadlock due to lack of cross thread signalling

Open Squadrick opened this issue 1 year ago • 2 comments

Consider a series of enqueues to a thread pool with two threads: func1(), stalled_func(), func2(), then a one-second wait, then func3(). Execution of func3() will unstall stalled_func().

Since enqueues are round-robin, func1() and func2() are enqueued on Thread 1, and stalled_func() and func3() on Thread 2.
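To make the assignment concrete, here is a minimal sketch of strict round-robin distribution. The helper name assign_round_robin and the string task labels are hypothetical and only illustrate the mapping; this is not the library's code.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: returns, for each thread, the task names it receives
// when tasks are handed out strictly round-robin in enqueue order.
std::vector<std::vector<std::string>> assign_round_robin(
    const std::vector<std::string>& tasks, std::size_t thread_count) {
    std::vector<std::vector<std::string>> queues(thread_count);
    for (std::size_t i = 0; i < tasks.size(); ++i) {
        // Task i goes to queue i mod thread_count.
        queues[i % thread_count].push_back(tasks[i]);
    }
    return queues;
}
```

With the four tasks above and two threads, index 0 corresponds to Thread 1 and index 1 to Thread 2.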

After func2(), Thread 1 goes to sleep in sem.acquire_many(), waiting to be signalled, since _in_flight == 0. A second later, func3() is enqueued: it is pushed to Thread 2's queue and Thread 2's semaphore is signalled, but Thread 2 cannot execute func3() because it is in the middle of executing stalled_func(). Thread 1 continues to wait and never steals Thread 2's pending work.

This causes the thread pool to deadlock.
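The scenario can be modelled with a toy pool. This is a sketch under stated assumptions, not the library's implementation: ToyPool and stalled_task_completes are hypothetical names, a single mutex and condition variable stand in for the per-thread semaphores, and stalled_func gives up after 500 ms so that the stall is observable instead of hanging forever.

```cpp
#include <chrono>
#include <condition_variable>
#include <cstddef>
#include <deque>
#include <functional>
#include <future>
#include <mutex>
#include <thread>
#include <utility>
#include <vector>

// Toy pool (hypothetical): one queue per worker, round-robin enqueue,
// and optional stealing from the other workers' queues.
class ToyPool {
public:
    ToyPool(std::size_t n, bool steal) : queues_(n), steal_(steal) {
        for (std::size_t i = 0; i < n; ++i)
            workers_.emplace_back([this, i] { run(i); });
    }
    ~ToyPool() {
        {
            std::lock_guard<std::mutex> lock(m_);
            done_ = true;
        }
        cv_.notify_all();
        for (auto& w : workers_) w.join();
    }
    void enqueue(std::function<void()> task) {
        {
            std::lock_guard<std::mutex> lock(m_);
            queues_[next_++ % queues_.size()].push_back(std::move(task));
        }
        cv_.notify_all();
    }

private:
    // Caller holds m_. Tries the worker's own queue, then (if allowed) the others.
    bool pop(std::size_t id, std::function<void()>& out) {
        const std::size_t reach = steal_ ? queues_.size() : 1;
        for (std::size_t k = 0; k < reach; ++k) {
            auto& q = queues_[(id + k) % queues_.size()];
            if (q.empty()) continue;
            out = std::move(q.front());
            q.pop_front();
            return true;
        }
        return false;
    }
    void run(std::size_t id) {
        for (;;) {
            std::function<void()> task;
            {
                std::unique_lock<std::mutex> lock(m_);
                cv_.wait(lock, [&] { return done_ || pop(id, task); });
                if (!task) return;  // shutting down, nothing was popped
            }
            task();
        }
    }

    std::mutex m_;
    std::condition_variable cv_;
    std::vector<std::deque<std::function<void()>>> queues_;
    bool steal_;
    std::size_t next_ = 0;
    bool done_ = false;
    std::vector<std::thread> workers_;  // declared last: threads use the members above
};

// Returns true if func3 released stalled_func before the 500 ms timeout.
bool stalled_task_completes(bool steal) {
    std::promise<void> release;  // fulfilled by func3
    std::shared_future<void> released = release.get_future().share();
    std::promise<bool> result;   // set by stalled_func
    std::future<bool> outcome = result.get_future();
    ToyPool pool(2, steal);      // destroyed first, so tasks never outlive the promises
    pool.enqueue([] {});         // func1 -> queue 0
    pool.enqueue([&] {           // stalled_func -> queue 1
        result.set_value(released.wait_for(std::chrono::milliseconds(500)) ==
                         std::future_status::ready);
    });
    pool.enqueue([] {});         // func2 -> queue 0
    std::this_thread::sleep_for(std::chrono::milliseconds(100));
    pool.enqueue([&] { release.set_value(); });  // func3 -> queue 1
    return outcome.get();
}
```

Without stealing, func3 sits behind stalled_func in the second queue while the first worker idles, so the stalled task only ends through its timeout. With stealing enabled, whichever worker is idle picks up func3 and the stalled task is released.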


If this behaviour is not supported:

Execution of func3() will unstall stalled_func().

then that's fair; it's a reasonable limitation for a thread pool. This behaviour arises when executing a DAG of dynamically connected tasks (e.g. reading assets from disk), where a task node can continue execution only once all of its parent nodes have finished.
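For reference, the dependency rule itself can be sketched with a per-node count of unfinished parents. This is a single-threaded illustration only; ready_order and the children adjacency-list representation are hypothetical and not part of the library.

```cpp
#include <cstddef>
#include <queue>
#include <vector>

// children[i] lists the nodes that depend on node i (hypothetical representation).
// Returns an order in which every node runs only after all of its parents.
std::vector<std::size_t> ready_order(
    const std::vector<std::vector<std::size_t>>& children) {
    // pending[c] = number of parents of c that have not finished yet.
    std::vector<std::size_t> pending(children.size(), 0);
    for (const auto& kids : children)
        for (std::size_t c : kids) ++pending[c];

    std::queue<std::size_t> ready;
    for (std::size_t i = 0; i < pending.size(); ++i)
        if (pending[i] == 0) ready.push(i);

    std::vector<std::size_t> order;
    while (!ready.empty()) {
        const std::size_t n = ready.front();
        ready.pop();
        order.push_back(n);  // "execute" node n
        for (std::size_t c : children[n])
            if (--pending[c] == 0) ready.push(c);  // last parent finished
    }
    return order;
}
```

Here a node becomes runnable only when its last parent finishes, which is the condition a stalled task in the pool is effectively waiting on.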

I'm opening this issue for posterity, in case anyone wants to handle this case too.

Squadrick, Feb 02 '23 11:02