Thread pools and WebAssembly
Two commits:
Add num_threads_started() to thread_pool
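When running in the browser, creating threads requires returning control to the browser. Creating a thread (constructing a std::thread, written std::thread::create() here for short) and then waiting for it, even forever, does nothing: the thread never starts. So the required procedure is this: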
- Call from JavaScript into native code (where dlib runs compiled into WebAssembly)
- Native code calls std::thread::create()
- Return to JavaScript
- Set a timer with a callback in JavaScript (just busy-waiting in JavaScript doesn't do anything either)
- In the callback, call into native code and check if the thread has started
- Do the work
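
The native side of this procedure could look roughly like the sketch below. It is only an illustration: the namespace and the functions create_worker(), has_worker_started() and do_work() are hypothetical names, not dlib or Emscripten API, and the JavaScript timer that would call has_worker_started() is not shown.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <thread>

// All names here are hypothetical; none of this is dlib or Emscripten API.
namespace start_sketch {

std::atomic<bool> worker_started{false};
std::atomic<bool> work_requested{false};
std::atomic<int> result{0};
std::thread worker;

// Called from JavaScript: create the thread and return right away,
// without waiting for it to start.
void create_worker() {
    worker = std::thread([] {
        worker_started.store(true);
        // Idle until the main thread asks for the work to be done.
        while (!work_requested.load())
            std::this_thread::sleep_for(std::chrono::milliseconds(1));
        result.store(6 * 7);  // placeholder for the real work
    });
}

// Non-blocking check, meant to be called from the JavaScript timer callback.
bool has_worker_started() { return worker_started.load(); }

// Only call this once has_worker_started() has returned true.
int do_work() {
    assert(worker.joinable());
    work_requested.store(true);
    worker.join();
    return result.load();
}

}  // namespace start_sketch
```

The point of the split is that create_worker() and has_worker_started() never block, so control can go back to the browser between the two calls.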
Emscripten (emcc) can optionally and transparently create a thread pool for you before the native code runs (which std::thread::create() then pulls from), but the size of the pool must be determined at compile time, and it litters the user's browser with unused threads. Also, the browser limits the number of threads that can be created (typically (n_cpu / 2) + 4), and you don't know beforehand how many CPU cores the user has, so choosing a pool size is tricky.
Luckily worker_thread_ids.size() gives us exactly the info we need to be able to handle this ourselves, so we add num_threads_started() that exposes this information. Now in the timer-based callback into native code we can check if num_threads_started() == num_threads_in_pool() and we are good to go.
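
To illustrate the idea, here is a minimal sketch of a pool that exposes such a counter. toy_pool and pool_ready() are hypothetical stand-ins, not dlib's thread_pool; only the names num_threads_started(), num_threads_in_pool() and worker_thread_ids come from the description above, and how dlib actually fills worker_thread_ids is assumed here.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

// toy_pool is a hypothetical stand-in, not dlib's thread_pool.
class toy_pool {
public:
    explicit toy_pool(std::size_t n) : num_threads_(n) {
        threads_.reserve(n);
        for (std::size_t i = 0; i < n; ++i)
            threads_.emplace_back([this] { run(); });
    }

    ~toy_pool() {
        {
            std::lock_guard<std::mutex> lock(m_);
            stopping_ = true;
        }
        cv_.notify_all();
        for (auto& t : threads_) t.join();
    }

    std::size_t num_threads_in_pool() const { return num_threads_; }

    // Number of workers that have actually begun running.
    std::size_t num_threads_started() const {
        std::lock_guard<std::mutex> lock(m_);
        return worker_thread_ids_.size();
    }

private:
    void run() {
        std::unique_lock<std::mutex> lock(m_);
        // Assumption: each worker registers its id as soon as it starts.
        worker_thread_ids_.push_back(std::this_thread::get_id());
        cv_.wait(lock, [this] { return stopping_; });
    }

    const std::size_t num_threads_;
    mutable std::mutex m_;
    std::condition_variable cv_;
    bool stopping_ = false;
    std::vector<std::thread::id> worker_thread_ids_;
    std::vector<std::thread> threads_;
};

// Non-blocking readiness check for the timer-based callback.
bool pool_ready(const toy_pool& pool) {
    assert(pool.num_threads_started() <= pool.num_threads_in_pool());
    return pool.num_threads_started() == pool.num_threads_in_pool();
}
```

In the browser, the timer callback would call something like pool_ready() and re-arm the timer until it returns true.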
Move thread creation and joining from thread_pool internals to an external thread_pool_thread_manager
As you probably understand from the above, having multiple libraries inside your WebAssembly code managing their own thread pools can be problematic. So when the previous commit is not sufficient, this commit allows us to manage our own thread pools entirely, while still being able to use multithreading with calls like find_max_global.
We do this by moving std::thread::create() and std::thread::join() out of the thread_pool internals. Subclassing thread_pool_thread_manager lets us run thread_pool::thread() as the payload of threads from our own external thread pools, where creating a thread semantically becomes acquiring it and joining it becomes releasing it. All responsibility for these things moves away from dlib.
I've tried implementing this in a couple of different ways but ultimately settled on using a pointer to this class (default nullptr, in which case thread_pool creates and releases its own). IMHO it's less pretty than using callbacks, object references, or constructed objects as default parameters, but holding and managing std::thread references is tricky in callbacks, using a global default object doesn't seem ideal (and it would have to be implemented better than this default implementation), and constructing an object inside the default parameter leads to several unnecessary allocations.
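
As a rough illustration of the shape of this design (not the actual code in the commit), the sketch below uses a manager interface with acquire/release semantics and a pool that takes a manager pointer defaulting to nullptr. Every name and signature in it (thread_manager_sketch, simple_manager, pool_sketch, acquire(), release_all()) is an assumption made for the example; the real thread_pool_thread_manager interface may look different.

```cpp
#include <atomic>
#include <cassert>
#include <functional>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical interface; the real thread_pool_thread_manager may differ.
class thread_manager_sketch {
public:
    virtual ~thread_manager_sketch() = default;
    // "Creating" a thread becomes acquiring one that runs the payload.
    virtual void acquire(std::function<void()> payload) = 0;
    // "Joining" becomes releasing every thread handed out so far.
    virtual void release_all() = 0;
};

// Simplest possible manager: really creates and joins std::threads.
// An external pool would hand out pre-started threads here instead.
class simple_manager final : public thread_manager_sketch {
public:
    ~simple_manager() override { release_all(); }

    void acquire(std::function<void()> payload) override {
        ++acquired;
        threads_.emplace_back(std::move(payload));
    }

    void release_all() override {
        for (auto& t : threads_) t.join();
        threads_.clear();
    }

    int acquired = 0;

private:
    std::vector<std::thread> threads_;
};

// Stand-in for thread_pool: a null manager means "manage your own threads".
class pool_sketch {
public:
    explicit pool_sketch(int n, thread_manager_sketch* manager = nullptr)
        : manager_(manager ? manager : &fallback_) {
        for (int i = 0; i < n; ++i)
            manager_->acquire([this] { thread(); });
    }

    ~pool_sketch() { manager_->release_all(); }

    int started() const { return started_.load(); }

private:
    // Placeholder for the worker loop that thread_pool::thread() would run.
    void thread() { ++started_; }

    std::atomic<int> started_{0};
    simple_manager fallback_;
    thread_manager_sketch* manager_;
};
```

With a caller-supplied manager, the pool never touches std::thread itself. Note that release_all() in this sketch releases every thread the manager handed out, so a manager shared between several pools would need finer-grained bookkeeping.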