parsec
Offload device task release to worker threads
Add a LIFO to the context for high-priority task activities. Worker threads pick these activities up. During GPU execution, worker threads are mostly idle, so they can spare cycles to handle the release of successor tasks, including any communication that release may trigger.
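To make the idea concrete, here is a minimal sketch of such a LIFO, not the code in this PR. All names (`context_t`, `activity_t`, `activity_push`, `activity_pop`, `worker_progress_activities`, `count_release`) are hypothetical and do not come from the PaRSEC API. The sketch uses a mutex-protected list for simplicity, and it assumes that the thread observing device task completion pushes an activity while workers poll the LIFO before selecting regular tasks.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical types: none of these names come from the PaRSEC API. */
typedef struct activity_s {
    struct activity_s *next;
    void (*run)(void *arg);  /* e.g. release the successors of a completed device task */
    void *arg;
} activity_t;

typedef struct {
    pthread_mutex_t lock;
    activity_t *top;         /* most recently pushed activity */
} context_t;

static void context_init(context_t *ctx) {
    pthread_mutex_init(&ctx->lock, NULL);
    ctx->top = NULL;
}

/* Assumed to be called by whichever thread observes the device task completion. */
static void activity_push(context_t *ctx, activity_t *a) {
    assert(a != NULL);
    pthread_mutex_lock(&ctx->lock);
    a->next = ctx->top;
    ctx->top = a;
    pthread_mutex_unlock(&ctx->lock);
}

/* Returns the most recently pushed activity, or NULL if the LIFO is empty. */
static activity_t *activity_pop(context_t *ctx) {
    pthread_mutex_lock(&ctx->lock);
    activity_t *a = ctx->top;
    if (a != NULL) {
        ctx->top = a->next;
    }
    pthread_mutex_unlock(&ctx->lock);
    return a;
}

/* A worker would call this before looking for a regular task.
 * Returns 1 if an activity was executed, 0 if the LIFO was empty. */
static int worker_progress_activities(context_t *ctx) {
    activity_t *a = activity_pop(ctx);
    if (a == NULL) {
        return 0;
    }
    a->run(a->arg);
    return 1;
}

/* Stand-in for the real release work: just counts invocations. */
static void count_release(void *arg) {
    (*(int *)arg)++;
}
```

In this sketch the worker runs at most one activity per call, so it returns quickly to regular scheduling when nothing is pending. A lock-free LIFO would likely be preferable in a real runtime; the mutex is only there to keep the example short.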
A similar mechanism could apply to incoming communication: offloading task release upon completion of a remote dep receive would relieve the communication thread.