Dispatcher.jl
Out-of-core executors
I'm interested in using this package as the underlying task execution framework for Dagger.jl. I think a smarter scheduler will be needed so that Dagger can remain out-of-core...
If I understand correctly, the default dispatch!(::Executor, ::DispatchContext) simply uses asyncmap to execute the nodes in the context. The node ordering is implicitly handled by calls to wait...
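To check my understanding of that behaviour, here is a minimal sketch in plain Julia. It does not use Dispatcher's actual types: `ToyNode` and `run_node!` are hypothetical stand-ins, and a blocking `fetch` on a `Channel` plays the role that `wait` plays in the real executor. The point is only that `asyncmap` can be handed the nodes in any order, and the blocking calls sort out the ordering.

```julia
# Hypothetical stand-in for a DAG node; NOT Dispatcher.jl's real node type.
struct ToyNode
    name::Symbol
    deps::Vector{ToyNode}
    done::Channel{Int}   # holds the node's result once it has run
end

ToyNode(name, deps=ToyNode[]) = ToyNode(name, deps, Channel{Int}(1))

function run_node!(node::ToyNode, log::Vector{Symbol})
    # fetch blocks until the dependency has published its result and
    # leaves the value in the channel for other consumers.
    inputs = Int[fetch(dep.done) for dep in node.deps]
    push!(log, node.name)
    put!(node.done, 1 + sum(inputs))
    return nothing
end

a = ToyNode(:a)
b = ToyNode(:b, [a])
c = ToyNode(:c, [a, b])

order = Symbol[]
# Nodes are deliberately submitted in reverse dependency order.
asyncmap(n -> run_node!(n, order), [c, b, a])

println(order)          # execution order recorded by the tasks
println(fetch(c.done))  # result of the final node
```

Nothing here decides which node should run next; every node gets a task and simply blocks until its inputs exist. That is fine in memory, but it gives the executor no say over locality or over when intermediate results can be dropped.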
It would be nice to have an Executor that executes nodes in a parallel depth-first manner, as dask and Dagger do. That gives better data locality and makes it possible to clean up intermediate data as soon as possible.
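A rough, serial sketch of the idea, assuming a toy DAG stored as a `Dict` of dependencies. The node names and the helpers `dfs_order` and `run_with_cleanup` are made up for illustration, and this is not how dask or Dagger actually implement their schedulers. Nodes are visited depth-first from the target, and each intermediate result is dropped as soon as its last consumer has run.

```julia
# Toy DAG: each key depends on the listed nodes. Names are illustrative only.
deps = Dict(
    :load1  => Symbol[],
    :load2  => Symbol[],
    :map1   => [:load1],
    :map2   => [:load2],
    :reduce => [:map1, :map2],
)

# Post-order depth-first traversal: finish one branch before starting the next.
function dfs_order(deps, target)
    order = Symbol[]
    seen = Set{Symbol}()
    function visit(n)
        n in seen && return
        push!(seen, n)
        foreach(visit, deps[n])
        push!(order, n)
    end
    visit(target)
    return order
end

# Simulate execution, releasing a result once all of its consumers have run.
function run_with_cleanup(deps, order)
    remaining = Dict(n => 0 for n in keys(deps))   # unfinished consumers per node
    for ds in values(deps), d in ds
        remaining[d] += 1
    end
    live = Set{Symbol}()   # results currently held in memory
    peak = 0
    for n in order
        push!(live, n)
        peak = max(peak, length(live))
        for d in deps[n]
            remaining[d] -= 1
            remaining[d] == 0 && delete!(live, d)
        end
    end
    return peak, live
end

order = dfs_order(deps, :reduce)
peak, live = run_with_cleanup(deps, order)
println(order)
println("peak live results: ", peak, ", left at the end: ", live)
```

A real executor would also have to run independent branches in parallel and take worker placement into account; the sketch only shows the ordering and the reference-counted cleanup.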
Unrelated, but I like the attention paid to documentation and error handling in this package. Great stuff! :)
Thanks!
The plan for executor improvements is to have one which uses the dask.distributed scheduling service to determine when and where to schedule jobs, as it appears to be very advanced and supports a lot of our wishlist features.
I'm not sure what you mean by "depth-first" scheduling in the context of a generic DAG. Are you talking about attaching a specific traversal to the available workers, or just preferring to execute direct descendant nodes when a worker running a node becomes available? The former could be added to the existing code without much trouble, but the latter would probably be best handled as part of the dask.distributed work.
See https://github.com/invenia/Dispatcher.jl/issues/11
In general I would love to sync up our packages! Dagger has the distributed-array stuff that Dispatcher doesn't have and Dispatcher has the generic delayed-computation interface that we needed for our system.
I'm not sure what you mean by "depth-first" scheduling in the context of a generic DAG.
Oh I think that name is confusing. I meant a simple dask-like scheduler indeed.
Yes, it would be really good for Dagger to use Dispatcher. One of the many benefits is being able to easily switch between different executors. It is conceivable that a threaded executor may be implemented once Julia has decent multi-threading support.
If you haven't come across it already, we identified a potentially problematic situation in Julia's asynchronous communication. It might show up when you start integrating dask.distributed, so it may be worth knowing about. The discussion is here: https://github.com/JuliaParallel/Dagger.jl/issues/53