Potential fix when master has compute-intensive work and must schedule workers
Fixes issue #206. Please see the issue description for an explanation of the fix.
@andreasnoack Any thoughts on this?
It's a good observation and a pretty simple, though not super pretty, fix.
I'm wondering whether, with the new multithreading, we can now just delegate all the scheduling to a separate task that won't block while the local work is being executed. I'd like to hear @vchuravy's thoughts.
Looking at the code, the pattern

    @sync for i in pids
        @async remotecall_fetch(do_work, i, ...)
    end

is common (and natural). So this may happen anywhere **do_work** is heavy. I guess adding yield() in the correct places would work...
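A minimal sketch of the yield() idea, using only local tasks so that it runs without extra worker processes. The function `schedule_all`, its `yield_first` flag, and the event strings are hypothetical stand-ins: a real remotecall_fetch to a remote worker would send a request and then wait for the reply, which is replaced here by simply recording that the request was sent.

```julia
# Hypothetical sketch: `schedule_all` is not part of Distributed.jl or
# DistributedArrays.jl. It mimics the @sync/@async pattern with local tasks.
function schedule_all(ids, local_id; yield_first::Bool)
    events = String[]
    @sync for i in ids
        @async begin
            if i == local_id
                # Optionally give the other tasks a chance to run first.
                yield_first && yield()
                push!(events, "local work start")
                acc = 0
                for k in 1:10^6      # compute-bound loop with no yield point
                    acc += k % 7
                end
                push!(events, "local work done")
            else
                # Stand-in for sending the request to worker `i`.
                push!(events, "request sent to $i")
            end
        end
    end
    return events
end

blocked = schedule_all([1, 2, 3], 1; yield_first = false)
relaxed = schedule_all([1, 2, 3], 1; yield_first = true)
```

Because tasks created with @async stay on the thread of the task that created them, the local work in `blocked` should run to completion before any request is recorded, while in `relaxed` the requests should be recorded first. This is an illustration of the scheduling order only, not a measurement.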
Alternatively, at construction of the DArray, by convention, place the entry with id == myid() last while preserving the invariant that pid[i] holds chunk i.
Cheers!
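The reordering convention suggested above could look roughly like this. The helper name `local_last` is hypothetical and not part of DistributedArrays.jl; the sketch only shows how a list of process ids might be rearranged so that the calling process comes last.

```julia
using Distributed  # standard library; provides myid()

# Hypothetical helper: move the calling process's id to the end of the list,
# keeping the relative order of all other ids unchanged.
function local_last(pids::AbstractVector{<:Integer}, local_id::Integer = myid())
    remote = filter(!=(local_id), pids)   # ids of the other processes
    own = filter(==(local_id), pids)      # empty if local_id is not in pids
    return vcat(remote, own)
end

ordered = local_last([1, 2, 3, 4], 1)
```

If chunk i is then assigned to `ordered[i]`, the invariant from the comment still holds, and a loop over `ordered` would issue all remote requests before the local work begins.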
I think we need to carefully go through Distributed.jl and look at whether we can start using @spawn instead of @async, and then do the same for DistributedArrays.jl.
That won't be easy, since a whole bunch of this code is based on cooperative tasking, and switching to parallelism will expose races.
I might be able to have a UROP look at this transition.
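To make the race concern concrete, here is a small sketch with hypothetical function names. The first function relies on cooperative tasking: the @async tasks share one thread and the increment contains no yield point, so the plain counter is safe. Replacing @async with @spawn in that function would allow lost updates when several threads are available, which is why the second function uses an atomic counter instead.

```julia
using Base.Threads: @spawn, Atomic, atomic_add!

# Cooperative version: safe only because the tasks run on a single thread
# and `counter[] += 1` cannot be interrupted by another task.
function count_cooperative(n::Integer)
    counter = Ref(0)
    @sync for _ in 1:n
        @async counter[] += 1
    end
    return counter[]
end

# Parallel version: tasks may run on different threads, so the shared
# counter is updated atomically.
function count_parallel(n::Integer)
    counter = Atomic{Int}(0)
    @sync for _ in 1:n
        @spawn atomic_add!(counter, 1)
    end
    return counter[]
end
```

Code written in the first style is correct under @async but would need an audit of every piece of shared state like this before moving to @spawn.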