Graphs.jl
Graphs.jl copied to clipboard
Proposal: Execution policy type for selecting parallel computation policy
Parallel computation is currently in a messy state. The functions that we have are distributed between the Parallel and the Experimental submodules but we do not export any symbols yet; that means that they are considered internal and are not subject to semantic versioning. But that probably also means that barely any one is using them.
Some algorithms currently allow the user to choose between threaded (tasks distributed between multiple threads on the same machine) and distributed computation (potentially multiple multiple machines). We don't have any mechanism to select how many threads/cores are used and which ones we want to use. This is currently done with the keyword argument parallel=:threads or parallel=:distributed.
What I would like to propose is that we create a type to select what kind of execution policy we want. This means something like:
# Name is inspired from C++, but we can also use something like Execution or Parallel
abstract type ExecutionPolicy end
# No parallel computation
struct Singular <: ExecutionPolicy end
# Using threads
struct Threads <: ExecutionPolicy end
# Using the Distributed module
struct Distributed <: Distributed end
# The algorithm should select the best policy
struct Auto <: ExecutionPolicy end
With this
- We can dispatch on the policy - or a union of a subset of the policies.
- We can add further information like the number of cores or the core ids later
- We have an unified mechanism that would select the best policy for a certain algorithm - not only the parallel ones but also some of the sequential ones.
- Other packages can define even more policies - for example a a
GPUpolicy if someone implements graph algorithms on a GPU.
Furthermore we should add
- Some helper functions to select the best policy according to some constraints
- Useful warnings if for example someone selects the
Threadspolicy but there is only one thread available.
I would like to hear peoples feedback on this. Also perhaps someone knows another Julia package that implements something similar?
Thanks for writing this up. I am strongly in favor.
-
One minor UX consistency nit that might need to be spelled out in more detail. What would be the rule for where to place the dispatch argument. Always at the end? -- ugly because different functions take different number of arguments so some consistency is lost; Always first? -- ugly because it distracts from the domain-specific interesting arguments of the function. This is not a big problem, but probably should be added to the style guide.
-
How does this mesh with having multiple algorithms? E.g. say there is some function
calculate_property(::Graph, ::AbstractAlgorithm, ::ExecutionPolicy)whereAlg1 <: AbstractAlgorithmworks with all execution policies, andAl2 <: AbstractAlgorithmworks only withSingular.
- same question as 1 about where to place the
::AbstractAlgorithmin the argument order and having that philosophy documented in the docs and in the style guide. - how do we automate having nice error messages without having to make dummy methods that just raise errors (some smart error hint that filters on being triggered by ::AbstractAlgorithm and ::ExecutionPolicy in
__init__maybe?)
- May we rename it to
AbstractExecutionPolicy?
Thanks for writing this up. I am strongly in favor.
1. One minor UX consistency nit that might need to be spelled out in more detail. What would be the rule for where to place the dispatch argument. Always at the end? -- ugly because different functions take different number of arguments so some consistency is lost; Always first? -- ugly because it distracts from the domain-specific interesting arguments of the function. This is not a big problem, but probably should be added to the style guide. 2. How does this mesh with having multiple algorithms? E.g. say there is some function `calculate_property(::Graph, ::AbstractAlgorithm, ::ExecutionPolicy)` where `Alg1 <: AbstractAlgorithm` works with all execution policies, and `Al2 <: AbstractAlgorithm` works only with `Singular`. * same question as 1 about where to place the `::AbstractAlgorithm` in the argument order and having that philosophy documented in the docs and in the style guide. * how do we automate having nice error messages without having to make dummy methods that just raise errors (some smart error hint that filters on being triggered by ::AbstractAlgorithm and ::ExecutionPolicy in `__init__` maybe?) 3. May we rename it to `AbstractExecutionPolicy`?
- Right I did not think that completely true - I think the execution policy should probably be a keyword argument - but these do not work with multiple dispatch - so perhaps that idea is out or we have
alg(...; execution policy=Auto) = alg(standard_policy(auto), ...)
standard_policy(policy) = Singular
alg(Singular, ...) =
alg(Threads, ...) =
Tbh this already looks quite convoluted to me. So perhaps multiple dispatch is not a good argument here for now.
- We don't have
AbstractAlgorithmanywhere in our code right now as far as I know? The idea was floating around for a while I think. And yes, this might be another argument against dispatching on the policy - so perhaps for now it should should just be a kwargs argument and some if clauses.
We could have some traits such as
supports_policy(::AbstractAlgorithm, ::ExecutionPolicy)::Bool
but again that makes things complicated and from my experience too much abstraction here usually leads nowhere - so perhaps this is something for another day.
- Are you talking about
Base.Experimental.register_error_hint? I have played around with that yet - but it is probably worth having a look at it also for other cases.MethodErroris often confusing - especially if it happens deep within a stack trace.
But we can also create helper functions like
raise_if_unsupported_policy(::policy, singular::Bool, threads::Bool, distributed::Bool)
for example.
- I don't like having long names all the time - so perhaps even
Executionmight be a better name (but that is not a strong opinion- there are also case like that in Julia Base, for exampleBase.Order.Ordering` is an abstract type.
Yup, I was referring to register_error_hint. I am using it in a few other packages and I am happy with the results. I think using that can remove some of the complexity you alluded to.
I concur about long names.
This sounds good to me. Missing methods for some algorithms/execution policy combinations is fine as long as there is a good error message.