Dask Integration [FEATURE]

Open franciscblubaugh opened this issue 3 years ago • 8 comments

I recently came across this library in a technical talk. I frequently use the Dask parallel processing engine to scale my work across multiple machines. Are there any plans to extend the multiprocessing support to leverage something like Dask or MPI for cluster-based optimization?

franciscblubaugh, Jun 16 '22 14:06

Hi @franciscblubaugh,

We are successfully using Dask with Pygmo2 in our project Pyxel (https://gitlab.com/esa/pyxel and https://esa.gitlab.io/pyxel/). It works well both on a single computer and on a grid of computers (84 cores).

We have developed our own user-defined BFE (Batch Fitness Evaluator) and user-defined Island using Dask.

Our project has been open source (MIT License) since January; you can find these user-defined BFE and Island implementations here: https://gitlab.com/esa/pyxel/-/blob/master/pyxel/calibration/user_defined.py

This code could, and probably should, be integrated into Pygmo2.

What do the Pygmo contributors think?

flemmel, Oct 01 '22 07:10

> What do the Pygmo contributors think?

We would certainly welcome PRs in this sense :)

bluescarni, Oct 03 '22 12:10

Nice!

I will create a pull request!

flemmel, Oct 03 '22 16:10

I am successfully using Pygmo2 on my local PC (single machine) and am very satisfied with the parallel optimization performance. As part of my master’s thesis, I intend to run parallel optimizations with Pygmo2 on the university's HPC cluster (multiple machines). I have explored the Dask integration / extension for Pygmo2 as implemented in Pyxel. I have a general understanding of the process, but I still have various difficulties implementing it in my own code.

@flemmel and @bluescarni

  • An official Dask integration in Pygmo2 would be highly desirable. Is it still planned?
  • Do you have basic Dask Pygmo2 integration / extension examples other than Pyxel.py itself?

@bluescarni I am a beginner at parallelizing Python code. Based on the description of Pygmo2's capabilities, I assumed that the library already runs natively on HPCs (multiple machines).

  • Are there approaches with less overhead than Dask to execute Pygmo2 on HPCs?
  • How do you work with Pygmo2 on an HPC? Any simple examples are appreciated.

Thanks for your help!

IvoSteiner, Dec 02 '23 07:12

> An official Dask integration in Pygmo2 would be highly desirable. Is it still planned?

No concrete plans at the moment.

> Do you have basic Dask Pygmo2 integration / extension examples other than Pyxel.py itself?

Dask integration would mean implementing a user-defined island that distributes the evolutions via Dask. We have several user-defined islands implemented in pygmo already:

https://github.com/esa/pygmo2/blob/master/pygmo/_py_islands.py

See also the island documentation for information on the API that a user-defined island needs to implement:

https://esa.github.io/pygmo2/island.html

> Are there approaches with less overhead than Dask to execute Pygmo2 on HPCs?
> How do you work with Pygmo2 on an HPC? Any simple examples are appreciated.

We have an ipyparallel island which can be used on HPC setups:

https://esa.github.io/pygmo2/islands.html#pygmo.ipyparallel_island

However, we don't have much experience or user feedback regarding HPC deployments...

bluescarni, Dec 04 '23 14:12

Thank you for the prompt response. I will take a closer look at the concepts you mentioned and will reach out again if I gain any new insights regarding HPC deployment. Unfortunately, however, it is no longer the highest priority in my thesis.

IvoSteiner, Dec 06 '23 14:12