ENH: Parallel mode for monte-carlo simulations
This pull request adds the option to run simulations in parallel to the MonteCarlo class. The feature uses a context manager named MonteCarloManager to centralize all workers and shared objects, ensuring proper termination of the sub-processes.
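As a rough illustration of the pattern described above (not the PR's actual `MonteCarloManager`), a context manager can own the worker processes and shared queues and guarantee cleanup on exit. Every name here (`worker_pool`, `_square_worker`, `run_squares`) is hypothetical, and squaring integers stands in for running a simulation.

```python
import multiprocessing as mp
from contextlib import contextmanager


def _square_worker(task_queue, result_queue):
    """Consume integers until a None sentinel arrives; emit their squares."""
    while True:
        item = task_queue.get()
        if item is None:
            break
        result_queue.put(item * item)


@contextmanager
def worker_pool(n_workers):
    """Hypothetical manager: starts workers, always shuts them down on exit."""
    # Assumption for this sketch: use fork where it exists, spawn elsewhere.
    method = "fork" if "fork" in mp.get_all_start_methods() else "spawn"
    ctx = mp.get_context(method)
    tasks, results = ctx.Queue(), ctx.Queue()
    workers = [
        ctx.Process(target=_square_worker, args=(tasks, results))
        for _ in range(n_workers)
    ]
    for worker in workers:
        worker.start()
    try:
        yield tasks, results
    finally:
        # One sentinel per worker, then join; terminate anything still alive.
        for _ in workers:
            tasks.put(None)
        for worker in workers:
            worker.join(timeout=5)
            if worker.is_alive():
                worker.terminate()


def run_squares(values, n_workers=2):
    """Send every value to the pool and collect the (sorted) results."""
    with worker_pool(n_workers) as (tasks, results):
        for value in values:
            tasks.put(value)
        return sorted(results.get() for _ in values)
```

On platforms that fall back to `spawn`, a call such as `run_squares([1, 2, 3, 4])` should sit under an `if __name__ == "__main__":` guard.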
A second feature is the ability to export (close to) all simulation inputs and outputs to an .h5 file. The file can be viewed with HDF View (or similar software). Since it is not a very conventional file format, a method to read it and a structure to post-process multiple simulations were also added under rocketpy/stochastic/post_processing. A cache handles the data manipulation and returns a 3D numpy array with all simulations, whose shape corresponds to (simulation_index, time_index, column). The column axis is reserved for vector data, where x, y and z, for example, may be available under the same data entry. For example, cache.read_inputs('motors/thrust_source') returns both time and thrust.
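To make the (simulation_index, time_index, column) layout concrete, here is a small dependency-free sketch. Plain nested lists stand in for the 3D NumPy array, the numbers are made up, and the helper names (`make_thrust_curves`, `column`, `shape`) are hypothetical rather than part of the post-processing module.

```python
def make_thrust_curves(n_sims, n_steps):
    """Build toy data indexed as [simulation][time step][column].

    Column 0 is time and column 1 is thrust; all values are invented.
    """
    return [
        [[0.1 * t, 100.0 * (sim + 1) - 10.0 * t] for t in range(n_steps)]
        for sim in range(n_sims)
    ]


def shape(data):
    """Return (n_simulations, n_time_steps, n_columns)."""
    return (len(data), len(data[0]), len(data[0][0]))


def column(data, sim, col):
    """Extract a single column (e.g. thrust) of a single simulation."""
    return [row[col] for row in data[sim]]


curves = make_thrust_curves(n_sims=3, n_steps=4)
print(shape(curves))         # (3, 4, 2)
print(column(curves, 1, 1))  # [200.0, 190.0, 180.0, 170.0]
```

With a real NumPy array the same selection would be written as `array[1, :, 1]`.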
Pull request type
- [x] Code changes (bugfix, features)
Checklist
- [ ] Tests for the changes have been added (if needed)
- [x] Docs have been reviewed and added / updated
- [ ] Lint (`black rocketpy/ tests/`) has passed locally
- [ ] All tests (`pytest tests -m slow --runslow`) have passed locally
- [ ] `CHANGELOG.md` has been updated (if relevant)
Current behavior
Currently, Monte Carlo simulations must run serially, and all outputs are saved to a txt file.
New behavior
The montecarlo simulations may now be executed in parallel and all outputs may be exported to a txt or an h5 file, saving some key data or everything.
Breaking change
- [ ] Yes
- [x] No
Additional information
None
Benchmark of the results. A machine with 6 cores (12 threads) was used.
Amazing feature, as the results show the `MonteCarlo` class has great potential for parallelization.

The only blocking issue I see with this PR is the `serialization` code. It still does not support all of `rocketpy` features and requires a lot of maintenance and updates on our end.

Do you see any other option for performing the serialization of inputs?
@phmbressan we should make all the classes json serializable, it's an open issue at #522 . In the meantime, maybe we could still use the _encoders module to serialize inputs.
I agree with you that implementing flight class serialization within this PR may create maintenance issues for us. The simplest solution would be to delete the flightv1_serializer (and similar) functions.
Codecov Report
Attention: Patch coverage is 35.49784% with 149 lines in your changes missing coverage. Please review.
Project coverage is 79.34%. Comparing base (83aa20e) to head (4e0ef92). Report is 15 commits behind head on develop.
Additional details and impacted files
@@ Coverage Diff @@
## develop #619 +/- ##
===========================================
+ Coverage 76.42% 79.34% +2.92%
===========================================
Files 95 95
Lines 11090 11496 +406
===========================================
+ Hits 8475 9121 +646
+ Misses 2615 2375 -240
The `monte_carlo_class_usage` notebook currently does not work with parallel. I did not have time to look into it, so I did not review the parallel part of the code.
I know your review was just temporary, but could you be a bit more specific on the parallel side not working? It might be an OS related issue that we should fix of course, but here things were working fine.
Open the monte_carlo_class_usage.ipynb and run all cells.
The parameter parallel is set to True, so the simulation runs in parallel.
After the sim is done, nothing is saved to the .inputs.txt or .outputs.txt files
If you set parallel to False instead, the results are saved correctly
I have pushed a fix for the file writing issue when running on Windows (more accurately, when processes are started in spawn mode). I have tested it on a Windows machine and it ran correctly, but I invite reviewers to test it on other OS configurations as well.
Issues solved by this PR:
- [X] MonteCarlo simulations have a parallel mode;
- [X] Both the simulation execution and data saving are executed in parallel (producer - consumer);
- [X] There are performance gains on large simulations;
- [X] The serial simulations can be executed in the same fashion, and the outputs of both modes are compatible.
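The producer-consumer arrangement mentioned in the list above can be sketched as follows. This is only an illustration of the pattern: threads and an in-memory buffer stand in for the PR's worker processes and output file, and all names (`producer`, `consumer`, `run`) as well as the fake result format are assumptions.

```python
import io
import queue
import threading


def producer(sim_indices, out_queue):
    """Run fake 'simulations' and hand each result line to the queue."""
    for i in sim_indices:
        out_queue.put(f"{i},{i * 1.5}\n")  # made-up result for simulation i
    out_queue.put(None)  # sentinel: this producer is done


def consumer(out_queue, sink, n_producers):
    """Write result lines to the sink until every producer has finished."""
    finished = 0
    while finished < n_producers:
        item = out_queue.get()
        if item is None:
            finished += 1
        else:
            sink.write(item)


def run(n_sims, n_producers=2):
    """Split the simulations across producers; one consumer does the saving."""
    out_queue = queue.Queue()
    sink = io.StringIO()  # stands in for the output file
    writer = threading.Thread(target=consumer, args=(out_queue, sink, n_producers))
    workers = [
        threading.Thread(
            target=producer, args=(range(k, n_sims, n_producers), out_queue)
        )
        for k in range(n_producers)
    ]
    writer.start()
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join()
    writer.join()
    return sorted(sink.getvalue().splitlines())


print(run(4))  # ['0,0.0', '1,1.5', '2,3.0', '3,4.5']
```

Because only the consumer touches the sink, the producers never contend for the output and the write order simply follows queue arrival order.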
Points of Improvement:
- [ ] Soft interrupts of parallel simulations (e.g. an exception or Ctrl-C) are only effective on Linux. Spawned processes (Windows) are currently stopped abruptly.
- [ ] On Windows, the Jupyter notebook will not show the status update prints (running the simulations in a terminal is fine). This seems to be an OS-level standard output change that is not easily solved.
Some of these points could become issues of the repository. Stating them here for proper PR documentation.
Future Considerations:
- Python 3.14 and forward will make `spawn` the default start method for all OS. We could change the RocketPy start method to stay as `fork` on Linux if this undermines the performance too much;
- The Python GIL should be removed some years from now (PEP703), this could bring performance benefits, since Threads are generally faster to start.
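One way to keep the start method explicit, whatever the interpreter's default happens to be, is to request a specific `multiprocessing` context. The sketch below is an assumption about how this could look, not code from the PR; `choose_start_method` and its parameter are hypothetical names.

```python
import multiprocessing as mp
import sys


def choose_start_method(prefer_fork_on_linux=True):
    """Return 'fork' on Linux when requested and available, else 'spawn'."""
    available = mp.get_all_start_methods()
    if (
        prefer_fork_on_linux
        and sys.platform.startswith("linux")
        and "fork" in available
    ):
        return "fork"
    return "spawn"  # spawn is available on every supported platform


# A context pins the start method locally without changing the global default.
ctx = mp.get_context(choose_start_method())
print(ctx.get_start_method())
```

Processes and queues created from `ctx` (for example `ctx.Process` and `ctx.Queue`) then use the chosen method, leaving other code in the same interpreter unaffected.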
@phmbressan I like the way this PR was refactored. Many thanks for your effort.
Please fix the pylint errors and solve all the open conversations in this PR so we can approve and merge it onto develop!
Optionally, try to rebase the PR to get the latest commits from develop.
Converted to draft until you solve the remaining issues, especially the random number generation problem, @phmbressan
I believe this PR is ready again for another round of review. These are the changes since the previous review:
- @phmbressan has done some great work simplifying and optimizing the parallel structure even further, and a `sim_consumer` process is no longer needed;
- @phmbressan and I fixed the random number generator bug. The solution consisted in resetting all stochastic structures inside the `StochasticRocket` and their position. The simplest solution we found, without changing things that go directly to either `Rocket` or `Flight`, is implemented in the methods `_set_stochastic` and `__reset_components` of `StochasticRocket`, so please take a closer look at both;
- a very very minor fix in some of the methods of `Components`, just make sure that they make sense.
Overall, it seems that the time per iteration is even faster now, at least by my local measurements. @phmbressan might want to complement the information provided here, he knows this PR much better than I do!
Please, make sure to take a careful look at the Monte Carlo .input file to check that there is indeed no dependency on the generated random variables.
Another important issue: I currently cannot interrupt the MonteCarlo.simulate method smoothly when it is run in parallel; all attempts lead to killing the notebook :fearful: ! It would be great to check whether the same happens on your own machines.
Please follow this one: https://github.com/RocketPy-Team/RocketPy/pull/768