RocketPy
RocketPy copied to clipboard
ENH: Introducing the Monte Carlo class
Pull request type
Please check the type of change your PR introduces:
- [x] Code base additions (bugfix, features)
Pull request checklist
Please check if your PR fulfills the following requirements, depending on the type of PR:
-
Code base additions (for bug fixes / features):
- [x] Tests for the changes have been added
- [x] Docs have been reviewed and added / updated if needed
- [x] Lint (
black rocketpy) has passed locally and any fixes were made - [x] All tests (
pytest --runslow) have passed locally
What is the current behavior?
Currently, the dispersion analysis is only made inside notebooks , requiring different code blocks to be defined every time you set a new Monte Carlo analysis.
What is the new behavior?
The MonteCarlo is added as a class in the code. This is continuing what was developed in the 2022 RocketPy Hackathon at https://github.com/RocketPy-Team/RocketPy-Hackathon-2022/pull/106
Does this introduce a breaking change?
- [x] No! All the previous code should be working.
Other Information:
- This is by far the largest PR we have ever done. A lot of mistakes were introduced and fixed after months. It has been a really try-wait-improve approach.
- Performance: Each simulation is consuming 4-5 seconds, demanding a total time of about 1hour to run 1000 simulations. We should pursue a x4 speed up in future PR (please: don't use this PR to commit speed up improvements, let's finish this first and then work on improvements).
- Tests and docs can still be improved, but the bare minimum to be merged/released is already done. Therefore, reviewers can already start working on this one!
I've updated the task list here (at PR description), but please notice that there's also another huge tasklist at #267 coming!
Check out this pull request on ![]()
See visual diffs & provide feedback on Jupyter Notebooks.
Powered by ReviewNB
Task linked: CU-864dp0fnn Dispersion Class
@MateusStano do you have any idea of how to start solving the conflicts? hahahha It seems we have, at least, "a bunch" of problems here >(

PULL REQUEST UPDATE
What is the new behavior?
Here is the main idea behind how the class works:
- Initialize a Dispersion object by passing a file location for the dispersion input, output, and error files
- Run the
run_dispersionmethod:- The objects (Flight, Rocket, Motor, etc) can be passed directly through the method. If not done so, then they must be able to be defined through the
dispersion_dict - The
dispersion_dicthas a specific format:- The keys must be strings with a specific word. These words are the name of input arguments for class definition (with the exception of aero surfaces and parachute, which have a different ways of using inputs)
- If the value of a key is an int or float, it will be treated as the standard deviation for the analysis
- If the value of a key is a tuple, the first item will be seen as that argument's nominal value and the second item as the standard deviation
- If the value is a list, the items in the list will be considered nominal values but will be chosen randomly
- The
dispersion_dictwill pass through a series of checks to see if all classes can be defined with the information given (either through the dict itself or through inputting objects in therun_dispersiosmethod) - Then a different dict, called
modified_dispersion_dictwill be created. This new dict will initially be a copy of thedispersion_dict. Then, in__check_inputted_values_from_dict, the new dict will be filled with (mean, stdev) values where it does not already have these two parameters.- If the (mean, stdev) was given in the
dispersion_dict, then it will simply be copied tomodified_dispersion_dict - If a list was passed as value in the
dispersion_dict, It will also be copied tomodified_dispersion_dict - If an int or float was passed as value in the
dispersion_dict, then it will try to fillmodified_dispersion_dictwith (mean, stdev). First, themeanwill try to be gotten as the attribute of an inputted class. If not able to, it will check if that parameter is required. If it is, raise an error. If it is not, will take the default value as the mean.
- If the (mean, stdev) was given in the
- Dispersion is then run as it was done in notebooks.
- The objects (Flight, Rocket, Motor, etc) can be passed directly through the method. If not done so, then they must be able to be defined through the
Remaining tasks to close this PR:
- [x] Test simulations under different scenarios
- [ ] Improve export_flight_data. This includes
- [ ] Add all exportable attributes to self.exportable_list
- [ ] Add a way to save more than just flight attributes
- [ ] Make saving parachute events optional (currently they are always saved as long as there are parachutes)
- [ ] Find a way to optimize this function by removing for loops or by not having it be run inside the dispersion loop (we are losing a lot of performance on this)
- [x] Check rail buttons addition in
__create_initial_objects - [ ] Check environment variation inside dispersion loop and add an option to have it recalculate the atmospheric model (at the cost of simulation speed)
- [x] Finish documentation
- [x] Improve and review plots and prints
- [x] Add tests
Future dispersion-related implementations
- Allow each parameter to be varied following a specific probability distribution
- Add more options for motor (i.e. Liquid and Hybrids)
- Allow elliptical fins instead of only trapezoidal
- Add initialSolution flight option
- Create evolution plots to analyze convergence
- Integrate with other libraries to plot the map instead of loading a google earth screenshot (e.g. cartopy, ee, etc.)
@MateusStano do you have any idea of how to start solving the conflicts? hahahha It seems we have, at least, "a bunch" of problems here >(
Hahahah yeah this is a mess. I can solve it but it will take a little effort. I think there are a lot of things that changed that the squash merge did not get right.
Will check as soon as possible
@MateusStano I think that... If we don't solve the "allow each parameter to be varied following a specific probability distribution" issue now, we will need to introduce another breaking change in the future since the current "analysis parameter" doesn't support choosing different for each variable. Also, I'm still struggling to become comfortable enough with the current framework we are proposing. It took me a while, but I think I finally got to a new solution, please let me know when you're available to discuss this proposal.
Current framework:
from rocketpy import Environment, Rocket, SolidMotor, Flight, Dispersion
env = Environment()
motor = SolidMotor()
rocket = Rocket(motor)
flight = Flight(rocket, env)
filename = "blablabla"
dispersion = Dispersion(filename)
disp_dictionary = {
# Solid Motor Parameters
"burnOutTime": 0.2,
"totalImpulse": 0.033 * motor.totalImpulse,
"motor_position": (-1.255, 0),
# Rocket Parameters
"mass": 0.100,
"radius": 0.001,
"powerOffDrag": 0.033, # Multiplier
"powerOnDrag": 0.033, # Multiplier
"parachute_Main_CdS": 1,
"parachute_Drogue_CdS": 0.1,
"parachute_Main_lag": 0.1,
"parachute_Drogue_lag": 0.1,
# Flight Parameters
"inclination": 1,
"heading": 2,
}
dispersion.run(
number_of_simulations=50,
dispersion_dictionary=disp_dictionary,
flight=flight,
append=False,
)
- Pros: It is what we already have done in this PR, and it is working. In 80% of the cases, we are using normal distributions only.
- Cons: Doesn't support different distribution function, which can be a complication if the user desires to use normal and uniform distributions for different parameters. Also, it currently includes in the flight_setting generator more than what is needed to be varied, for instance, the generator is currently losing time providing the
Proposed framework:
import numpy as np
from datetime import datetime
from rocketpy import Environment, Rocket, SolidMotor, Flight, Dispersion, Variables
env = Environment()
motor = SolidMotor()
rocket = Rocket(motor)
flight = Flight(rocket, env)
filename = "blablabla"
dispersion = Dispersion(filename)
gravity = Variables(
name="gravity",
dist_func=np.random.normal,
dist_kargs={"loc": 9.8, "scale": 0.005},
)
powerOffDragFactor = Variables(
name="powerOffDragFactor",
dist_func=np.random.normal,
dist_kargs={"loc": 1, "scale": 0.033},
)
mass = Variables(
name="mass",
dist_func=np.random.normal,
dist_kargs={"loc": 20, "scale": 0.005},
)
inclination = Variables(
name="inclination",
dist_func=np.random.normal,
dist_kargs={"loc": 85, "scale": 0.005},
)
fixed = {
"env": {
"date": datetime(2019, 1, 1, 12, 0, 0),
"latitude": 29.7604,
"longitude": -95.3698,
"datum": "WGS84",
},
"motor": {
"...": "...",
"thrust": "thrust_curve.csv",
},
"rocket": {"...": "..."},
"parachute": {"...": "..."},
"flight": {"...": "..."},
}
variables = {
"env": {
"gravity": gravity,
},
"motor": {
"...": "...",
},
"rocket": {
"...": "...",
"mass": mass,
"powerOffDragFactor": powerOffDragFactor,
},
"parachute": {"...": "..."},
"flight": {
"...": "...",
"inclination": inclination,
},
}
dispersion.run(
number_of_simulations=50,
fixed_args=fixed,
variable_args=variables,
flight=flight,
environment=None,
rocket=None,
append=False,
)
- Pros: Flexible to use any distribution from np.random in each variable. Separate the constant (fixed) values from the variables, reducing the time for looping. It is easier to explain to new users how to create the different required dictionaries, rather than having only one "dispersion_dict" argument with different contents. The Variables class is created to abstract the definition of all the input parameters.
- Cons: More work to be done here. The setup of a dispersion simulation can take more lines, as you can notice by the above examples.
Last comment to try helping your review: Please imagine 2 different dispersion scenarios: 1 with just 2 or 3 parameters being varied, and other with all the parameters being varied. The solution should work well in both cases, but especially in the first one.
Here is how I imaged the Variables class. Let's discuss by voice before commiting or refusing anything, please!
class Variables:
"""Class to define variables for Monte Carlo simulations. Each variable is
defined by a distribution function and its parameters. Distributions are
defined by the numpy.random module. To see the available distributions,
please refer to the numpy documentation. #TODO: Add link.
Once the variables are defined, they can be used to generate random numbers
using the __call__ method.
Parameters
----------
Variables.name : str
The name of the variable.
Variables.dist_func : np.random._function_
The distribution function to be used.
Variables.dist_kargs : dict
The parameters of the distribution function. To see the available
parameters, please refer to the numpy documentation. #TODO: Add link
Returns
-------
None
Examples
--------
>>> from rocketpy import Variables
>>> var = Variables('name', np.random.normal, {'loc': 0, 'scale': 1})
>>> var() # Returns a random number considering the distribution
"""
def __init__(self, name, dist_func, dist_kargs):
"""Initialize the object.
Parameters
----------
name : str
The name of the variable.
dist_func : np.random._function_
The distribution function to be used.
dist_kargs : dict
The parameters of the distribution function. To see the available
parameters, please refer to the numpy documentation. #TODO: Add link
Returns
-------
None
"""
self.name = name
self.dist_func = dist_func
self.dist_kargs = dist_kargs
def __str__(self):
return f"Variable {self.name} with distribution {self.dist_func} defined by {len(self.dist_kargs.keys())} parameters"
def __call__(self):
return self.dist_func(**self.dist_kargs)
Now, regarding the task I've been working on, here is my progress, please let me know if they completely address the issues or not.
Improve export_flight_data. This includes 1 - Add all exportable attributes to self.exportable_list
Done! I simply added the most important parameters to self.standard_otputs
2 - Add a way to save more than just flight attributes
I understand your point, but I don't see any results from other classes that we could include in the outputs file. I mean:
- Can you name any result from other classes that would be interesting to be included in outputs?
- Environment, rocket and motor are also present at the flight.dict, if we need we can access them from that dict.
- The input file already has all the input parameters used to define each of these objects.
3 - Make saving parachute events optional (currently they are always saved as long as there are parachutes)
Done! At least for the outputs, it is done. I just don't have a smart idea of how to remove the parachute info from inputs file without using another loop.
4 - Find a way to optimize this function by removing for loops or by not having it be run inside the dispersion loop (we are losing a lot of performance on this)
I've implemented a map function to do the job of the old "for key, value in dict.items()", and I think. Did it save us a huge amount of time? Well, no. But I think it helped. I also removed some unnecessary for loops anyway.
More than 700 commits merged into this branch. Well, it wasn't easy... But it's done.
Step 1: Tests needs to be fixed + then we finish any adjustments regarding v1.0 Step 2: finish up the class methods Step 3: discuss implementation Step 4: final documentation, add more examples, merge and release
Current status:
- [x] Removing pydantic from dependencies: Environment, SolidMotor
- [x] Removing pydantic from dependencies: Rocket, Aero surfaces.
- [x] Add stocastic generic motors
- [x] Tests (@Gui-FernandesBR )
- [x] Documentation
@MateusStano please let me know when I can start working on the new tests here :)
Codecov Report
Attention: Patch coverage is 75.71090% with 205 lines in your changes are missing coverage. Please review.
Project coverage is 73.60%. Comparing base (
64bdab2) to head (209434f). Report is 5 commits behind head on develop.
Additional details and impacted files
@@ Coverage Diff @@
## develop #232 +/- ##
===========================================
+ Coverage 73.38% 73.60% +0.22%
===========================================
Files 57 70 +13
Lines 9461 10310 +849
===========================================
+ Hits 6943 7589 +646
- Misses 2518 2721 +203
:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.
This PR introduces a new feature: bc73117
It is the private class named RocketPyEncoder(json.JSONEncoder)
Sorry for doing that, but we would need it eventually anyways.
I think the tests will pass now. There is a bug in the environment analysis test, unfortunately. But I wouldn't change it in here.
See: https://stackoverflow.com/questions/3768895/how-to-make-a-class-json-serializable
Something that I'm not comfortable with: Wind Speed factors and the assumption they are normal distributed + the same uncertainty is applied to the whole wind profile.
Hey, I think we should add a warning in the MonteCarlo initialization saying that the class is still too young and we may change its behavior in eventual future minor releases. After releasing and people start using, we can evaluate when to temove the warning.
We did the very same thing with the EnvironmentAnalysis class.
What do you think? @phmbressan @MateusStano @Lucas-Prates
Hey, I think we should add a warning in the MonteCarlo initialization saying that the class is still too young and we may change its behavior in eventual future minor releases. After releasing and people start using, we can evaluate when to temove the warning.
We did the very same thing with the EnvironmentAnalysis class.
What do you think? @phmbressan @MateusStano @Lucas-Prates
Makes sense to me. This class probably will go through much feedback and, depending on what features we want to introduce, might create "breaking changes" on how the input is parsed and provided by the user. For instance, we currently only accept marginal distributions for the parameters. If, in the future, we want to implement joint distributions, I believe this would require another way of dealing with the input.
While trying to refactor my PR to use the Monte Carlo class, one thing I found missing is to be able to export the trajectories. For the notebook mentioned, this information is not critical and I can omit its use. @MateusStano, @Gui-FernandesBR, @phmbressan, do you think this information would be important for future use of the class? Exporting the trajectories is a bit harder since it requires some "convention." What I did in my rough first implementation was to fix a sampling rate and sample all the trajectories at multiples of that time step from 0 to the t_final of the reference trajectory. In our case, this upper bound could be max(t_final_array) where t_final_array is an array of all t_final of all trajectories sampled. That would require, however, to store all flight objects until the end.
While trying to refactor my PR to use the Monte Carlo class, one thing I found missing is to be able to export the trajectories. For the notebook mentioned, this information is not critical and I can omit its use. @MateusStano, @Gui-FernandesBR, @phmbressan, do you think this information would be important for future use of the class? Exporting the trajectories is a bit harder since it requires some "convention." What I did in my rough first implementation was to fix a sampling rate and sample all the trajectories at multiples of that time step from 0 to the t_final of the reference trajectory. In our case, this upper bound could be max(t_final_array) where t_final_array is an array of all t_final of all trajectories sampled. That would require, however, to store all flight objects until the end.
I believe this is something we do in the future, not so urgent. At a first glance, I think saving the trajectory positions coordinates may be too expensive.
After 375 commits and more than 600 days, this PR is finally ready for a last review and merge.
Calling out all the reviewers @RocketPy-Team/code-owners so we can have a last double-check in this code before going to the develop branch.
Hey! I noticed quite a lot of important documentation was missing so I took the liberty to change a lot of the usage notebook
It is still very far from being good I think, but I hope things are better explained now. It is very important that someone reviews what I wrote in there carefully, and improves on anything that needs it
There are still a couple of problems I found
- When you write the name of a stochastic object in the last line of a cell in the notebook, it gets the
__repr__of the class, which is not very useful. I would say leaving the__repr__as what now is the__str__would be really important - The
__str__of aStochasticRocketwith added motor, aerosurfaces, and/or parachutes is a mess right now - Still need to add an explanation on how to add an image to
dispersion.plot.ellipsesand how to translate that image in the plot - Need to add an explanation of how to set the
export_listof theMonteCarloclass and how these are saved on the files. Also, still needs to add docs explaining what the default export_list is and what are the possible exportable variables - The
export_ellipses_to_kmlraises an error for me
Hey! I noticed quite a lot of important documentation was missing so I took the liberty to change a lot of the usage notebook
It is still very far from being good I think, but I hope things are better explained now. It is very important that someone reviews what I wrote in there carefully, and improves on anything that needs it
https://www.productplan.com/glossary/minimum-viable-product/
There are still a couple of problems I found
- When you write the name of a stochastic object in the last line of a cell in the notebook, it gets the
__repr__of the class, which is not very useful. I would say leaving the__repr__as what now is the__str__would be really important
I think I temporarily addressed this issue here: 521a8e8 Can be improve in future PRs.
- The
__str__of aStochasticRocketwith added motor, aerosurfaces, and/or parachutes is a mess right now The problem is that a__str__ideally is not so complex as we have today. You can see as an example the Function class, which has a really simple "str". If you want so badly a method to visualize what is inside a StochasticModel, then we should probably add a method: "your_stochastic_object.visualize_variables()" or similar names.
- Still need to add an explanation on how to add an image to
dispersion.plot.ellipsesand how to translate that image in the plot
Can be done in the future. I don't really know if we should allow the user to input an image or simply download an satellite image at the moment we run the function. Let it to the future.
- Need to add an explanation of how to set the
export_listof theMonteCarloclass and how these are saved on the files. Also, still needs to add docs explaining what the default export_list is and what are the possible exportable variables
This requires time and effort, let's leave it to the next PR of documentation.
- The
export_ellipses_to_kmlraises an error for me
What error? How can we reproduce it? ... It is running flawlessly in my machine, including the tests. Also, this is an extra feature. Can be fixed in another PR.
Ok, thank you for the review and the updates in the .ipynb file, @MateusStano .
As agreed, let's follow through this PR, approve and merge it. I'm working on documentation improvements and this will be addressed in a new PR.
Awesome! This is finally ready to be merged. Probably the biggest PR we ever had, and we never make anything as big as this again heheheh