RocketPy icon indicating copy to clipboard operation
RocketPy copied to clipboard

ENH: Introducing the Monte Carlo class

Open MateusStano opened this issue 3 years ago • 21 comments

Pull request type

Please check the type of change your PR introduces:

  • [x] Code base additions (bugfix, features)

Pull request checklist

Please check if your PR fulfills the following requirements, depending on the type of PR:

  • Code base additions (for bug fixes / features):

    • [x] Tests for the changes have been added
    • [x] Docs have been reviewed and added / updated if needed
    • [x] Lint (black rocketpy) has passed locally and any fixes were made
    • [x] All tests (pytest --runslow) have passed locally

What is the current behavior?

Currently, the dispersion analysis is only made inside notebooks , requiring different code blocks to be defined every time you set a new Monte Carlo analysis.

What is the new behavior?

The MonteCarlo is added as a class in the code. This is continuing what was developed in the 2022 RocketPy Hackathon at https://github.com/RocketPy-Team/RocketPy-Hackathon-2022/pull/106

Does this introduce a breaking change?

  • [x] No! All the previous code should be working.

Other Information:

  • This is by far the largest PR we have ever done. A lot of mistakes were introduced and fixed after months. It has been a really try-wait-improve approach.
  • Performance: Each simulation is consuming 4-5 seconds, demanding a total time of about 1hour to run 1000 simulations. We should pursue a x4 speed up in future PR (please: don't use this PR to commit speed up improvements, let's finish this first and then work on improvements).
  • Tests and docs can still be improved, but the bare minimum to be merged/released is already done. Therefore, reviewers can already start working on this one!

MateusStano avatar Sep 16 '22 03:09 MateusStano

I've updated the task list here (at PR description), but please notice that there's also another huge tasklist at #267 coming!

Gui-FernandesBR avatar Nov 12 '22 20:11 Gui-FernandesBR

Check out this pull request on  ReviewNB

See visual diffs & provide feedback on Jupyter Notebooks.


Powered by ReviewNB

Task linked: CU-864dp0fnn Dispersion Class

Gui-FernandesBR avatar Jan 17 '23 19:01 Gui-FernandesBR

@MateusStano do you have any idea of how to start solving the conflicts? hahahha It seems we have, at least, "a bunch" of problems here >(

image

Gui-FernandesBR avatar Jan 26 '23 10:01 Gui-FernandesBR

PULL REQUEST UPDATE

What is the new behavior?

Here is the main idea behind how the class works:

  • Initialize a Dispersion object by passing a file location for the dispersion input, output, and error files
  • Run the run_dispersion method:
    • The objects (Flight, Rocket, Motor, etc) can be passed directly through the method. If not done so, then they must be able to be defined through the dispersion_dict
    • The dispersion_dicthas a specific format:
      1. The keys must be strings with a specific word. These words are the name of input arguments for class definition (with the exception of aero surfaces and parachute, which have a different ways of using inputs)
      2. If the value of a key is an int or float, it will be treated as the standard deviation for the analysis
      3. If the value of a key is a tuple, the first item will be seen as that argument's nominal value and the second item as the standard deviation
      4. If the value is a list, the items in the list will be considered nominal values but will be chosen randomly
    • The dispersion_dict will pass through a series of checks to see if all classes can be defined with the information given (either through the dict itself or through inputting objects in the run_dispersios method)
    • Then a different dict, called modified_dispersion_dict will be created. This new dict will initially be a copy of the dispersion_dict. Then, in __check_inputted_values_from_dict, the new dict will be filled with (mean, stdev) values where it does not already have these two parameters.
      • If the (mean, stdev) was given in the dispersion_dict, then it will simply be copied to modified_dispersion_dict
      • If a list was passed as value in the dispersion_dict, It will also be copied to modified_dispersion_dict
      • If an int or float was passed as value in the dispersion_dict, then it will try to fill modified_dispersion_dict with (mean, stdev). First, the mean will try to be gotten as the attribute of an inputted class. If not able to, it will check if that parameter is required. If it is, raise an error. If it is not, will take the default value as the mean.
    • Dispersion is then run as it was done in notebooks.

Remaining tasks to close this PR:

  • [x] Test simulations under different scenarios
  • [ ] Improve export_flight_data. This includes
    • [ ] Add all exportable attributes to self.exportable_list
    • [ ] Add a way to save more than just flight attributes
    • [ ] Make saving parachute events optional (currently they are always saved as long as there are parachutes)
    • [ ] Find a way to optimize this function by removing for loops or by not having it be run inside the dispersion loop (we are losing a lot of performance on this)
  • [x] Check rail buttons addition in __create_initial_objects
  • [ ] Check environment variation inside dispersion loop and add an option to have it recalculate the atmospheric model (at the cost of simulation speed)
  • [x] Finish documentation
  • [x] Improve and review plots and prints
  • [x] Add tests

Future dispersion-related implementations

  • Allow each parameter to be varied following a specific probability distribution
  • Add more options for motor (i.e. Liquid and Hybrids)
  • Allow elliptical fins instead of only trapezoidal
  • Add initialSolution flight option
  • Create evolution plots to analyze convergence
  • Integrate with other libraries to plot the map instead of loading a google earth screenshot (e.g. cartopy, ee, etc.)

MateusStano avatar Jan 26 '23 16:01 MateusStano

@MateusStano do you have any idea of how to start solving the conflicts? hahahha It seems we have, at least, "a bunch" of problems here >(

Hahahah yeah this is a mess. I can solve it but it will take a little effort. I think there are a lot of things that changed that the squash merge did not get right.

Will check as soon as possible

MateusStano avatar Jan 26 '23 16:01 MateusStano

@MateusStano I think that... If we don't solve the "allow each parameter to be varied following a specific probability distribution" issue now, we will need to introduce another breaking change in the future since the current "analysis parameter" doesn't support choosing different for each variable. Also, I'm still struggling to become comfortable enough with the current framework we are proposing. It took me a while, but I think I finally got to a new solution, please let me know when you're available to discuss this proposal.

Current framework:

from rocketpy import Environment, Rocket, SolidMotor, Flight, Dispersion

env = Environment()

motor = SolidMotor()

rocket = Rocket(motor)

flight = Flight(rocket, env)

filename = "blablabla"

dispersion = Dispersion(filename)

disp_dictionary = {
    # Solid Motor Parameters
    "burnOutTime": 0.2,
    "totalImpulse": 0.033 * motor.totalImpulse,
    "motor_position": (-1.255, 0),
    # Rocket Parameters
    "mass": 0.100,
    "radius": 0.001,
    "powerOffDrag": 0.033,  # Multiplier
    "powerOnDrag": 0.033,  # Multiplier
    "parachute_Main_CdS": 1,
    "parachute_Drogue_CdS": 0.1,
    "parachute_Main_lag": 0.1,
    "parachute_Drogue_lag": 0.1,
    # Flight Parameters
    "inclination": 1,
    "heading": 2,
}

dispersion.run(
    number_of_simulations=50,
    dispersion_dictionary=disp_dictionary,
    flight=flight,
    append=False,
)
  • Pros: It is what we already have done in this PR, and it is working. In 80% of the cases, we are using normal distributions only.
  • Cons: Doesn't support different distribution function, which can be a complication if the user desires to use normal and uniform distributions for different parameters. Also, it currently includes in the flight_setting generator more than what is needed to be varied, for instance, the generator is currently losing time providing the

Proposed framework:

import numpy as np
from datetime import datetime
from rocketpy import Environment, Rocket, SolidMotor, Flight, Dispersion, Variables

env = Environment()

motor = SolidMotor()

rocket = Rocket(motor)

flight = Flight(rocket, env)

filename = "blablabla"

dispersion = Dispersion(filename)

gravity = Variables(
    name="gravity",
    dist_func=np.random.normal,
    dist_kargs={"loc": 9.8, "scale": 0.005},
)

powerOffDragFactor = Variables(
    name="powerOffDragFactor",
    dist_func=np.random.normal,
    dist_kargs={"loc": 1, "scale": 0.033},
)

mass = Variables(
    name="mass",
    dist_func=np.random.normal,
    dist_kargs={"loc": 20, "scale": 0.005},
)

inclination = Variables(
    name="inclination",
    dist_func=np.random.normal,
    dist_kargs={"loc": 85, "scale": 0.005},
)

fixed = {
    "env": {
        "date": datetime(2019, 1, 1, 12, 0, 0),
        "latitude": 29.7604,
        "longitude": -95.3698,
        "datum": "WGS84",
    },
    "motor": {
        "...": "...",
        "thrust": "thrust_curve.csv",
    },
    "rocket": {"...": "..."},
    "parachute": {"...": "..."},
    "flight": {"...": "..."},
}

variables = {
    "env": {
        "gravity": gravity,
    },
    "motor": {
        "...": "...",
    },
    "rocket": {
        "...": "...",
        "mass": mass,
        "powerOffDragFactor": powerOffDragFactor,
    },
    "parachute": {"...": "..."},
    "flight": {
        "...": "...",
        "inclination": inclination,
    },
}

dispersion.run(
    number_of_simulations=50,
    fixed_args=fixed,
    variable_args=variables,
    flight=flight,
    environment=None,
    rocket=None,
    append=False,
)

  • Pros: Flexible to use any distribution from np.random in each variable. Separate the constant (fixed) values from the variables, reducing the time for looping. It is easier to explain to new users how to create the different required dictionaries, rather than having only one "dispersion_dict" argument with different contents. The Variables class is created to abstract the definition of all the input parameters.
  • Cons: More work to be done here. The setup of a dispersion simulation can take more lines, as you can notice by the above examples.

Last comment to try helping your review: Please imagine 2 different dispersion scenarios: 1 with just 2 or 3 parameters being varied, and other with all the parameters being varied. The solution should work well in both cases, but especially in the first one.

Gui-FernandesBR avatar Jan 31 '23 02:01 Gui-FernandesBR

Here is how I imaged the Variables class. Let's discuss by voice before commiting or refusing anything, please!

class Variables:
    """Class to define variables for Monte Carlo simulations. Each variable is
    defined by a distribution function and its parameters. Distributions are
    defined by the numpy.random module. To see the available distributions,
    please refer to the numpy documentation. #TODO: Add link.
    Once the variables are defined, they can be used to generate random numbers
    using the __call__ method.

    Parameters
    ----------
    Variables.name : str
        The name of the variable.
    Variables.dist_func : np.random._function_
        The distribution function to be used.
    Variables.dist_kargs : dict
        The parameters of the distribution function. To see the available
        parameters, please refer to the numpy documentation. #TODO: Add link

    Returns
    -------
    None

    Examples
    --------
    >>> from rocketpy import Variables
    >>> var = Variables('name', np.random.normal, {'loc': 0, 'scale': 1})
    >>> var() # Returns a random number considering the distribution

    """

    def __init__(self, name, dist_func, dist_kargs):
        """Initialize the object.

        Parameters
        ----------
        name : str
            The name of the variable.
        dist_func : np.random._function_
            The distribution function to be used.
        dist_kargs : dict
            The parameters of the distribution function. To see the available
            parameters, please refer to the numpy documentation. #TODO: Add link

        Returns
        -------
        None
        """
        self.name = name
        self.dist_func = dist_func
        self.dist_kargs = dist_kargs

    def __str__(self):
        return f"Variable {self.name} with distribution {self.dist_func} defined by {len(self.dist_kargs.keys())} parameters"

    def __call__(self):
        return self.dist_func(**self.dist_kargs)

Gui-FernandesBR avatar Jan 31 '23 02:01 Gui-FernandesBR

Now, regarding the task I've been working on, here is my progress, please let me know if they completely address the issues or not.

Improve export_flight_data. This includes 1 - Add all exportable attributes to self.exportable_list

Done! I simply added the most important parameters to self.standard_otputs

2 - Add a way to save more than just flight attributes

I understand your point, but I don't see any results from other classes that we could include in the outputs file. I mean:

  • Can you name any result from other classes that would be interesting to be included in outputs?
  • Environment, rocket and motor are also present at the flight.dict, if we need we can access them from that dict.
  • The input file already has all the input parameters used to define each of these objects.

3 - Make saving parachute events optional (currently they are always saved as long as there are parachutes)

Done! At least for the outputs, it is done. I just don't have a smart idea of how to remove the parachute info from inputs file without using another loop.

4 - Find a way to optimize this function by removing for loops or by not having it be run inside the dispersion loop (we are losing a lot of performance on this)

I've implemented a map function to do the job of the old "for key, value in dict.items()", and I think. Did it save us a huge amount of time? Well, no. But I think it helped. I also removed some unnecessary for loops anyway.

Gui-FernandesBR avatar Jan 31 '23 03:01 Gui-FernandesBR

More than 700 commits merged into this branch. Well, it wasn't easy... But it's done.

Step 1: Tests needs to be fixed + then we finish any adjustments regarding v1.0 Step 2: finish up the class methods Step 3: discuss implementation Step 4: final documentation, add more examples, merge and release

Gui-FernandesBR avatar Oct 29 '23 07:10 Gui-FernandesBR

Current status:

  • [x] Removing pydantic from dependencies: Environment, SolidMotor
  • [x] Removing pydantic from dependencies: Rocket, Aero surfaces.
  • [x] Add stocastic generic motors
  • [x] Tests (@Gui-FernandesBR )
  • [x] Documentation

Gui-FernandesBR avatar Dec 13 '23 22:12 Gui-FernandesBR

@MateusStano please let me know when I can start working on the new tests here :)

Gui-FernandesBR avatar Dec 17 '23 19:12 Gui-FernandesBR

Codecov Report

Attention: Patch coverage is 75.71090% with 205 lines in your changes are missing coverage. Please review.

Project coverage is 73.60%. Comparing base (64bdab2) to head (209434f). Report is 5 commits behind head on develop.

Files Patch % Lines
rocketpy/simulation/monte_carlo.py 65.45% 76 Missing :warning:
rocketpy/stochastic/stochastic_model.py 75.00% 28 Missing :warning:
rocketpy/stochastic/stochastic_rocket.py 85.32% 27 Missing :warning:
rocketpy/plots/monte_carlo_plots.py 61.40% 22 Missing :warning:
rocketpy/_encoders.py 35.29% 11 Missing :warning:
rocketpy/tools.py 86.74% 11 Missing :warning:
rocketpy/stochastic/stochastic_environment.py 75.00% 10 Missing :warning:
rocketpy/stochastic/stochastic_flight.py 62.96% 10 Missing :warning:
rocketpy/stochastic/stochastic_aero_surfaces.py 85.00% 6 Missing :warning:
rocketpy/stochastic/stochastic_parachute.py 91.66% 2 Missing :warning:
... and 2 more
Additional details and impacted files
@@             Coverage Diff             @@
##           develop     #232      +/-   ##
===========================================
+ Coverage    73.38%   73.60%   +0.22%     
===========================================
  Files           57       70      +13     
  Lines         9461    10310     +849     
===========================================
+ Hits          6943     7589     +646     
- Misses        2518     2721     +203     

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

codecov[bot] avatar Mar 05 '24 01:03 codecov[bot]

This PR introduces a new feature: bc73117

It is the private class named RocketPyEncoder(json.JSONEncoder)

Sorry for doing that, but we would need it eventually anyways.

I think the tests will pass now. There is a bug in the environment analysis test, unfortunately. But I wouldn't change it in here.

See: https://stackoverflow.com/questions/3768895/how-to-make-a-class-json-serializable

Gui-FernandesBR avatar Mar 15 '24 04:03 Gui-FernandesBR

Something that I'm not comfortable with: Wind Speed factors and the assumption they are normal distributed + the same uncertainty is applied to the whole wind profile.

Gui-FernandesBR avatar Mar 22 '24 16:03 Gui-FernandesBR

Hey, I think we should add a warning in the MonteCarlo initialization saying that the class is still too young and we may change its behavior in eventual future minor releases. After releasing and people start using, we can evaluate when to temove the warning.

We did the very same thing with the EnvironmentAnalysis class.

What do you think? @phmbressan @MateusStano @Lucas-Prates

Gui-FernandesBR avatar Mar 22 '24 18:03 Gui-FernandesBR

Hey, I think we should add a warning in the MonteCarlo initialization saying that the class is still too young and we may change its behavior in eventual future minor releases. After releasing and people start using, we can evaluate when to temove the warning.

We did the very same thing with the EnvironmentAnalysis class.

What do you think? @phmbressan @MateusStano @Lucas-Prates

Makes sense to me. This class probably will go through much feedback and, depending on what features we want to introduce, might create "breaking changes" on how the input is parsed and provided by the user. For instance, we currently only accept marginal distributions for the parameters. If, in the future, we want to implement joint distributions, I believe this would require another way of dealing with the input.

Lucas-Prates avatar Mar 22 '24 21:03 Lucas-Prates

While trying to refactor my PR to use the Monte Carlo class, one thing I found missing is to be able to export the trajectories. For the notebook mentioned, this information is not critical and I can omit its use. @MateusStano, @Gui-FernandesBR, @phmbressan, do you think this information would be important for future use of the class? Exporting the trajectories is a bit harder since it requires some "convention." What I did in my rough first implementation was to fix a sampling rate and sample all the trajectories at multiples of that time step from 0 to the t_final of the reference trajectory. In our case, this upper bound could be max(t_final_array) where t_final_array is an array of all t_final of all trajectories sampled. That would require, however, to store all flight objects until the end.

Lucas-Prates avatar Apr 03 '24 15:04 Lucas-Prates

While trying to refactor my PR to use the Monte Carlo class, one thing I found missing is to be able to export the trajectories. For the notebook mentioned, this information is not critical and I can omit its use. @MateusStano, @Gui-FernandesBR, @phmbressan, do you think this information would be important for future use of the class? Exporting the trajectories is a bit harder since it requires some "convention." What I did in my rough first implementation was to fix a sampling rate and sample all the trajectories at multiples of that time step from 0 to the t_final of the reference trajectory. In our case, this upper bound could be max(t_final_array) where t_final_array is an array of all t_final of all trajectories sampled. That would require, however, to store all flight objects until the end.

I believe this is something we do in the future, not so urgent. At a first glance, I think saving the trajectory positions coordinates may be too expensive.

Gui-FernandesBR avatar Apr 04 '24 04:04 Gui-FernandesBR

After 375 commits and more than 600 days, this PR is finally ready for a last review and merge.

Calling out all the reviewers @RocketPy-Team/code-owners so we can have a last double-check in this code before going to the develop branch.

Gui-FernandesBR avatar May 15 '24 14:05 Gui-FernandesBR

Hey! I noticed quite a lot of important documentation was missing so I took the liberty to change a lot of the usage notebook

It is still very far from being good I think, but I hope things are better explained now. It is very important that someone reviews what I wrote in there carefully, and improves on anything that needs it

There are still a couple of problems I found

  • When you write the name of a stochastic object in the last line of a cell in the notebook, it gets the __repr__ of the class, which is not very useful. I would say leaving the __repr__ as what now is the __str__ would be really important
  • The __str__ of a StochasticRocket with added motor, aerosurfaces, and/or parachutes is a mess right now
  • Still need to add an explanation on how to add an image to dispersion.plot.ellipses and how to translate that image in the plot
  • Need to add an explanation of how to set the export_list of the MonteCarlo class and how these are saved on the files. Also, still needs to add docs explaining what the default export_list is and what are the possible exportable variables
  • The export_ellipses_to_kml raises an error for me

MateusStano avatar May 16 '24 20:05 MateusStano

Hey! I noticed quite a lot of important documentation was missing so I took the liberty to change a lot of the usage notebook

It is still very far from being good I think, but I hope things are better explained now. It is very important that someone reviews what I wrote in there carefully, and improves on anything that needs it

https://www.productplan.com/glossary/minimum-viable-product/

There are still a couple of problems I found

  • When you write the name of a stochastic object in the last line of a cell in the notebook, it gets the __repr__ of the class, which is not very useful. I would say leaving the __repr__ as what now is the __str__ would be really important

I think I temporarily addressed this issue here: 521a8e8 Can be improve in future PRs.

  • The __str__ of a StochasticRocket with added motor, aerosurfaces, and/or parachutes is a mess right now The problem is that a __str__ ideally is not so complex as we have today. You can see as an example the Function class, which has a really simple "str". If you want so badly a method to visualize what is inside a StochasticModel, then we should probably add a method: "your_stochastic_object.visualize_variables()" or similar names.
  • Still need to add an explanation on how to add an image to dispersion.plot.ellipses and how to translate that image in the plot

Can be done in the future. I don't really know if we should allow the user to input an image or simply download an satellite image at the moment we run the function. Let it to the future.

  • Need to add an explanation of how to set the export_list of the MonteCarlo class and how these are saved on the files. Also, still needs to add docs explaining what the default export_list is and what are the possible exportable variables

This requires time and effort, let's leave it to the next PR of documentation.

  • The export_ellipses_to_kml raises an error for me

What error? How can we reproduce it? ... It is running flawlessly in my machine, including the tests. Also, this is an extra feature. Can be fixed in another PR.

Gui-FernandesBR avatar May 21 '24 04:05 Gui-FernandesBR

Ok, thank you for the review and the updates in the .ipynb file, @MateusStano .

As agreed, let's follow through this PR, approve and merge it. I'm working on documentation improvements and this will be addressed in a new PR.

Gui-FernandesBR avatar May 21 '24 04:05 Gui-FernandesBR

Awesome! This is finally ready to be merged. Probably the biggest PR we ever had, and we never make anything as big as this again heheheh

MateusStano avatar May 21 '24 20:05 MateusStano