stable-baselines3
Plotting Documentation
📚 Documentation
The plotting module is currently not documented. We should add an example in the doc + probably link to the zoo for more advanced examples. We need to document:
- [ ] plotting training reward/success (using monitor file)
- [ ] plotting evaluation reward/success (maybe just a link to the zoo)
### Checklist
- [x] I have read the documentation (required)
- [x] I have checked that there is no similar issue in the repo (required)
@araffin A single training run of an RL agent generates several monitor files (in my case, 3). Why is that? In other words, what does stable_baselines3.common.monitor.load_results do with them? I can't figure this out just by reading code.
Also, not sure what you mean by "The plotting module is currently not documented." As you mentioned in rl-baselines3-zoo's README, there are examples given to reproduce each algorithm's performance in SB3's doc?
> generates several monitor files (in my case, 3). Why is that?
That's normal: there is one monitor file per environment, to avoid any race condition (as in https://github.com/DLR-RM/stable-baselines3/issues/647).
> In other words, what does stable_baselines3.common.monitor.load_results do with them?
It should load all of them and re-order the episodes by time: https://github.com/DLR-RM/stable-baselines3/blob/e75e1de4c127747527befc131d143361eddddae3/stable_baselines3/common/monitor.py#L235
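To make the "load and re-order by time" step concrete, here is a minimal standard-library sketch of the idea. It is not SB3's actual implementation (which returns a pandas DataFrame); `merge_monitor_logs` is a hypothetical helper. It assumes the usual monitor file layout: a first line made of `#` followed by JSON metadata containing `t_start`, then a CSV with columns `r` (reward), `l` (length) and `t` (seconds since `t_start`).

```python
import csv
import io
import json


def merge_monitor_logs(file_contents):
    """Merge the text of several monitor files into one time-ordered episode list.

    Hypothetical helper for illustration only; expects at least one log.
    """
    episodes = []
    start_times = []
    for content in file_contents:
        handle = io.StringIO(content)
        # First line: '#' followed by JSON metadata (includes "t_start")
        header = json.loads(handle.readline()[1:])
        start_times.append(header["t_start"])
        for row in csv.DictReader(handle):
            episodes.append(
                {
                    "r": float(row["r"]),
                    "l": int(row["l"]),
                    # Convert per-file relative time to absolute wall-clock time
                    "t": float(row["t"]) + header["t_start"],
                }
            )
    # Interleave episodes from all environments by the time they finished
    episodes.sort(key=lambda ep: ep["t"])
    # Express time relative to the earliest environment start
    first_start = min(start_times)
    for ep in episodes:
        ep["t"] -= first_start
    return episodes
```

With one log per environment, the merged list reads as a single training run: episodes from different environments are interleaved in the order in which they ended.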
> Also, not sure what you mean by "The plotting module is currently not documented."
There is currently no documentation on how to use this https://github.com/DLR-RM/stable-baselines3/blob/master/stable_baselines3/common/results_plotter.py
> As you mentioned in rl-baselines3-zoo's README, there are examples given to reproduce each algorithm's performance in SB3's doc?
Yes, most of the documentation will in fact refer to the zoo.
Hi @qgallouedec. Can I start with this? Can you provide some help regarding this?
> Can I start with this? Can you provide some help regarding this?
Yes please =). The best starting point is https://github.com/DLR-RM/rl-baselines3-zoo/blob/master/rl_zoo3/plots/plot_train.py for the training plots. We could also document how to group multiple runs (depending on how complex it is, I think seaborn might be able to do that for us).
For plotting evaluation, you can also take a look at https://github.com/DLR-RM/rl-baselines3-zoo/tree/master/rl_zoo3/plots, but the code is much more complex because it handles many cases, so I would just link to the RL Zoo in that case.
@theSquaredError are you still working on this?
@araffin Hello sir, I was busy with something. I will soon start this.
@theSquaredError Are you still planning to work on this issue? Otherwise, I would like to have a look at it :)
You can have a look. I won't be able to work on this right now.
Hey,
I was looking into that topic and it is not entirely clear to me what exactly is to be used for what. I'll summarize it before I start working on the documentation itself.
1. The results_plotter
The results_plotter (https://github.com/DLR-RM/stable-baselines3/blob/master/stable_baselines3/common/results_plotter.py) can be used to create plots based on the monitor files created while training.
2. The RL Zoo Train Plot Script
The RL Zoo train plot script (https://github.com/DLR-RM/rl-baselines3-zoo/blob/master/rl_zoo3/plots/plot_train.py) does a similar thing to the results_plotter, but in RL Zoo. It uses some functions of the results_plotter and implements some functionality itself.
3. The RL Zoo Evaluation Scripts
The RL Zoo evaluation scripts are all_plots.py (https://github.com/DLR-RM/rl-baselines3-zoo/blob/master/rl_zoo3/plots/all_plots.py) and plot_from_file.py (https://github.com/DLR-RM/rl-baselines3-zoo/blob/master/rl_zoo3/plots/plot_from_file.py). These scripts can be used for plotting evaluations.
As I have understood it so far, I would describe the use of the results_plotter for training plots in the SB3 docs using an example here and simply refer to the RL Zoo documentation for the evaluation plots.
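As a rough idea of what such a training-plot example could look like, here is a sketch built on `Monitor` and `results_plotter.plot_results`. The function name `train_and_plot`, the log directory, the environment (`CartPole-v1`), the algorithm (PPO) and the timestep budget are all illustrative assumptions. It also assumes a recent SB3 version that uses `gymnasium`.

```python
def train_and_plot(log_dir="/tmp/sb3_monitor/", total_timesteps=10_000):
    """Train PPO on CartPole-v1 behind a Monitor wrapper, then plot the training reward.

    Illustrative sketch only; all defaults are assumptions.
    """
    # Third-party imports are kept inside the function so that they are
    # only required when the example is actually run.
    import os

    import gymnasium as gym
    import matplotlib.pyplot as plt
    from stable_baselines3 import PPO
    from stable_baselines3.common import results_plotter
    from stable_baselines3.common.monitor import Monitor

    os.makedirs(log_dir, exist_ok=True)

    # Monitor logs episode reward, length and time to <log_dir>/monitor.csv
    env = Monitor(gym.make("CartPole-v1"), log_dir)
    model = PPO("MlpPolicy", env, verbose=0)
    model.learn(total_timesteps=total_timesteps)
    env.close()

    # Plot episode reward against the number of training timesteps
    results_plotter.plot_results(
        [log_dir], total_timesteps, results_plotter.X_TIMESTEPS, "PPO CartPole-v1"
    )
    plt.show()
```

Calling `train_and_plot()` trains for a short time and opens a matplotlib window. `results_plotter.X_EPISODES` or `results_plotter.X_WALLTIME` can be passed instead of `X_TIMESTEPS` to change the x-axis.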
Hello,
> I'll summarize it before I start working on the documentation itself.
thanks =)
your description is right:
- the `results_plotter` is mostly interesting for its `ts2xy` and `window_func`; it is meant to be used as a basic building block for plotting training curves (not evaluation ones)
- the RL Zoo train plot lets you quickly visualize training reward/success; it is not meant for reporting, and is built around the `results_plotter`
- the RL Zoo `all_plots` and `plot_from_file` are the two pieces that are meant for aggregating and reporting results; they have many features, but the code needs refactoring, and one usually needs to adapt it to one's own needs. They are the recommended scripts for plotting/comparing things (in addition to the plots that are available via W&B)
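To illustrate the two building blocks mentioned in the first bullet, here is a standard-library sketch of what they do conceptually: turning per-episode data into (x, y) pairs with timesteps on the x-axis, and smoothing the rewards with a sliding window. The names `episodes_to_timesteps_xy` and `rolling_mean` are hypothetical stand-ins, not SB3 functions; the real `ts2xy` and `window_func` operate on pandas/NumPy data and have different signatures.

```python
from itertools import accumulate
from statistics import fmean


def episodes_to_timesteps_xy(episodes):
    """Return x = cumulative timesteps at the end of each episode, y = episode reward."""
    x = list(accumulate(ep["l"] for ep in episodes))
    y = [ep["r"] for ep in episodes]
    return x, y


def rolling_mean(x, y, window):
    """Average y over a sliding window of `window` episodes.

    x is trimmed to the positions where a full window is available,
    so both returned lists have the same length.
    """
    if window < 1 or len(y) < window:
        return [], []
    smoothed = [fmean(y[i - window + 1 : i + 1]) for i in range(window - 1, len(y))]
    return x[window - 1 :], smoothed
```

For example, three episodes of lengths 10, 20 and 30 give x-values 10, 30 and 60; with a window of 2, the smoothed curve starts at the second episode.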