stable-baselines3
PyTorch version of Stable Baselines, reliable implementations of reinforcement learning algorithms.
### ❓ Question The following code samples action for an off-policy algorithm. As the comments indicate, the continuous actions obtained in line 395 should have already been scaled by tanh,...
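To make the tanh scaling mentioned in this question concrete, here is a minimal pure-Python sketch. It is not the library's sampling code. It assumes a one-dimensional action with hypothetical bounds `low` and `high`; the helper names `squash` and `rescale_to_bounds` are invented for illustration. It shows only the general idea of squashing a raw value into [-1, 1] with tanh and then mapping it linearly onto the action bounds.

```python
import math


def squash(raw_action: float) -> float:
    """Map an unbounded value into [-1, 1] with tanh."""
    return math.tanh(raw_action)


def rescale_to_bounds(squashed: float, low: float, high: float) -> float:
    """Linearly map a value in [-1, 1] onto [low, high]."""
    return low + 0.5 * (squashed + 1.0) * (high - low)


# Hypothetical bounds for a one-dimensional continuous action space.
low, high = -2.0, 2.0

# A raw (pre-squash) value, e.g. a sample from a Gaussian policy head.
raw = 0.0
env_action = rescale_to_bounds(squash(raw), low, high)
```

If an action has already been squashed by tanh, applying `squash` a second time would distort it, which is why it matters where in the pipeline the scaling happens.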
Co-authored-by: Riccardo Sepe Co-authored-by: Francesco Scalera Added missing metrics when logging on tensorboard (#1298) ## Description Now both the hparam_dict and the metric_dict are stored on Tensorboard ## Motivation and...
### ❓ Question How do I export RecurrentPPO as an onnx model? ### Checklist - [X] I have checked that there is no similar [issue](https://github.com/DLR-RM/stable-baselines3/issues) in the repo - [X]...
## Description After https://github.com/DLR-RM/rl-baselines3-zoo/pull/355#issuecomment-1425749593 ## Motivation and Context - [ ] I have raised an issue to propose this change ([required](https://github.com/DLR-RM/stable-baselines3/blob/master/CONTRIBUTING.md) for new features and bug fixes) ## Types of...
## Description - Added next_observations field and has_next_observation mask to type_aliases.py - Extended generators in buffers.py to return next_observation and has_next_observation - Added a test in test_buffers.py ## Motivation and...
### 🚀 Feature Support graph-style data structures as observation and action spaces for RL algorithms such as PPO. ### Motivation After [version 0.25.0](https://github.com/openai/gym/releases/tag/0.25.0), [gym](https://github.com/openai/gym) supports [graph](https://www.gymlibrary.dev/api/spaces/#graph)...
### ❓ Question I want to train on multiple volumes. To do that, I am using a for loop and changing the image in the environment on each iteration ``` from stable_baselines3...
### 🚀 Feature Add `next_observations` and `dones` fields to the `RolloutBuffer` and the `DictRolloutBuffer` classes, similar to how it is done in the `ReplayBuffer` class. ### Motivation Currently, on-policy algorithms...
## Description This PR adds the `next_observations` and `dones` fields to the `RolloutBuffer` and the `DictRolloutBuffer` classes. The `OnPolicyAlgorithm` class is also changed to store both these fields. Closes #1273....
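As a rough illustration of what storing `next_observations` and `dones` alongside each transition means, the sketch below uses a small stand-in class. `TinyRolloutBuffer` and its methods are hypothetical and are not the stable-baselines3 `RolloutBuffer`; scalar observations and integer actions are assumed to keep the example short.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Tuple


@dataclass
class TinyRolloutBuffer:
    """Illustrative on-policy buffer; NOT the stable-baselines3 RolloutBuffer."""

    observations: List[float] = field(default_factory=list)
    actions: List[int] = field(default_factory=list)
    rewards: List[float] = field(default_factory=list)
    next_observations: List[float] = field(default_factory=list)
    dones: List[bool] = field(default_factory=list)

    def add(self, obs: float, action: int, reward: float,
            next_obs: float, done: bool) -> None:
        """Store one full transition, including the next observation."""
        self.observations.append(obs)
        self.actions.append(action)
        self.rewards.append(reward)
        self.next_observations.append(next_obs)
        self.dones.append(done)

    def get(self) -> Iterator[Tuple[float, int, float, float, bool]]:
        """Yield stored transitions in insertion order."""
        yield from zip(self.observations, self.actions, self.rewards,
                       self.next_observations, self.dones)


buffer = TinyRolloutBuffer()
buffer.add(obs=0.0, action=1, reward=0.5, next_obs=0.1, done=False)
buffer.add(obs=0.1, action=0, reward=1.0, next_obs=0.2, done=True)
transitions = list(buffer.get())
```

With both fields stored, each yielded tuple is a complete `(obs, action, reward, next_obs, done)` transition, so code that consumes the buffer does not need to reconstruct the next observation from the following row.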
### 🚀 Feature Check the environment when creating a `VecEnv` ### Motivation I noticed that [`check_env`](https://github.com/DLR-RM/stable-baselines3/blob/2bb8ef5e632a0e0dda291c2cd6735da75a4fcb7e/stable_baselines3/common/env_checker.py#L319) doesn't work with `VecEnv`'s (#653), but I think it would be a good idea...