stable-baselines3
PyTorch version of Stable Baselines, reliable implementations of reinforcement learning algorithms.
### 🐛 Bug Given a wrapped env, options passed in the [recommended way](https://stable-baselines3.readthedocs.io/en/master/guide/vec_envs.html) (`wrapped_env.set_options`) are ignored when the reset is triggered by episode termination / truncation in `step_wait` of the env...
### 🐛 Bug I have a `spaces.Discrete(19, 2)` space as part of my observation space. The [documentation](https://gymnasium.farama.org/api/spaces/fundamental/#gymnasium.spaces.Discrete) for the Discrete space lists the space of possible values as {a, a+1, ...,...
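To make the value set in the report above concrete, here is a minimal pure-Python sketch. It assumes the `2` in `spaces.Discrete(19, 2)` is meant as the start offset `a` (the truncated report does not confirm this), and `discrete_values` is a hypothetical helper, not part of Gymnasium or SB3.

```python
def discrete_values(n: int, start: int = 0) -> list[int]:
    """Enumerate a Discrete-style space: {start, start + 1, ..., start + n - 1}.

    Hypothetical helper for illustration only; not a Gymnasium or SB3 function.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    return list(range(start, start + n))


# Assumption: 19 values with start offset 2.
values = discrete_values(19, start=2)
print(values[0], values[-1], len(values))  # 2 20 19
```

Under that reading, the largest valid value is 20 rather than 18, so code that treats the observation as a zero-based index into 19 slots would see out-of-range values.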
Currently the SB3 tensorboard writer only supports `torch.Tensor` as a histogram value. However, the `SummaryWriter` also accepts `np.ndarray` as a value. This PR adds that support. ## Motivation and Context...
Attempt to fix #1770 in a fully backward compatible manner. ## Description In PPO, the optimizer in the policy is created before the computation-device for the class is correctly set....
### ❓ Question Many thanks for the great library! I have been trying out gSDE lately, which seems to be working well for my problem, but I have found that...
Hi, thanks a lot for the well-documented stable baselines3. Now I am using Isaac Gym Preview4. May I ask if it is possible to give some examples to wrap IsaacGymEnvs...
## Description - Removed `log_interval` argument from `collect_rollouts` in `OffPolicyAlgorithm` - Added logging in `learn` of `OffPolicyAlgorithm` instead - Fixed documentation of the functionality of `log_interval` - Swapped logging and...
### 🚀 Feature It would be nice to have a wrapper that ingested gymnasium.vector.VectorEnv and gave back a VecEnv. ### Motivation I want to do highly parallelized hardware accelerated simulation....
### ❓ Question I think I do not understand the memory usage of SB3. I have a Dict observation space of some huge matrices, so my observation space is 17MB...
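A rough back-of-the-envelope sketch of why observations of that size add up quickly. This is not SB3's actual memory accounting: the number of stored observations is a hypothetical value, and `buffer_bytes` is an illustrative helper that only gives a lower bound for any buffer that keeps every observation in memory.

```python
def buffer_bytes(obs_bytes: int, n_stored: int, copies_per_transition: int = 1) -> int:
    """Lower bound on the memory needed to keep n_stored observations.

    copies_per_transition lets you account for buffers that keep more than
    one observation per transition (an assumption; set it to match your setup).
    """
    return obs_bytes * n_stored * copies_per_transition


MB = 1024**2
obs_bytes = 17 * MB  # observation size quoted in the question

# Hypothetical buffer holding 10,000 observations.
total = buffer_bytes(obs_bytes, n_stored=10_000)
print(f"{total / 1024**3:.1f} GiB")  # 166.0 GiB
```

Even a modest number of stored observations exceeds typical RAM at 17MB each, so the buffer size is usually the first thing to check.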
### 🚀 Feature A graph feature extractor for the new observation space that Gymnasium provides. ### Motivation I've been using a Graph2Vec approximation using GraphTransformers and CLS tokens on a non-publishable...