stable-baselines3
PyTorch version of Stable Baselines, reliable implementations of reinforcement learning algorithms.
### 🐛 Bug I was training and used `VecVideoRecorder` to save some episodes. I saw that RAM usage kept increasing until crashing. I tried testing it to see what's happening...
Slightly increases numerical precision and speed. Ideally, we'd put this into a `torch.compile`-d block, but it's not clear whether SB3 wants to support/use compile.
### 🐛 Bug I have a problem that involves the use of graphs in the observation space and I'd like to use PyTorch Geometric to train a GNN for my...
### 🐛 Bug Cannot load a saved compiled model. Issue linked to: https://github.com/DLR-RM/stable-baselines3/issues/1438 ### To Reproduce ```python import gym import torch as th from stable_baselines3 import PPO env = gym.make("Pendulum-v1")...
## Description Added an optional `BatchNorm` integration to the `NatureCNN` architecture used in the feature extractor module of Stable-Baselines3. This enhancement introduces a `use_batch_norm` flag to toggle Batch Normalization after...
### 🚀 Feature Optional BatchNorm integration in NatureCNN ### Motivation Batch Normalization helps stabilize and accelerate training by reducing internal covariate shift, which is especially important in high-variance pixel-based...
### 🐛 Bug ## Description When using `SubprocVecEnv` with multiple environments, all subprocesses ignore the GPU device specified for the main process and default to GPU 0, regardless of which...
Implement a risk-sensitive PPO algorithm that optimizes an exponential criterion
## Description ``` python -m timeit -s "import numpy as np; a=np.ones(4096)" "a.nonzero()[0]" ``` ``` 10000 loops, best of 5: 12.4 usec per loop ``` vs ``` python -m timeit...
### 🐛 Bug MRC ```python from stable_baselines3.common.preprocessing import is_image_space import gymnasium as gym import numpy as np from gymnasium.wrappers import FrameStackObservation image_space = gym.spaces.Box(0, 255, (3, 64, 64), np.uint8)...