stable-baselines3

PyTorch version of Stable Baselines, reliable implementations of reinforcement learning algorithms.

Results 192 stable-baselines3 issues

### 🐛 Bug I was training and used VecVideoRecorder to save some episodes. I saw that RAM usage kept increasing until crashing. I tried testing it to see what's happening...

question

Slightly increases numerical precision and speed. Ideally, we'd put this into a torch.compile-d block, but it's not clear whether SB3 wants to support/use compile.

### 🐛 Bug I have a problem that involves the use of graphs in the observation space and I'd like to use pytorch geometric to train a GNN for my...

documentation
duplicate
help wanted
custom gym env

### 🐛 Bug Cannot load a saved compiled model. Issue linked to: https://github.com/DLR-RM/stable-baselines3/issues/1438 ### To Reproduce ```python import gym import torch as th from stable_baselines3 import PPO env = gym.make("Pendulum-v1")...

bug
documentation
help wanted

## Description Added an optional `BatchNorm` integration to the `NatureCNN` architecture used in the feature extractor module of Stable-Baselines3. This enhancement introduces a `use_batch_norm` flag to toggle Batch Normalization after...

### 🚀 Feature Optional BatchNorm integration in NatureCNN ### Motivation Batch Normalization helps stabilize and accelerate training by reducing internal covariate shift, which is especially important in high-variance pixel-based...

enhancement

### 🐛 Bug ## Description When using `SubprocVecEnv` with multiple environments, all subprocesses ignore the GPU device specified for the main process and default to GPU 0, regardless of which...

custom gym env
openai gym

Implement a risk-sensitive PPO algorithm that optimizes an exponential criterion
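
The PR summary does not say which exponential criterion it optimizes. As a rough illustration only, the sketch below assumes the common entropic (exponential-utility) form, (1/β)·log E[exp(β·R)], estimated from sampled episode returns. The function name `exponential_criterion` and the parameter `beta` are hypothetical and are not taken from the PR or from the SB3 API.

```python
import math


def exponential_criterion(returns, beta):
    """Sample estimate of (1/beta) * log(mean(exp(beta * R))).

    Assumption: this is the standard entropic risk measure, not
    necessarily the exact objective used in the PR.
    beta < 0 is risk-averse, beta > 0 is risk-seeking, and beta == 0
    falls back to the plain mean return.
    """
    if not returns:
        raise ValueError("returns must be non-empty")
    if beta == 0:
        return sum(returns) / len(returns)
    scaled = [beta * r for r in returns]
    # Subtract the max before exponentiating (log-sum-exp trick)
    # so large returns do not overflow.
    m = max(scaled)
    log_mean_exp = m + math.log(sum(math.exp(s - m) for s in scaled) / len(scaled))
    return log_mean_exp / beta


if __name__ == "__main__":
    episode_returns = [0.0, 10.0]
    print(exponential_criterion(episode_returns, beta=-1.0))  # below the mean of 5
    print(exponential_criterion(episode_returns, beta=1.0))   # above the mean of 5
```

With a negative `beta` the estimate is pulled toward the worst returns, and with a positive `beta` toward the best, which is what makes the criterion risk-sensitive under this assumption.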

PR template not filled

## Description ``` python -m timeit -s "import numpy as np; a=np.ones(4096)" "a.nonzero()[0]" ``` ``` 10000 loops, best of 5: 12.4 usec per loop ``` vs ``` python -m timeit...

### 🐛 Bug MRC 👇 ```python from stable_baselines3.common.preprocessing import is_image_space import gymnasium as gym import numpy as np from gymnasium.wrappers import FrameStackObservation image_space = gym.spaces.Box(0, 255, (3, 64, 64), np.uint8)...

bug
documentation
help wanted