stable-baselines3 issues

[Feature Request] Discounted Return value in policy evaluation

3

### 🚀 Feature Enable the user to save discounted return instead of accumulated reward in policy evaluation. ### Motivation Currently, EvalCallback and policy evaluation provide the user with accumulated rewards...

guyazran

enhancement
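
For readers unfamiliar with the distinction this request draws, here is a minimal sketch of accumulated reward versus discounted return for a single episode. It is plain Python and does not call any SB3 API; the function names `accumulated_reward` and `discounted_return` and the `gamma` parameter are illustrative assumptions, not part of the issue or the library.

```python
from typing import Sequence


def accumulated_reward(rewards: Sequence[float]) -> float:
    """Undiscounted sum of the rewards collected in one episode."""
    return float(sum(rewards))


def discounted_return(rewards: Sequence[float], gamma: float) -> float:
    """Discounted return: sum over t of gamma**t * rewards[t].

    Hypothetical helper for illustration only; it is not an SB3 function.
    """
    total = 0.0
    # Walk backwards so each step applies one more factor of gamma
    # to everything that comes after it.
    for reward in reversed(rewards):
        total = reward + gamma * total
    return total


episode_rewards = [1.0, 1.0, 1.0]
undiscounted = accumulated_reward(episode_rewards)
discounted = discounted_return(episode_rewards, gamma=0.5)
```

With `gamma=1.0` the two quantities coincide; with a smaller `gamma`, later rewards contribute less to the discounted value.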

[Question] Loading a trained PPO model in PyTorch without any SB3 dependencies

I created a custom gym environment and successfully trained a model using PPO. I then saved it using the model.save() method and got back a zip file (as per the...

SarvagyaVaish

question

Update Gymnasium to v1.0.0

20

This PR updates SB3 to Gymnasium v1.0, read the [release-notes](https://github.com/Farama-Foundation/Gymnasium/releases/tag/v1.0.0) to see all the changes. ## Motivation and Context Gymnasium is the core API used in SB3, therefore would be...

pseudo-rnd-thoughts

[Feature Request] Implementation of Lattice exploration (Chiappa et al., NeurIPS 2023)

3

### 🚀 Feature I propose to include in Stable Baselines 3 an option to use **Lattice** exploration, an action noise that some colleagues and I have presented in [this](https://arxiv.org/abs/2305.20065) NeurIPS...

albertochiappa

enhancement

When using custom environments in a loop that trains multiple models, CUDA VRAM is never deallocated when iteration is done.

3

### 🐛 Bug SB3 models never deallocate the VRAM they use in a process even if they are deleted. This can lead to hardware issues including full system crashes that...

john-woolley

custom gym env

PPO doesn't work with MultiDiscrete observation space

6

### 🐛 Bug I am implementing a simple custom environment for using PPO with MultiDiscrete observation space. It works if I use MultiDiscrete([ 5, 2, 2 ]), but when it...

elisavio

documentation

help wanted

custom gym env

Documentation/Implementation mismatch in sum_independent_dims function

2

### 📚 Documentation I've noticed a potential mismatch between the implementation and the documentation of the `sum_independent_dims` function in `stable_baselines3.common.distributions`. According to its docstring, the function is designed to handle...

RolandStolz

documentation

help wanted

[Feature Request] Saving rendered trajectories in evaluations

1

### 🚀 Feature 1. Save rendered images in `evaluate_policy()`. 2. Save rendered images in `EvalCallback()`. ### Motivation In a headless server, the simplest way to examine the **behavior** of a...

Roadsong

enhancement

Deadlock when running model.learn on a SubprocVecEnv

4

### 🐛 Bug When running model.learn on a SubprocVecEnv as follows: ` env = make_vec_env(ENV_ID, n_envs=cpus, vec_env_cls=SubprocVecEnv, vec_env_kwargs=dict(start_method="spawn")) model = SAC(**kwargs, env=env) model.learn(N_TIMESTEPS, callback=eval_callback) ` the program ends up in...

1-Bart-1

custom gym env

check the checklist

[Bug]: Iteration not updated in locals while learning

1

### 🐛 Bug In the method `stable_baselines3.common.on_policy_algorithm.OnPolicyAlgorithm.learn` the `iteration` value is not updated in the `locals` dictionary while using callbacks. ### To Reproduce ```python from stable_baselines3 import PPO def callback_function(v_locals,...

ericrwp

bug
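
The snippet above is truncated, so the actual cause is not shown here. As general background, the following plain-Python sketch shows one way a dictionary obtained from `locals()` can go stale: it records the values at the time of the call and is not refreshed when a local variable changes afterwards. Whether this is what happens inside `OnPolicyAlgorithm.learn` is an assumption, and `run_loop` is a hypothetical function, not SB3 code.

```python
def run_loop():
    """Hypothetical loop body; illustrates a stale locals() snapshot."""
    iteration = 0
    # Capture the local variables once, as a callback mechanism might.
    snapshot = locals()
    # Update the local variable without capturing locals() again.
    iteration += 1
    # The captured dictionary still holds the old value.
    return snapshot["iteration"], iteration


stale_value, current_value = run_loop()
```

In this sketch the captured value and the live variable diverge after the increment, which is the kind of mismatch the report describes for `iteration`.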
