stable-baselines3 issues

[Bug] `EpisodicLifeEnv.reset()` may raise `Monitor.step()` RuntimeError

2

### 🐛 Bug When `EpisodicLifeEnv` triggers a reset due to the end of lives, it takes a no-op action to "restart" the game. This no-op action may cause the actual...

luizapozzobon

bug

High reward in training and low reward in evaluation

3

### Question When I use Stable-Baselines3 with my custom environment, I find that even though the reward in training is pretty high, the reward in evaluation is low....

Entongsu

question

custom gym env

more information needed

[Feature Request] An option to collect rollout for n_episodes instead of n_steps

2

### 🚀 Feature An option to collect rollout for n_episodes instead of n_steps for on-policy algorithms. ### Motivation Some environments, like games, have the most important reward at the...

CppMaster

enhancement

[Feature Request] Add logger.close to StopTrainingOnMaxEpisodes

2

### 🚀 Feature Add logger.close() to the StopTrainingOnMaxEpisodes class. ### Motivation While working with this amazing tool, I had some problems training models with a large number of timesteps, so the StopTrainingOnMaxEpisodes...

vcadillog

enhancement

[Feature Request] Add type checking to tests

3

### 🚀 Feature This issue is to discuss the possibility of including tests in the type checking CI pipeline. ### Motivation Currently, tests are not being type checked, which means...

Rocamonde

enhancement

[Feature Request] Save multiple best models in EvalCallback

5

### 🚀 Feature Currently, EvalCallback seems to save only one best_model. Add support for saving multiple best_models (say 5). ### Motivation In my own environment, from the eval curves in...

xibeisiber

enhancement

[Bug] Too general a return type for load and learn in BaseAlgorithm

4

### 🐛 Bug The return type of methods `.load()` and `.learn()` in `BaseAlgorithm` is annotated as `"BaseAlgorithm"`, which means that for any subclass that does not override the methods with...

Rocamonde

enhancement
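
To illustrate the typing concern raised in this issue, here is a minimal sketch of how a bound `TypeVar` lets `load()` and `learn()` return the subclass type rather than the base type. The classes `BaseAlgo` and `MyAlgo`, the attributes they set, and the file name are hypothetical stand-ins for illustration; this is not stable-baselines3's code and not necessarily the fix the maintainers chose.

```python
from typing import Type, TypeVar

# Hypothetical type variable bound to the base class, so a type checker
# infers the concrete subclass as the return type of learn() and load().
SelfAlgo = TypeVar("SelfAlgo", bound="BaseAlgo")


class BaseAlgo:
    """Hypothetical stand-in for a base algorithm class."""

    def learn(self: SelfAlgo, total_timesteps: int) -> SelfAlgo:
        # Record the number of steps so the example has observable state.
        self.trained_steps = getattr(self, "trained_steps", 0) + total_timesteps
        return self

    @classmethod
    def load(cls: Type[SelfAlgo], path: str) -> SelfAlgo:
        # No file is read here; the path is only stored for illustration.
        model = cls()
        model.source_path = path
        return model


class MyAlgo(BaseAlgo):
    """Hypothetical subclass that does not override learn() or load()."""

    def extra(self) -> str:
        return "subclass-only method"


# With a plain "BaseAlgo" return annotation, a type checker would reject
# the call to .extra(); with the bound TypeVar it is inferred as MyAlgo.
model = MyAlgo.load("model.zip").learn(total_timesteps=100)
print(model.extra())
```

At runtime both annotation styles behave the same; the difference only shows up in static type checking of subclass-specific attributes.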

Feat/dropq

3

## Description ## Motivation and Context - [ ] I have raised an issue to propose this change ([required](https://github.com/DLR-RM/stable-baselines3/blob/master/CONTRIBUTING.md) for new features and bug fixes) ## Types of changes -...

araffin

[Feature Request] Reducing memory consumption when using HerReplayBuffer

2

### 🚀 Feature - In `HerReplayBuffer`, initialize `self._buffer` considering the dtype of each input ### Motivation In the implementation of `HerReplayBuffer`, `self._buffer` is initialized with zeros of `np.float32`, which may lead...

TMats

duplicate

enhancement

[Question] HER applied on GoalEnv with ObservationWrapper

5

### Question I am using stable-baselines3's implementation of HER with a custom environment, but I ran into problems in the reward computation step. The Gym environment is based on `GoalEnv`,...

ritalaezza

documentation

question

custom gym env
