stable-baselines3-contrib issues

IQN

4

## Description ## Context - [ ] I have raised an issue to propose this change ([required](https://github.com/Stable-Baselines-Team/stable-baselines3-contrib/blob/master/CONTRIBUTING.md)) ## Types of changes - [ ] Bug fix (non-breaking change which fixes...

qgallouedec

Possible issue with Maskable PPO

4

**Describe the bug** I've been trying to troubleshoot why my MPPO training is very slow (50it/s) when PPO on breakout is ~750it/s. The speed degradation only happens with MPPO when...

emrul

Custom Environment

6

custom wrapper around the environment that tracks when invalid actions are masked and modifies the information sent to the agent accordingly

Zaibali9999

more information needed

RTFM

custom gym env

timestamp to episode in documentation

## Description ## Context - [ ] I have raised an issue to propose this change ([required](https://github.com/Stable-Baselines-Team/stable-baselines3-contrib/blob/master/CONTRIBUTING.md)) ## Types of changes - [ ] Bug fix (non-breaking change which fixes...

theSquaredError

trying to mask actions for an environment with dict observation and multidiscrete action space

4

**Important Note: We do not do technical support, nor consulting** and don't answer personal questions per email. Please post your question on [reddit](https://www.reddit.com/r/reinforcementlearning/) or [stack overflow](https://stackoverflow.com/) in that case. If...

zbenmo

DuelingDQN

10

## Description ## TODOs - [ ] Make learning curves - [ ] Complete DuelingDQN short presentation - [ ] Fill the results table - [ ] Results validation against...

qgallouedec

[Feature request] Implement OT-TRPO

3

Hi, we developed and tested our algorithm OT-TRPO (to be published at the upcoming NeurIPS 2022; you can find the preprint [here](https://arxiv.org/abs/2210.11137)) using stable baselines. Is there an interest in integrating it with...

antonioterpin

enhancement

Fix `stable_baselines3/common/distributions.py` type hint

## Description Handles the breaking changes from https://github.com/DLR-RM/stable-baselines3/pull/1190 ## Context - [ ] I have raised an issue to propose this change ([required](https://github.com/Stable-Baselines-Team/stable-baselines3-contrib/blob/master/CONTRIBUTING.md)) ## Types of changes - [ ]...

qgallouedec

[Question] Recurrent Maskable PPO ?!? Rudder ?!?

### ❓ Question Hello, I am running a lot of tests for FinRL, and using an LSTM/GRU in an ActorCritic policy was one of my first ideas, but I saw now...

tty666

question

[Question] What is the difference between old_distribution and distribution in train function of TRPO

1

### ❓ Question 🙏 Thanks for the high scalability of sb3-contrib. I am referring to the MaskablePPO method to add a mask to TRPO. And in the train function of...

0Addicted0

question
