stable-baselines3-contrib
Contrib package for Stable-Baselines3 - Experimental reinforcement learning (RL) code
IQN
## Description ## Context - [ ] I have raised an issue to propose this change ([required](https://github.com/Stable-Baselines-Team/stable-baselines3-contrib/blob/master/CONTRIBUTING.md)) ## Types of changes - [ ] Bug fix (non-breaking change which fixes...
**Describe the bug** I've been trying to troubleshoot why my MPPO training is very slow (50 it/s) when PPO on Breakout runs at ~750 it/s. The speed degradation only happens with MPPO when...
custom wrapper around the environment that tracks when invalid actions are masked and modifies the information sent to the agent accordingly
## Description ## Context - [ ] I have raised an issue to propose this change ([required](https://github.com/Stable-Baselines-Team/stable-baselines3-contrib/blob/master/CONTRIBUTING.md)) ## Types of changes - [ ] Bug fix (non-breaking change which fixes...
**Important Note: We do not do technical support, nor consulting** and don't answer personal questions per email. Please post your question on [reddit](https://www.reddit.com/r/reinforcementlearning/) or [stack overflow](https://stackoverflow.com/) in that case. If...
DuelingDQN
## Description ## TODOs - [ ] Make learning curves - [ ] Complete DuelingDQN short presentation - [ ] Fill in the results table - [ ] Results validation against...
Hi, we developed and tested our algorithm OT-TRPO (to be published at the upcoming NeurIPS 2022; you can find the preprint [here](https://arxiv.org/abs/2210.11137)) using Stable Baselines. Is there any interest in integrating it with...
## Description Handles the breaking changes from https://github.com/DLR-RM/stable-baselines3/pull/1190 ## Context - [ ] I have raised an issue to propose this change ([required](https://github.com/Stable-Baselines-Team/stable-baselines3-contrib/blob/master/CONTRIBUTING.md)) ## Types of changes - [ ]...
### ❓ Question Hello, I am running a lot of tests for FinRL, and using an LSTM/GRU in an ActorCritic policy was one of my first ideas, but I saw now...
### ❓ Question 🙏 Thanks for the high scalability provided by sb3-contrib. I am using the MaskablePPO method as a reference to add a mask to TRPO. In the train function of...