cleanrl
High-quality single-file implementations of Deep Reinforcement Learning algorithms with research-friendly features (PPO, DQN, C51, DDPG, TD3, SAC, PPG)
## Description Closes #127 ## Types of changes - [ ] Bug fix - [ ] New feature - [ ] New algorithm - [x] Documentation ## Checklist: - [x]...
## Problem Description PyTorch DQN fails on MountainCar. Try two settings in [the issue](https://github.com/vwxyzjn/cleanrl/issues/156) ## Checklist - [x] I have installed dependencies via `poetry install` (see [CleanRL's installation guideline](https://docs.cleanrl.dev/get-started/installation/)....
## Problem Description A much-requested addition to the library is unit tests. Among other things, "the key benefit of unit tests is to make sure the logic...
## Problem Description Hi I would like to add the double DQN algorithm to cleanrl. Can someone give me the go-ahead?
## Description JAX implementation for C51 Implementation for #221 ## Types of changes - [ ] Bug fix - [ ] New feature - [x] New algorithm - [ ]...
Hi, I'm a PhD student doing work in hierarchical reinforcement learning (specifically [Option-critic-based algorithms](https://arxiv.org/abs/1709.04571)), and I've found this repository to be a particularly helpful starting point when trying to prototype...
## Description Add an implementation of the PPO-DNA algorithm for Atari Envpool. ### Paper reproduction (attempt) Here are the episodic rewards after 200M environment steps (50M environment interactions before frame skip), compared to...
`rnd_ppo.py` is a bit dated, and I recommend refactoring it to match the style of the other PPO files, which would include: - [x] change the name from `rnd_ppo.py` to `ppo_rnd.py` - [x] use...
## Problem Description PyTorch recently announced a universal job launcher called torchx (https://pytorch.org/torchx/latest/), which supports launching jobs on AWS Batch, Docker, Kubernetes, and more. We should adopt `torchx`, which perfectly...
TRPO is a famous and powerful tool in RL. Although it does not have many practical uses these days, it is very helpful for a learner to read a good...