cleanrl

High-quality single file implementation of Deep Reinforcement Learning algorithms with research-friendly features (PPO, DQN, C51, DDPG, TD3, SAC, PPG)

Results 115 cleanrl issues

## Description Closes #127 ## Types of changes - [ ] Bug fix - [ ] New feature - [ ] New algorithm - [x] Documentation ## Checklist: - [x]...

## Problem Description PyTorch DQN fails on MountainCar. Try two settings in [the issue](https://github.com/vwxyzjn/cleanrl/issues/156) ## Checklist - [x] I have installed dependencies via `poetry install` (see [CleanRL's installation guideline](https://docs.cleanrl.dev/get-started/installation/)....

## Problem Description A much requested inclusion in the library is to add unit tests. Among other things, "the key benefit of unit tests is to make sure the logic...

help wanted

## Problem Description Hi, I would like to add the Double DQN algorithm to cleanrl. Can someone give me the go-ahead?

## Description JAX implementation for C51 Implementation for #221 ## Types of changes - [ ] Bug fix - [ ] New feature - [x] New algorithm - [ ]...

Hi, I'm a PhD student doing work in hierarchical reinforcement learning (specifically [Option-critic-based algorithms](https://arxiv.org/abs/1709.04571)), and I've found this repository to be a particularly helpful starting point when trying to prototype...

## Description Add an implementation of the PPO-DNA algorithm for Atari Envpool. ### Paper reproduction (attempt) Here are the episodic rewards after 200M environment steps (50M environment interactions before frame skip), compared to...

`rnd_ppo.py` is a bit dated, and I recommend refactoring it to match the style of the other PPO implementations, which would include: - [x] change the name from `rnd_ppo.py` to `ppo_rnd.py` - [x] use...

## Problem Description PyTorch recently announced a universal job launcher called torchx (https://pytorch.org/torchx/latest/), which supports launching jobs on AWS Batch, Docker, k8s, and more. We should adopt `torchx`, which perfectly...

enhancement

TRPO is a famous and powerful tool in RL. Although it does not have many practical uses these days, it is very helpful for a learner to read a good...

enhancement
help wanted