Costa Huang
This PR is ready for review. Excited to push forward our efficient PPO implementation!
Hi @yooceii @kinalmehta, I have addressed most of your concerns. Please let me know if additional tweaks are needed.
FYI dopamine has a [benchmark](https://google.github.io/dopamine/baselines/atari/plots.html), but its x-axis is not environment steps... Any clue on how we can compare those results? @joaogui1
Hi @DavidSlayback, I apologize for getting back to you so late. I am a little confused. There seem to be 4 algorithms in the hyperlinks. Which are the ones that...
That makes sense. I’d suggest putting up a draft PR for better visibility, but only if you’re more comfortable that way.
Closed by #287
Hey @araffin I prototyped multi-GPU support with `torch.distributed` https://github.com/vwxyzjn/cleanrl/pull/162. Preliminary experiments seem successful when limiting torch to 1 thread per process and using SyncVecEnv: `ppo_atari_multigpu_batch_reduce.py` was able to obtain...
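For anyone skimming, the general idea is roughly the sketch below (illustrative only, not the actual code in the PR; the helper names here are made up): one process per GPU, torch pinned to a single thread, and gradients averaged across processes with `all_reduce` after `backward()`.

```python
import torch
import torch.distributed as dist


def setup(local_rank: int):
    # Limit each process to a single torch thread to avoid CPU oversubscription
    torch.set_num_threads(1)
    # Assumes the usual env:// rendezvous (RANK, WORLD_SIZE, MASTER_ADDR/PORT set by the launcher)
    dist.init_process_group(backend="nccl")
    torch.cuda.set_device(local_rank)


def allreduce_grads(model: torch.nn.Module):
    """Average gradients across all processes; call after loss.backward()."""
    world_size = dist.get_world_size()
    for p in model.parameters():
        if p.grad is not None:
            dist.all_reduce(p.grad, op=dist.ReduceOp.SUM)
            p.grad /= world_size
```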
I noticed the soon-to-be-deprecated `tencentcloud_kubernetes_as_scaling_group` has a place to specify `forward_balancer_ids`, but the new `tencentcloud_kubernetes_node_pool` does not seem to support `forward_balancer_ids`.
I think the cause of the bug is https://github.com/openai/gym/issues/3021#issuecomment-1212656372: they expect a `TimeLimit.truncated` key, and whether that key actually exists in `info` is what indicates truncation.
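For reference, downstream code typically handles that convention with something like this (an illustrative sketch, not the exact gym wrapper code; the helper name is hypothetical):

```python
# Sketch of the old gym TimeLimit convention: the wrapper only inserts the
# "TimeLimit.truncated" key into `info` when the episode is cut off by the
# step limit, so callers check the key's presence/value rather than assuming it.
def is_truncated(done: bool, info: dict) -> bool:
    return bool(done and info.get("TimeLimit.truncated", False))
```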
@yijiezh btw the `Monitor` issue is also mentioned in #3954 :)