Costa Huang
This PR is ready for review. Excited to push forward our efficient PPO implementation!
Hi @yooceii @kinalmehta, I have addressed most of your concerns. Please let me know if additional tweaks are needed.
FYI dopamine has a [benchmark](https://google.github.io/dopamine/baselines/atari/plots.html), but its x-axis is not environment steps... Any clue on how we can compare those results? @joaogui1
Hi @DavidSlayback, I apologize for getting back to you so late. I am a little confused. There seem to be 4 algorithms in the hyperlinks. Which are the ones that...
That makes sense. I’d suggest putting up a draft PR for better visibility, but only if you’re more comfortable that way.
Closed by #287
Hey @araffin I prototyped multi-GPU support with `torch.distributed` https://github.com/vwxyzjn/cleanrl/pull/162. Preliminary experiments seem successful when limiting torch to 1 thread per process and using SyncVecEnv: `ppo_atari_multigpu_batch_reduce.py` was able to obtain...
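For anyone skimming, the general idea is roughly the sketch below (illustrative only, not the actual code in the PR; the helper names here are made up): one process per GPU, torch pinned to a single thread, and gradients averaged across processes with `all_reduce` after `backward()`.

```python
import torch
import torch.distributed as dist


def setup(local_rank: int):
    # Limit each process to a single torch thread to avoid CPU oversubscription
    torch.set_num_threads(1)
    # Assumes the usual env:// rendezvous (RANK, WORLD_SIZE, MASTER_ADDR/PORT set by the launcher)
    dist.init_process_group(backend="nccl")
    torch.cuda.set_device(local_rank)


def allreduce_grads(model: torch.nn.Module):
    """Average gradients across all processes; call after loss.backward()."""
    world_size = dist.get_world_size()
    for p in model.parameters():
        if p.grad is not None:
            dist.all_reduce(p.grad, op=dist.ReduceOp.SUM)
            p.grad /= world_size
```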
I noticed the soon-to-be-deprecated `tencentcloud_kubernetes_as_scaling_group` has a place to specify `forward_balancer_ids`, but the new `tencentcloud_kubernetes_node_pool` does not seem to support `forward_balancer_ids`.
I think the cause of the bug is https://github.com/openai/gym/issues/3021#issuecomment-1212656372: they expect a `TimeLimit.truncated` key, and whether that key actually exists in `info` is what indicates truncation.
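For reference, downstream code typically handles that convention with something like this (an illustrative sketch, not the exact gym wrapper code; the helper name is hypothetical):

```python
# Sketch of the old gym TimeLimit convention: the wrapper only inserts the
# "TimeLimit.truncated" key into `info` when the episode is cut off by the
# step limit, so callers check the key's presence/value rather than assuming it.
def is_truncated(done: bool, info: dict) -> bool:
    return bool(done and info.get("TimeLimit.truncated", False))
```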
@yijiezh btw the `Monitor` issue is also mentioned in #3954 :)