Costa Huang

Results: 256 comments by Costa Huang

Hi @iamunr4v31, feel free to do it. Thanks for considering making a contribution. Please check out our [contribution guide](https://github.com/vwxyzjn/cleanrl/blob/master/CONTRIBUTING.md) for the usual process. The main things I am looking for...

Hi David, thanks for considering making a contribution. We would definitely be interested in having HRL algorithms. Please check out our [contribution guide](https://github.com/vwxyzjn/cleanrl/blob/master/CONTRIBUTING.md). The main things I am looking for...

CC @kinalmehta, who is working on DIAYN #267.

@maitchison has expressed interest in helping review this PR. Thank you, Matthew! I will also try to read the paper and add some comments.

> PPO-DNA consistently uses 128 parallel envs

This kind of changes things a lot. I am more inclined to reproduce this work the way the paper does it. ~~Please...

Starting a new thread for discussion. @jseppanen, thanks for bearing with me. The results are definitely convincing, but the main issue is consistency (same configuration, same steps, non-crashed runs). Totally...

Oh wow, this is really nice! How long did the experiment take?

Oh wow, that’s taking a really long time. Given the insane amount of compute required, I think running it for three random seeds might not be necessary…

Hi @merak0514, thank you for raising this issue. Indeed, it would be good to have TRPO as well. I personally won't have much time to do this, but you and others...

@dosssma, @yooceii, @Dipamc77 would you mind giving this a try? See https://cleanrl-jlu83xh5n-vwxyzjn.vercel.app/advanced/hyperparameter-tuning/ for the current tutorial. Would love to hear your feedback.