Clodéric Mars

Results: 23 comments by Clodéric Mars

- Convergence problem on PPO is fixed
- Still a computation performance problem to be fixed
- TODO ->
  - Multiple environment single agent
  - Multiple environment multiple agent

Needs to be rebased; @saikrishna-1996 to do it.

- Forward part is working
- TODO -> Merge request and test on other environments.

Maybe investigate https://hydra.cc

## What's done

- Draft done
- Refactoring on reinforce ongoing to clean up the code

## Todo

- Finalize doc and do PR.

We should include methods that use a continuous action space.

- (Ongoing) Human driven exploration with µ0 Unplugged
- IL / BC -> Basic Behavior Cloning / GAIL / [Primal Wasserstein Imitation Learning](https://openreview.net/forum?id=TtYSU29zgR) / [Domain-Robust Visual Imitation Learning with Mutual...

Thank you for your interest, we will take a look at your suggestion and update the documentation.

e.g.

- Every N trials,
  - Archive the latest model version.
  - Run a specific set of trials on this version just for validation (no replay buffer update, no new...
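
The periodic validation idea above could look roughly like the following. This is only a minimal sketch under stated assumptions: `ToyModel`, `run_trial`, and `train_with_periodic_validation` are hypothetical names invented for illustration, the model and environment are toys, and none of it reflects the project's actual API. It shows the three steps mentioned: trigger every N trials, archive a frozen copy of the latest model version, and run a fixed set of validation trials on that copy without touching the replay buffer.

```python
import copy
import random


class ToyModel:
    """Hypothetical stand-in for a trainable model (not the project's API)."""

    def __init__(self):
        self.version = 0
        self.weight = 0.0

    def act(self, observation):
        return self.weight * observation

    def train_step(self, replay_buffer):
        # Placeholder update: nudge the weight and bump the version.
        self.weight += 0.1
        self.version += 1


def run_trial(model, seed):
    """Run one toy trial; return (transitions, total_reward)."""
    rng = random.Random(seed)
    transitions = []
    total_reward = 0.0
    for _ in range(5):
        obs = rng.uniform(-1.0, 1.0)
        action = model.act(obs)
        reward = -abs(obs - action)  # highest when action == obs
        transitions.append((obs, action, reward))
        total_reward += reward
    return transitions, total_reward


def train_with_periodic_validation(num_trials, validate_every, validation_seeds):
    model = ToyModel()
    replay_buffer = []
    archive = {}         # model version -> frozen copy of the model
    validation_log = []  # (trial index, model version, mean validation reward)

    for trial_index in range(1, num_trials + 1):
        # Regular training trial: transitions feed the replay buffer.
        transitions, _ = run_trial(model, seed=trial_index)
        replay_buffer.extend(transitions)
        model.train_step(replay_buffer)

        if trial_index % validate_every == 0:
            # Archive the latest model version as an independent copy.
            snapshot = copy.deepcopy(model)
            archive[snapshot.version] = snapshot
            # Validation trials: fixed seeds, results are only logged,
            # the replay buffer and the live model are left untouched.
            rewards = [run_trial(snapshot, seed=s)[1] for s in validation_seeds]
            mean_reward = sum(rewards) / len(rewards)
            validation_log.append((trial_index, snapshot.version, mean_reward))

    return replay_buffer, archive, validation_log
```

Using the same validation seeds at every checkpoint keeps the scores comparable across model versions, which is the main reason to separate these trials from the training ones.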