NAF-tensorflow
Reproducing results of the paper on Mujoco domain
Work is in progress on the paper branch (link).
| environment | Best return for 200 steps |
|---|---|
| InvertedPendulum-v1 | |
| InvertedDoublePendulum-v1 | |
| Reacher-v1 | |
| HalfCheetah-v1 | 100 |
| Swimmer-v1 | |
| Hopper-v1 | |
| Walker2d-v1 | |
| Ant-v1 | |
| Humanoid-v1 | |
| HumanoidStandup-v1 | |
HalfCheetah-v1 is trainable with the following configuration: `checkpoints/env_name=HalfCheetah-v1/action_fn=tanh/action_w=uniform_big/batch_size=100/clip_action=False/discount=0.99/hidden_dims=[200,200]/hidden_fn=tanh/hidden_w=uniform_big/learning_rate=0.0001/max_episodes=10000/max_steps=150/noise=ou/noise_scale=0.3/tau=0.001/update_repeat=5/use_batch_norm=False/use_seperate_networks=False/w_reg=none/w_reg_scale=0.001`
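The checkpoint path above encodes each hyperparameter as a `key=value` directory segment. As a rough sketch for anyone who wants to read those settings back, the snippet below splits such a path into a dictionary. The helper name `parse_checkpoint_path` is hypothetical and is not part of this repository. It assumes that values contain no `/` and that anything which is not a Python literal should stay a string.

```python
import ast


def parse_checkpoint_path(path):
    """Split a 'key=value/key=value' checkpoint path into a dict.

    Hypothetical helper, not part of NAF-tensorflow.
    """
    config = {}
    for part in path.split("/"):
        if "=" not in part:
            continue  # skip plain directory names such as 'checkpoints'
        key, raw = part.split("=", 1)
        try:
            # Numbers, booleans and lists are converted to Python values.
            config[key] = ast.literal_eval(raw)
        except (ValueError, SyntaxError):
            # Names such as 'tanh', 'ou' or 'HalfCheetah-v1' stay as strings.
            config[key] = raw
    return config


# Shortened example built from segments of the path quoted above.
example = (
    "checkpoints/env_name=HalfCheetah-v1/batch_size=100/clip_action=False/"
    "hidden_dims=[200,200]/learning_rate=0.0001/noise=ou/w_reg=none"
)
settings = parse_checkpoint_path(example)
```

With this, `settings["hidden_dims"]` is a list of two integers and `settings["noise"]` is the string `"ou"`. How the repository itself maps these values to its command-line options is not shown here.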
Hi @carpedm20, thanks for your great implementation. I wonder if there are any other results for the Mujoco benchmark?
Sorry, but I didn't test this on Mujoco, and I don't have any plans for this project.