Sridhar Thiagarajan comments

Sridhar Thiagarajan

How to make expert data

Just load a pre-trained policy, call env.step() in a loop, and save all the state-action pairs obtained?
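In case it helps make the question concrete, here is a rough sketch of what I mean. It is only an illustration under assumptions: `ToyEnv` and `expert_policy` are hypothetical stand-ins (not from any library) for a real Gym-style environment with the older 4-tuple `step()` return and for the predict call of a loaded pre-trained policy.

```python
class ToyEnv:
    """Hypothetical stand-in for a Gym-style env (older 4-tuple step API)."""

    def __init__(self, horizon=5):
        self.horizon = horizon
        self.t = 0

    def reset(self):
        self.t = 0
        return [0.0]

    def step(self, action):
        self.t += 1
        obs = [float(self.t)]
        reward = -abs(action)
        done = self.t >= self.horizon
        return obs, reward, done, {}


def expert_policy(obs):
    """Hypothetical placeholder for a loaded pre-trained policy."""
    return -0.1 * obs[0]


def collect_expert_data(env, policy, n_steps):
    """Roll out `policy` in `env` and record the (state, action) pairs."""
    states, actions = [], []
    obs = env.reset()
    for _ in range(n_steps):
        action = policy(obs)
        # Store the state the action was chosen in, not the next state.
        states.append(obs)
        actions.append(action)
        obs, _reward, done, _info = env.step(action)
        if done:
            # Start a new episode so collection continues up to n_steps.
            obs = env.reset()
    return {"states": states, "actions": actions}


data = collect_expert_data(ToyEnv(), expert_policy, n_steps=12)
```

The resulting dict could then be written to disk, for example with `pickle.dump`. Whether the expert dataset is expected in this layout or in some other format is exactly the part I am unsure about.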

DDPG implementation fails to learn well on at least five MuJoCo-v2 envs for all three noise types. I report steps to reproduce and learning curve plots [and show that PPO2 seems to work fine].

As a side note, in case you're interested in quickly trying something out before this issue gets resolved, I would highly recommend the TD3 author's official implementation (which is in...

DDPG implementation fails to learn well on at least five MuJoCo-v2 envs for all three noise types. I report steps to reproduce and learning curve plots [and show that PPO2 seems to work fine].

@DanielTakeshi Did you run any of these benchmarks on vision-based tasks, or know of any results?