evolution-strategies-starter

Why design ac_noise?

Open fiberleif opened this issue 6 years ago • 1 comments

Hi, thanks for sharing the code! After reading the source, I have a few questions. Could you help me understand it better?

Why design ac_noise, rather than taking a deterministic action from the output of the policy network?
Could you provide more JSON configs for MuJoCo tasks beyond Humanoid-v1?

Thanks!

Jul 04 '18 03:07 fiberleif

Hello, I am also new to the RL community, but I have some ideas about your questions:

I think the noise makes the trained model more robust, since it ensures the policy can still work in environments that are less ideal than the MuJoCo simulator. Also, MuJoCo itself may have some stochastic variation between episodes, so the noise would help the algorithm stay robust across different episodes.
I guess you can add the JSON configs yourself, as it is not so difficult.
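
To make the first point concrete, here is a minimal sketch of the difference between a deterministic action and one perturbed by Gaussian action noise. It is only an illustration and not the repository's actual code: the `act` function, the `rng` argument, and the example values are all hypothetical, and the policy output is stood in for by a plain list of floats.

```python
import random


def act(policy_output, ac_noise_std=0.0, rng=None):
    """Return the action, optionally perturbed by Gaussian noise.

    policy_output: list of floats standing in for the policy network's output
                   (hypothetical; a real policy would compute this).
    ac_noise_std:  standard deviation of the action noise; 0 means deterministic.
    rng:           a random.Random instance; None disables the noise.
    """
    action = list(policy_output)
    if rng is not None and ac_noise_std != 0:
        # Add independent zero-mean Gaussian noise to every action dimension.
        action = [a + ac_noise_std * rng.gauss(0.0, 1.0) for a in action]
    return action


mean_action = [0.5, -0.2, 0.1]  # hypothetical policy output

deterministic = act(mean_action)  # no noise: the action equals the policy output
noisy = act(mean_action, ac_noise_std=0.01, rng=random.Random(0))

print(deterministic)
print(noisy)
```

With a small `ac_noise_std`, the noisy action stays close to the policy output, so the policy is evaluated under slightly perturbed controls rather than one exact trajectory. Setting `ac_noise_std` to 0 or passing no random generator recovers the deterministic behaviour asked about in the question.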

Mar 26 '19 14:03 edydfang