evolution-strategies-starter
Why design ac_noise?
Hi, thanks for sharing the code! After reading the source, I have a few questions. Could you help me understand the code better?
- Why add ac_noise instead of using the deterministic action output by the policy network?
- Could you provide more JSON configurations for MuJoCo tasks beyond Humanoid-v1?
Thanks!
Hello, I am also a newbie to the RL community, but I have some ideas about your questions:
- I think the action noise makes the trained model more robust, because it forces the policy to work in an environment that is less ideal than MuJoCo. MuJoCo itself may also have some stochastic variation between episodes, so the noise should help the algorithm stay robust across episodes.
- I guess you can add them yourself, since it is not very difficult.
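
To make the first point concrete, here is a minimal sketch of what I understand action noise to mean: Gaussian noise scaled by a standard deviation is added to the deterministic policy output. This is my own illustration, not the repository's code. The names `act`, `linear_policy`, and the weight values are made up for the example, and the value 0.01 for `ac_noise_std` is just an assumption.

```python
import numpy as np

def act(policy_fn, ob, ac_noise_std=0.0, rng=None):
    """Return the policy's action, optionally perturbed by Gaussian noise.

    policy_fn, ac_noise_std and rng are illustrative names (assumptions),
    not the repository's actual interface.
    """
    a = np.asarray(policy_fn(ob), dtype=np.float64)
    # Noise is added only when a random generator is supplied and the
    # standard deviation is non-zero; otherwise the action is deterministic.
    if rng is not None and ac_noise_std != 0:
        a = a + rng.standard_normal(a.shape) * ac_noise_std
    return a

def linear_policy(ob):
    # Hypothetical stand-in for the policy network: a fixed linear map.
    W = np.array([[0.5, -0.2], [0.1, 0.3]])
    return W @ ob

ob = np.array([1.0, 2.0])
deterministic = act(linear_policy, ob)
noisy = act(linear_policy, ob, ac_noise_std=0.01, rng=np.random.default_rng(0))
print("deterministic:", deterministic)
print("noisy:        ", noisy)
```

With this shape, calling `act` without a generator gives the plain deterministic action, so noise could be used during training rollouts and switched off for evaluation.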