POPLIN
Question on pendulum reward function
Can you explain why this reward function (-cos(theta)-0.1*sin(theta) ... ) is used for the pendulum environment?
https://github.com/WilsonWangTHU/POPLIN/blob/edd8dba50f9049c6164eda774602bef0c299cb51/dmbrl/config/gym_pendulum.py#L104
And why does it need to differ from the original OpenAI Gym reward function?
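
For comparison, here is a minimal sketch of the original Gym pendulum reward as I understand it: the negative of a quadratic cost on the normalized angle, the angular velocity, and the torque. The function names `angle_normalize` and `gym_pendulum_reward` and the argument names are my own choices for illustration, not taken from the POPLIN code.

```python
import math


def angle_normalize(theta):
    """Wrap an angle in radians into the interval [-pi, pi)."""
    return ((theta + math.pi) % (2 * math.pi)) - math.pi


def gym_pendulum_reward(theta, theta_dot, torque):
    """Reward used by the classic Gym pendulum task (sketch).

    theta is measured from the upright position, so the reward is
    largest (zero) when the pendulum is upright, at rest, with no torque.
    """
    cost = (
        angle_normalize(theta) ** 2   # squared angular distance from upright
        + 0.1 * theta_dot ** 2        # penalty on angular velocity
        + 0.001 * torque ** 2         # penalty on applied torque
    )
    return -cost
```

The angle term here is quadratic in theta, while the linked config uses cos(theta) and sin(theta) terms instead, which is the difference I am asking about.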