POPLIN
Bumps [tensorflow-gpu](https://github.com/tensorflow/tensorflow) from 1.9.0 to 2.12.0. Release notes (sourced from tensorflow-gpu's releases): TensorFlow 2.12.0. Release 2.12.0. TensorFlow: Breaking Changes: Build, Compilation and Packaging: Removed redundant packages tensorflow-gpu and tf-nightly-gpu. These...
Bumps [numpy](https://github.com/numpy/numpy) from 1.14.0 to 1.22.0. Release notes (sourced from numpy's releases): v1.22.0. NumPy 1.22.0 Release Notes: NumPy 1.22.0 is a big release featuring the work of 153 contributors spread...
Can you explain why this reward function (```-cos(theta)-0.1*sin(theta) ... ```) is used for the pendulum environment? https://github.com/WilsonWangTHU/POPLIN/blob/edd8dba50f9049c6164eda774602bef0c299cb51/dmbrl/config/gym_pendulum.py#L104 And why does it need to differ from the original reward function in OpenAI Gym?
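To make the comparison concrete, here is a minimal sketch that evaluates the standard OpenAI Gym pendulum reward next to the two terms visible in the snippet above. The function names `gym_pendulum_reward` and `quoted_terms` are hypothetical and not taken from the POPLIN code. The quoted expression is truncated, so only its visible terms are evaluated; the remaining terms and the overall sign convention are not known from the excerpt.

```python
import math


def angle_normalize(x):
    """Wrap an angle into [-pi, pi), as the Gym pendulum environment does."""
    return ((x + math.pi) % (2 * math.pi)) - math.pi


def gym_pendulum_reward(theta, theta_dot, torque):
    """Reward used by the original Gym pendulum: a negative quadratic cost."""
    cost = (angle_normalize(theta) ** 2
            + 0.1 * theta_dot ** 2
            + 0.001 * torque ** 2)
    return -cost


def quoted_terms(theta):
    """Only the two terms visible in the question; the rest is elided there."""
    return -math.cos(theta) - 0.1 * math.sin(theta)


if __name__ == "__main__":
    # Compare both expressions at a few angles with zero velocity and torque.
    for theta in (0.0, math.pi / 2, math.pi):
        print(theta, gym_pendulum_reward(theta, 0.0, 0.0), quoted_terms(theta))
```

One visible difference is that the Gym reward is quadratic in the wrapped angle, while the quoted terms are smooth trigonometric functions of `theta` and need no wrapping. Whether that is the motivation behind the config is a question for the authors.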
Please correct me if I am wrong. In the POPLIN-P AVG-R example, the ```data_dict``` passed to the ```train``` function of ```BC_WA_policy.policy_network``` contains the noise parameters searched by CEM and they can add...