awr
Implementation of advantage-weighted regression.
Hello, thanks for the code. While re-implementing the program, I found a step that normalizes the value function vf [here](https://github.com/xbpeng/awr/blob/831442fb8d4c24bd200667cbc5e458c7657effc2/learning/rl_agent.py#L230-L234). It's implemented by `v_predict...
Hi, I am trying to modify AWR into an offline (or fully off-policy) version. I find that the paper states that one can simply treat the dataset as the...
Hi, thank you for sharing the repo! I was wondering how Train_Return and Test_Return are calculated and what the difference between the two is. I see that one is using...
Hello, I am trying to use this algorithm (rewritten in PyTorch with Gym vectorized envs) for motion imitation, starting with the PyBullet implementation of the DeepMimic environment. In the paper,...