Pytorch-AWAC
A question about updating the critic network in AWAC
Hi Park, thanks for sharing your code! However, the value function update in your code seems to differ from the one in the original paper:
The original paper states that the critic network is updated from the offline dataset D rather than from the scalable dataset \beta, whereas in your implementation the critic update (see agent.update_critic()) uses the \beta dataset. Is something wrong here, or am I misreading the paper?
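To make the question concrete, here is a minimal sketch of the two sampling choices I am contrasting. It is not taken from the repository: the names `offline_D`, `buffer_beta`, and `sample_batch` are hypothetical, the transitions are placeholder tuples, and I am assuming that \beta starts as a copy of D and then grows with online transitions.

```python
import random

def sample_batch(source, batch_size, rng):
    """Uniformly sample a minibatch of transitions (with replacement)."""
    return [rng.choice(source) for _ in range(batch_size)]

rng = random.Random(0)

# Hypothetical offline dataset D: a fixed list of (s, a, r, s') tuples.
offline_D = [("s%d" % i, "a%d" % i, 0.0, "s%d" % (i + 1)) for i in range(5)]

# Hypothetical buffer beta (assumption): starts as a copy of D,
# then receives transitions collected online.
buffer_beta = list(offline_D)
buffer_beta.append(("s_online", "a_online", 1.0, "s_online_next"))

# Choice 1: critic minibatch drawn from D only (how I read the paper).
batch_from_D = sample_batch(offline_D, 4, rng)

# Choice 2: critic minibatch drawn from beta (what agent.update_critic() appears to do).
batch_from_beta = sample_batch(buffer_beta, 4, rng)
```

Under this assumption the two choices coincide before any online data is collected and only diverge once \beta contains transitions that are not in D.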