Pytorch-AWAC A question about updating the critic network in AWAC

Open MISTCARRYYOU opened this issue 2 years ago • 0 comments

Hi Park, thanks for your code! However, I found that the value function update in your code seems to differ from the original paper:

The original paper states that the critic network is updated using the offline dataset D rather than the scalable dataset \beta, whereas in your implementation the critic update (see agent.update_critic()) uses the \beta dataset. Is something wrong here?
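To make the distinction concrete, here is a minimal sketch of the two sampling choices. It is only an illustration, not the repository's code: `Transition`, `make_offline_dataset`, and `sample_batch` are hypothetical names, and it assumes that \beta starts as a copy of D and grows as online transitions are added.

```python
import random
from typing import List, NamedTuple


class Transition(NamedTuple):
    """Hypothetical transition record; `online` marks data collected after pretraining."""
    state: int
    action: int
    reward: float
    next_state: int
    online: bool


def make_offline_dataset(n: int) -> List[Transition]:
    # Stand-in for the fixed offline dataset D.
    return [Transition(i, 0, 1.0, i + 1, online=False) for i in range(n)]


def sample_batch(source: List[Transition], batch_size: int,
                 rng: random.Random) -> List[Transition]:
    # Uniform sampling without replacement from whichever dataset is passed in.
    return rng.sample(source, batch_size)


rng = random.Random(0)

D = make_offline_dataset(100)  # fixed offline dataset
beta = list(D)                 # assumption: the buffer starts as a copy of D

# Assumption: online fine-tuning appends new transitions to beta only.
beta.extend(
    Transition(1000 + i, 1, 0.5, 1001 + i, online=True) for i in range(100)
)

# Option A: the critic batch comes from D, so it never contains online data.
batch_from_D = sample_batch(D, 32, rng)

# Option B: the critic batch comes from beta, so it may mix offline and online data.
batch_from_beta = sample_batch(beta, 32, rng)
```

Under these assumptions the two options give the same batches only while no online data has been added; once \beta grows, a critic trained on \beta also sees online transitions, while a critic trained on D does not.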

May 11 '22 13:05 MISTCARRYYOU
