WAYKEN-TSE

Results: 1 issue from WAYKEN-TSE

I know that this function is used to calculate the extrinsic reward, but when doing PPO to update the network, the advantage function only includes the intrinsic reward (advantages =...
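To make the distinction in the question concrete, here is a minimal sketch of how an advantage can be built from the intrinsic stream alone versus from both reward streams. This is not the code of the repository being asked about: `gae`, `combine_advantages`, `ext_coef`, and `int_coef` are hypothetical names, and the sketch assumes a single rollout segment with no episode termination inside it.

```python
def gae(rewards, values, last_value, gamma=0.99, lam=0.95):
    """Generalized advantage estimation for one reward stream.

    Assumes the rollout segment contains no episode boundary
    (a simplification made for this sketch).
    """
    advantages = [0.0] * len(rewards)
    running = 0.0
    next_value = last_value
    # Walk backwards so each step can reuse the advantage of the next step.
    for t in reversed(range(len(rewards))):
        delta = rewards[t] + gamma * next_value - values[t]
        running = delta + gamma * lam * running
        advantages[t] = running
        next_value = values[t]
    return advantages


def combine_advantages(ext_adv, int_adv, ext_coef=1.0, int_coef=1.0):
    """Weighted sum of extrinsic and intrinsic advantages (hypothetical helper).

    With ext_coef=0.0 the result depends on the intrinsic reward only,
    which is the situation the question describes.
    """
    return [ext_coef * e + int_coef * i for e, i in zip(ext_adv, int_adv)]


# Toy numbers, chosen only for illustration.
ext_adv = gae([1.0, 1.0], [0.0, 0.0], 0.0, gamma=1.0, lam=1.0)
int_adv = gae([0.5, 0.5], [0.0, 0.0], 0.0, gamma=1.0, lam=1.0)
intrinsic_only = combine_advantages(ext_adv, int_adv, ext_coef=0.0)
both_streams = combine_advantages(ext_adv, int_adv, ext_coef=2.0, int_coef=1.0)
```

If the extrinsic coefficient is zero, or the extrinsic advantage is never added in, the extrinsic reward has no effect on the PPO policy update, even if it is computed elsewhere.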