WAYKEN-TSE
I know that this function is used to calculate the extrinsic reward, but when PPO updates the network, the advantage function only includes the intrinsic reward (advantages =...