hindsight-experience-replay
What is the definition of actor_loss in ddpg_agent.py?
actor_loss = -self.critic_network(inputs_norm_tensor, actions_real).mean()
actor_loss += self.args.action_l2 * (actions_real / self.env_params['action_max']).pow(2).mean()
I think the output of critic_network alone is enough to serve as the actor_loss. So is the second term a regularizer or a trick? If possible, I would prefer a reply in Chinese.
@whynpt It's more like a regularizer: it penalizes large actions (scaled by action_max), which makes sure the action does not move too much.
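To make the two terms concrete, here is a minimal plain-Python sketch of the same arithmetic. It is not the repository's code: the function name `actor_loss_terms`, the list-based inputs, and the example numbers are assumptions chosen for illustration, and `action_l2 = 1.0` is an arbitrary value.

```python
def actor_loss_terms(q_values, actions, action_max, action_l2):
    """Return (total, q_term, penalty) for one batch.

    q_values: critic outputs Q(s, pi(s)), one float per sample.
    actions:  actor outputs pi(s), one list of floats per sample.
    """
    # Term 1: the actor wants a high critic value, so it minimizes
    # the negative mean of Q.
    q_term = -sum(q_values) / len(q_values)

    # Term 2: mean of the squared actions after scaling by action_max,
    # taken over every element (mirrors .pow(2).mean()).
    squared = [(a / action_max) ** 2 for row in actions for a in row]
    penalty = action_l2 * sum(squared) / len(squared)

    return q_term + penalty, q_term, penalty


# Hypothetical batch: same critic values, different action magnitudes.
q_values = [1.0, 3.0]
small_actions = [[0.1, -0.1], [0.2, 0.0]]
large_actions = [[1.0, -1.0], [1.0, 1.0]]

total_small, q_small, pen_small = actor_loss_terms(q_values, small_actions, 1.0, 1.0)
total_large, q_large, pen_large = actor_loss_terms(q_values, large_actions, 1.0, 1.0)
```

With identical critic values, the batch with larger actions gets a larger loss, so the gradient pushes the actor toward smaller actions unless the critic rewards the larger ones enough to offset the penalty.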