DeepRL-Agents
A3C Basic Doom: Loss Function
Hi
Our goal is to minimize the loss. The loss consists of three parts:
- Value loss
- Policy loss
- Entropy (to encourage exploration)
As follows:
# Value loss: squared error between the bootstrapped returns and the value estimates.
self.value_loss = 0.5 * tf.reduce_sum(tf.square(self.target_v - tf.reshape(self.value, [-1])))
# Entropy of the policy distribution (higher entropy means more exploration).
self.entropy = -tf.reduce_sum(self.policy * tf.log(self.policy))
# Policy-gradient loss: log-probability of the taken actions, weighted by advantages.
self.policy_loss = -tf.reduce_sum(tf.log(self.responsible_outputs) * self.advantages)
# Combined loss.
self.loss = 0.5 * self.value_loss + self.policy_loss - self.entropy * 0.01
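If it helps to see the same arithmetic outside the graph, here is a minimal NumPy sketch with made-up batch data (the array values, the batch size of 3, and the 3-action policy are assumptions for illustration, not from the repo):

import numpy as np

# Stand-in batch data (hypothetical values, just to exercise the formulas).
target_v = np.array([1.0, 0.5, 0.2])           # bootstrapped returns
value = np.array([0.8, 0.6, 0.1])              # critic's value estimates
policy = np.array([[0.7, 0.2, 0.1],
                   [0.3, 0.4, 0.3],
                   [0.1, 0.1, 0.8]])           # action distributions per step
actions = np.array([0, 1, 2])                  # actions actually taken
advantages = np.array([0.2, -0.1, 0.1])        # advantage estimates

responsible_outputs = policy[np.arange(3), actions]   # pi(a_t | s_t)

value_loss = 0.5 * np.sum(np.square(target_v - value))
entropy = -np.sum(policy * np.log(policy))
policy_loss = -np.sum(np.log(responsible_outputs) * advantages)

loss = 0.5 * value_loss + policy_loss - entropy * 0.01
print(loss)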
Last line: self.loss = 0.5 * self.value_loss + self.policy_loss - self.entropy * 0.01
I think we do not need to multiply self.value_loss by 0.5 again in the last line, since self.value_loss already contains a factor of 0.5. Is that correct?
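To make the scaling concrete, here is the effect on a single squared-error term (the error value e is arbitrary, just for illustration):

e = 0.7                          # example error: target_v - value
value_loss = 0.5 * e ** 2        # inner 0.5 inside self.value_loss
total = 0.5 * value_loss         # outer 0.5 in self.loss
print(total, 0.25 * e ** 2)      # identical: the effective coefficient is 0.25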
0.5 is the coefficient of the value loss; there is another post that discusses this.
However, why is there a minus sign in - self.entropy * 0.01, given that the loss formula is
Loss = Loss_policy + c_value * Loss_value + c_entropy * Loss_entropy
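For comparison, the code's last line rearranged into that form reads:

self.loss = self.policy_loss + 0.5 * self.value_loss + (-0.01) * self.entropy

so the minus sign amounts to the coefficient c_entropy = -0.01.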
And
self.value_loss = 0.5 * tf.reduce_sum(tf.square(self.target_v - tf.reshape(self.value,[-1])))
Why is the 0.5 needed here?
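As a general observation (not specific to this repo), a 1/2 in front of a squared error is commonly there so that the factor of 2 produced by differentiation cancels. A quick numeric check of that cancellation:

# With the 0.5, the gradient of the squared error w.r.t. v is just (v - target);
# without it, the gradient would be 2 * (v - target).
target, v, eps = 1.0, 0.4, 1e-6
L = lambda x: 0.5 * (target - x) ** 2
numeric_grad = (L(v + eps) - L(v - eps)) / (2 * eps)
print(numeric_grad, v - target)  # both approximately -0.6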