Deep_reinforcement_learning_Course
Possible mistake in Deep Q Learning Space Invaders notebook
Hey. Shouldn't self.Q = tf.reduce_sum(tf.multiply(self.output, self.actions_)) in the DQN class be self.Q = tf.reduce_sum(tf.multiply(self.output, self.actions_), axis=1), i.e. summed over the action dimension of each row, so that self.Q has one entry per sample in the batch? Without the axis argument, self.Q is a scalar, while self.target_Q is a vector of batch-size length.
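To illustrate the shape difference, here is a minimal sketch using NumPy as a stand-in for the TensorFlow ops (np.sum with and without axis=1 reduces the same way tf.reduce_sum does). The batch size, action count, and all values are made up for illustration and are not taken from the notebook.

```python
import numpy as np

# Hypothetical network output: 3 samples in the batch, 4 possible actions.
output = np.array([[1.0, 2.0, 3.0, 4.0],
                   [5.0, 6.0, 7.0, 8.0],
                   [9.0, 10.0, 11.0, 12.0]])

# One-hot encoding of the action taken in each sample (assumed format of actions_).
actions = np.array([[0.0, 1.0, 0.0, 0.0],
                    [0.0, 0.0, 0.0, 1.0],
                    [1.0, 0.0, 0.0, 0.0]])

# No axis: every element is summed, so the batch dimension is lost too.
q_all = np.sum(output * actions)

# axis=1: sum within each row, leaving one Q-value per sample.
q_per_sample = np.sum(output * actions, axis=1)

print(q_all.shape)         # ()
print(q_per_sample.shape)  # (3,)
print(q_per_sample)        # [2. 8. 9.]
```

With the scalar version, subtracting it from a batch-sized target vector would still run because of broadcasting, which is probably why no error shows up, but every sample would then be compared against the same summed value.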
@karolisjan I agree.