Deep_reinforcement_learning_Course

Possible mistake in Deep Q Learning Space Invaders notebook

Open • karolisjan opened this issue 6 years ago • 1 comment

Hey. Shouldn't self.Q = tf.reduce_sum(tf.multiply(self.output, self.actions_)) in the DQN class be self.Q = tf.reduce_sum(tf.multiply(self.output, self.actions_), axis=1), i.e. summed over the action axis within each row, so that self.Q has one entry per sample in the batch? As written, self.Q is a scalar while self.target_Q is a vector of batch-size length.
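To make the shape difference concrete, here is a minimal sketch that uses NumPy as a stand-in for the TensorFlow ops (np.sum and tf.reduce_sum treat a missing axis the same way: they reduce over every dimension). The arrays output, actions and target_q are made-up values for a hypothetical batch of 3 samples and 4 actions, not data from the notebook, and the mean squared error at the end is an assumed loss used only to show the effect of broadcasting.

```python
import numpy as np

# Hypothetical batch: 3 samples, 4 actions (values invented for illustration).
output = np.array([[1.0, 2.0, 3.0, 4.0],
                   [5.0, 6.0, 7.0, 8.0],
                   [9.0, 10.0, 11.0, 12.0]])    # stand-in for self.output
actions = np.array([[0.0, 1.0, 0.0, 0.0],
                    [0.0, 0.0, 0.0, 1.0],
                    [1.0, 0.0, 0.0, 0.0]])      # one-hot, stand-in for self.actions_
target_q = np.array([2.5, 7.0, 9.5])            # stand-in for self.target_Q

# No axis: every element is summed, so the batch collapses to a single scalar.
q_scalar = np.sum(output * actions)

# axis=1: sum over the action axis only, leaving one Q-value per sample.
q_per_sample = np.sum(output * actions, axis=1)

print(np.shape(q_scalar))       # ()
print(q_per_sample.shape)       # (3,)

# Assumed loss (mean squared error). With the scalar, broadcasting compares
# every target against the same summed value instead of its own Q-value.
loss_broadcast = np.mean((target_q - q_scalar) ** 2)
loss_per_sample = np.mean((target_q - q_per_sample) ** 2)
print(loss_broadcast, loss_per_sample)
```

With the scalar version the subtraction still runs because of broadcasting, which is why the mismatch does not raise an error but silently changes the value of the loss.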

Apr 22 '19 10:04 karolisjan

@karolisjan I agree.

May 02 '19 03:05 ali-ehsan
