dqn
Question about clipping
Hi! Could you please explain why you do error clipping this way?

```diff
$ git diff a83c4b359b9
-    # Clip the error term to be between -1 and 1
-    error = y - q_value
-    clipped_error = tf.clip_by_value(error, -1, 1)
-    loss = tf.reduce_mean(tf.square(clipped_error))
+    # Clip the error, the loss is quadratic when the error is in (-1, 1), and linear outside of that region
+    error = tf.abs(y - q_value)
+    quadratic_part = tf.clip_by_value(error, 0.0, 1.0)
+    linear_part = error - quadratic_part
+    loss = tf.reduce_mean(0.5 * tf.square(quadratic_part) + linear_part)
```
It seems like a good improvement, but it is not what the original paper describes. What is the benefit? Thanks!
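For context, the two forms in the diff can be compared numerically. The old form clips the error *before* squaring, so the loss becomes flat once |error| > 1 and its gradient there is zero; the new form is the Huber loss, which is quadratic inside (-1, 1) and linear outside, so the gradient stays at magnitude 1 for large errors. A small NumPy sketch of both (function names are mine, not from the repo):

```python
import numpy as np

def clipped_square_loss(error):
    # Old form: clip the error, then square.
    # For |error| > 1 the loss is constant, so its gradient is 0.
    clipped = np.clip(error, -1.0, 1.0)
    return np.square(clipped)

def huber_loss(error):
    # New form: quadratic for |error| <= 1, linear beyond that.
    # The gradient magnitude saturates at 1 instead of dropping to 0.
    abs_err = np.abs(error)
    quadratic = np.clip(abs_err, 0.0, 1.0)
    linear = abs_err - quadratic
    return 0.5 * np.square(quadratic) + linear

# Finite-difference gradients at a few error values
for e in [0.5, 2.0, 5.0]:
    eps = 1e-6
    g_old = (clipped_square_loss(e + eps) - clipped_square_loss(e)) / eps
    g_new = (huber_loss(e + eps) - huber_loss(e)) / eps
    print(f"error={e}: grad(old)={g_old:.3f}, grad(huber)={g_new:.3f}")
```

At error = 2.0 the old loss has zero gradient while the Huber loss still has gradient 1, which is presumably why the change was made, but I'd like to hear the author's reasoning.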