dqn
Question about clipping
Hi! Could you please explain why you do error clipping this way?

```diff
$ git diff a83c4b359b9
-    # Clip the error term to be between -1 and 1
-    error = y - q_value
-    clipped_error = tf.clip_by_value(error, -1, 1)
-    loss = tf.reduce_mean(tf.square(clipped_error))
+    # Clip the error, the loss is quadratic when the error is in (-1, 1), and linear outside of that region
+    error = tf.abs(y - q_value)
+    quadratic_part = tf.clip_by_value(error, 0.0, 1.0)
+    linear_part = error - quadratic_part
+    loss = tf.reduce_mean(0.5 * tf.square(quadratic_part) + linear_part)
```
It seems like a good improvement, but it is not what the original paper describes. What is the benefit? Thanks!
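For context, the two forms in the diff can be compared numerically. The old form clips the error *before* squaring, so the loss becomes flat once |error| > 1 and its gradient there is zero; the new form is the Huber loss, which is quadratic inside (-1, 1) and linear outside, so the gradient stays at magnitude 1 for large errors. A small NumPy sketch of both (function names are mine, not from the repo):

```python
import numpy as np

def clipped_square_loss(error):
    # Old form: clip the error, then square.
    # For |error| > 1 the loss is constant, so its gradient is 0.
    clipped = np.clip(error, -1.0, 1.0)
    return np.square(clipped)

def huber_loss(error):
    # New form: quadratic for |error| <= 1, linear beyond that.
    # The gradient magnitude saturates at 1 instead of dropping to 0.
    abs_err = np.abs(error)
    quadratic = np.clip(abs_err, 0.0, 1.0)
    linear = abs_err - quadratic
    return 0.5 * np.square(quadratic) + linear

# Finite-difference gradients at a few error values
for e in [0.5, 2.0, 5.0]:
    eps = 1e-6
    g_old = (clipped_square_loss(e + eps) - clipped_square_loss(e)) / eps
    g_new = (huber_loss(e + eps) - huber_loss(e)) / eps
    print(f"error={e}: grad(old)={g_old:.3f}, grad(huber)={g_new:.3f}")
```

At error = 2.0 the old loss has zero gradient while the Huber loss still has gradient 1, which is presumably why the change was made, but I'd like to hear the author's reasoning.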