PCGrad
shuffle stacked loss
Consider replacing tf.random.shuffle(loss) with loss = tf.gather(loss, tf.random.shuffle(tf.range(tf.shape(loss)[0]))).
Hi @cfifty, may I ask why not replace it with loss = tf.random.shuffle(loss)?
- In non-eager (graph-mode) TensorFlow, the result of a bare tf.random.shuffle(loss) call is never used, so the op never executes and the loss list is not shuffled.
- If you use loss = tf.random.shuffle(loss), the backward pass of tf.random.shuffle is not defined, so you can't compute gradients through this operation and an error is thrown (a short sketch follows this list). See https://stackoverflow.com/questions/55701407/how-to-shuffle-tensor-in-tensorflow-errorno-gradient-defined-for-operation-ra for additional context.
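For concreteness, here is a minimal sketch of both points, assuming TF 1.x graph mode as in this thread (the toy variables and values are made up for illustration):

```python
import tensorflow as tf  # sketch assumes TF 1.x (graph mode)

# A toy stacked loss built from two trainable variables.
v1 = tf.Variable(1.0)
v2 = tf.Variable(2.0)
loss = tf.stack([tf.square(v1), tf.square(v2)])

# Differentiating through tf.random.shuffle fails, since no gradient is
# registered for the RandomShuffle op (see the StackOverflow link above):
# bad = tf.random.shuffle(loss)
# tf.gradients(tf.reduce_sum(bad), [v1, v2])  # LookupError

# Shuffling indices instead keeps the graph differentiable: tf.gather has
# a registered gradient, and the permuted index tensor is constant with
# respect to the variables.
indices = tf.random.shuffle(tf.range(tf.shape(loss)[0]))
shuffled = tf.gather(loss, indices)
grads = tf.gradients(tf.reduce_sum(shuffled), [v1, v2])  # works
```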
Thank you very much for your detailed explanation!
- The loss list is not shuffled when using tf.random.shuffle(loss) because, I think, tf.random.shuffle is not an in-place operation, so the input argument loss itself is left unchanged.
- It seems loss = tf.random.shuffle(loss) does not throw an error with tf 1.15.3. Maybe the gradient for this operation is registered in newer versions.

Overall, I think loss = tf.gather(loss, tf.random.shuffle(tf.range(tf.shape(loss)[0]))) is the better choice for compatibility.
I am sorry, I made a mistake: the gradient is still not defined for loss = tf.random.shuffle(loss) in tf 1.15.3. We should consider using loss = tf.gather(loss, tf.random.shuffle(tf.range(tf.shape(loss)[0]))) instead.
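To see the graph-mode pitfall end to end, here is a hedged sketch (again assuming TF 1.x; the constant values are arbitrary) showing that the bare call leaves loss untouched while the gather-based version actually permutes it:

```python
import tensorflow as tf  # sketch assumes TF 1.x graph mode

loss = tf.constant([0.1, 0.2, 0.3])

# The bare call adds a RandomShuffle op whose output is never consumed,
# so it never executes: loss keeps its original order.
tf.random.shuffle(loss)

# The gather-based version returns a tensor that is actually shuffled.
shuffled = tf.gather(loss, tf.random.shuffle(tf.range(tf.shape(loss)[0])))

with tf.Session() as sess:
    print(sess.run(loss))      # always [0.1 0.2 0.3]
    print(sess.run(shuffled))  # a random permutation of loss
```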