PCGrad

shuffle stacked loss

cfifty opened this issue 4 years ago

Consider replacing

tf.random.shuffle(loss)

with

loss = tf.gather(loss, tf.random.shuffle(tf.range(tf.shape(loss)[0])))
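As a rough illustration (values made up), the suggested replacement shuffles the index range and gathers, so the result holds the same losses in a new order:

```python
import tensorflow as tf

loss = tf.constant([1.0, 2.0, 3.0, 4.0])

# Shuffle the index tensor, then gather: this reorders `loss` without
# needing a gradient for tf.random.shuffle itself.
loss = tf.gather(loss, tf.random.shuffle(tf.range(tf.shape(loss)[0])))
```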

cfifty, Jun 11 '20 21:06

Hi @cfifty, may I ask why not replace it with loss = tf.random.shuffle(loss)?

luzai, Aug 16 '20 07:08

  1. In non-eager (graph) mode, the bare call tf.random.shuffle(loss) is never executed, so the loss list is not shuffled if you use graph-mode TensorFlow.

  2. If you use loss = tf.random.shuffle(loss), the backward pass of tf.random.shuffle is not defined. Thus, you can't compute gradients through this operation and an error is thrown. See https://stackoverflow.com/questions/55701407/how-to-shuffle-tensor-in-tensorflow-errorno-gradient-defined-for-operation-ra for additional context.

cfifty, Aug 16 '20 07:08

Thank you very much for your detailed explanation!

  1. The loss list is not shuffled when using tf.random.shuffle(loss). I think the reason is that tf.random.shuffle is not an in-place operation, so the input argument loss is left unchanged.
  2. It seems loss = tf.random.shuffle(loss) does not throw an error with tf 1.15.3. Maybe the gradient for this operation is registered in newer versions. Overall, I think loss = tf.gather(loss, tf.random.shuffle(tf.range(tf.shape(loss)[0]))) is the better choice for compatibility.
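For point 1, the not-in-place behavior is easy to see in eager mode (illustrative values):

```python
import tensorflow as tf

loss = tf.constant([1.0, 2.0, 3.0])

# tf.random.shuffle returns a *new* shuffled tensor; the input is untouched.
tf.random.shuffle(loss)
# `loss` still holds its values in the original order.
```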

luzai, Aug 16 '20 10:08

I am sorry, I made a mistake: the gradient is still not defined for loss = tf.random.shuffle(loss) in tf 1.15.3.

We should consider using loss = tf.gather(loss, tf.random.shuffle(tf.range(tf.shape(loss)[0]))) instead.

luzai, Aug 19 '20 08:08