Duncan Riach
Actually, I'm going to leave this open so that it serves as a marker to possibly address this issue in a future release. We could check if there is a...
Thank you, @SageAgastya, for bringing this to my attention.
Hi Edward, I've done a quick triage on this issue:
* This issue is related to TensorFlow (not PyTorch or another framework).
* This issue is not about run-to-run reproducibility....
Hi @atebbifakhr, My understanding is that XLA JIT compilation is not currently enabled by default in TensorFlow. I assume that you're not enabling XLA and therefore that, if there is...
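One way to rule XLA in or out, rather than relying on the default, is to toggle auto-clustering explicitly via TensorFlow's standard `TF_XLA_FLAGS` mechanism. A sketch (the script name is a placeholder for the reproduction script, not something from this thread):

```shell
# Force XLA auto-clustering on, to see whether it changes the observed
# (non-)determinism; "train.py" is a placeholder for the repro script.
TF_XLA_FLAGS=--tf_xla_auto_jit=2 python train.py
```

If behavior is identical with and without this flag, XLA compilation is unlikely to be the source of the difference.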
Until now, I was unaware of non-determinism issues with `tf.nn.softmax_cross_entropy_with_logits`, but I have started digging into it, and will add it to a list of things to look at and...
Thanks for providing that code, @atebbifakhr! Nice and simple and self-contained. I love it. I have been able to reproduce the non-determinism, but not the _determinism_ when the cross-entropy op...
Hey, I'm running this locally so that I can instrument and debug it. My machine contains a 12GB TITAN V. I'm getting this error:

```
tensorflow.python.framework.errors_impl.ResourceExhaustedError: OOM when allocating tensor...
```
In the model, I reduced `num_units` from 512 down to 32 and `ffn_inner_dim` from 2048 down to 128 for both the encoder and the decoder. This resolved the problem. The...
Hey @atebbifakhr, Sorry, I have not gotten to this yet. Will do as soon as I can and get back to you.
Hi @atebbifakhr, I looked into this more deeply. Removing `tf.nn.sparse_softmax_cross_entropy_with_logits` from the loss function only makes the gradients reproducible for the first step. They still go non-deterministic on the second...
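The debugging approach described here, recording a per-step summary of the gradients on two identically-seeded runs and locating the first step at which they diverge, can be sketched framework-free. This is a minimal CPU-only NumPy illustration (all names are mine, not from the thread); on CPU it is deterministic, whereas on a GPU with a non-deterministic op in the backward pass the comparison would return a non-`None` first divergent step:

```python
import numpy as np

def grad_checksums(seed, steps=3):
    """Run a toy softmax cross-entropy training loop and record a
    checksum of the gradient at each step, so two runs can be compared."""
    rng = np.random.default_rng(seed)
    w = rng.normal(size=(4, 3))            # weights
    x = rng.normal(size=(8, 4))            # fixed inputs
    y = rng.integers(0, 3, size=8)         # integer class labels
    sums = []
    for _ in range(steps):
        logits = x @ w
        # Softmax cross-entropy gradient w.r.t. logits: p - one_hot(y)
        e = np.exp(logits - logits.max(axis=1, keepdims=True))
        p = e / e.sum(axis=1, keepdims=True)
        p[np.arange(len(y)), y] -= 1.0
        g = x.T @ p / len(y)               # gradient w.r.t. w
        w -= 0.1 * g                       # SGD update
        sums.append(float(np.abs(g).sum()))
    return sums

def first_divergent_step(a, b):
    """Return the index of the first step whose checksums differ, else None."""
    for i, (u, v) in enumerate(zip(a, b)):
        if u != v:
            return i
    return None

run1 = grad_checksums(seed=0)
run2 = grad_checksums(seed=0)
print(first_divergent_step(run1, run2))  # None: bitwise-identical on CPU
```

A first divergence at step 1 rather than step 0 (as reported above) suggests the non-determinism enters through the backward pass rather than the initial forward computation.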