Duncan Riach
Actually, I'm going to leave this open so that it serves as a marker to possibly address this issue in a future release. We could check if there is a...
Thank you, @SageAgastya, for bringing this to my attention.
Hi Edward, I've done a quick triage on this issue:
* This issue is related to TensorFlow (not PyTorch or another framework).
* This issue is not about run-to-run reproducibility....
Hi @atebbifakhr, My understanding is that XLA JIT compilation is not currently enabled by default in TensorFlow. I assume that you're not enabling XLA and therefore that, if there is...
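One way to rule XLA in or out, rather than relying on the default, is to toggle auto-clustering explicitly via TensorFlow's standard `TF_XLA_FLAGS` mechanism. A sketch (the script name is a placeholder for the reproduction script, not something from this thread):

```shell
# Force XLA auto-clustering on, to see whether it changes the observed
# (non-)determinism; "train.py" is a placeholder for the repro script.
TF_XLA_FLAGS=--tf_xla_auto_jit=2 python train.py
```

If behavior is identical with and without this flag, XLA compilation is unlikely to be the source of the difference.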
Until now, I was unaware of non-determinism issues with `tf.nn.softmax_cross_entropy_with_logits`, but I have started digging into it, and will add it to a list of things to look at and...
Thanks for providing that code, @atebbifakhr! Nice and simple and self-contained. I love it. I have been able to reproduce the non-determinism, but not the _determinism_ when the cross-entropy op...
Hey, I'm running this locally so that I can instrument and debug it. My machine contains a 12GB TITAN V. I'm getting this error:

```
tensorflow.python.framework.errors_impl.ResourceExhaustedError: OOM when allocating tensor...
```
In the model, I reduced `num_units` from 512 down to 32 and `ffn_inner_dim` from 2048 down to 128 for both the encoder and the decoder. This resolved the problem. The...
Hey @atebbifakhr, Sorry, I have not gotten to this yet. Will do as soon as I can and get back to you.
Hi @atebbifakhr, I looked into this more deeply. Removing `tf.nn.sparse_softmax_cross_entropy_with_logits` from the loss function only makes the gradients reproducible for the first step. They still go non-deterministic on the second...
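The debugging approach described here, recording a per-step summary of the gradients on two identically-seeded runs and locating the first step at which they diverge, can be sketched framework-free. This is a minimal CPU-only NumPy illustration (all names are mine, not from the thread); on CPU it is deterministic, whereas on a GPU with a non-deterministic op in the backward pass the comparison would return a non-`None` first divergent step:

```python
import numpy as np

def grad_checksums(seed, steps=3):
    """Run a toy softmax cross-entropy training loop and record a
    checksum of the gradient at each step, so two runs can be compared."""
    rng = np.random.default_rng(seed)
    w = rng.normal(size=(4, 3))            # weights
    x = rng.normal(size=(8, 4))            # fixed inputs
    y = rng.integers(0, 3, size=8)         # integer class labels
    sums = []
    for _ in range(steps):
        logits = x @ w
        # Softmax cross-entropy gradient w.r.t. logits: p - one_hot(y)
        e = np.exp(logits - logits.max(axis=1, keepdims=True))
        p = e / e.sum(axis=1, keepdims=True)
        p[np.arange(len(y)), y] -= 1.0
        g = x.T @ p / len(y)               # gradient w.r.t. w
        w -= 0.1 * g                       # SGD update
        sums.append(float(np.abs(g).sum()))
    return sums

def first_divergent_step(a, b):
    """Return the index of the first step whose checksums differ, else None."""
    for i, (u, v) in enumerate(zip(a, b)):
        if u != v:
            return i
    return None

run1 = grad_checksums(seed=0)
run2 = grad_checksums(seed=0)
print(first_divergent_step(run1, run2))  # None: bitwise-identical on CPU
```

A first divergence at step 1 rather than step 0 (as reported above) suggests the non-determinism enters through the backward pass rather than the initial forward computation.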