Duncan Riach

Results 68 comments of Duncan Riach

Hi @atebbifakhr, After further investigation, there seem to be two or three sources of non-determinism in this system. 1. Confirmed that back-prop of `tf.nn.sparse_softmax_cross_entropy_with_logits` does inject non-determinism. Opened TensorFlow [issue...

I updated my previous comment to include additional information from further investigation.

Beautifully presented. Thanks, @MFreidank. I made a copy of your colab code and have been looking at it. The primary issue right now is that the trainable variables are not...

Oh, I see. You have to restart the runtime to get the same initial trainable variables. I can hopefully provide a work-around for that too.

So, the solution for getting the same initial trainable variables every time you run the block of code that starts with the definition of `summarize_keras_weights` is to call `tf.random.set_seed` at...

And ... solved. By removing `from_logits=True` from the constructor of `tf.keras.losses.SparseCategoricalCrossentropy()` I was able to get the same trainable variables after both runs. ``` ### Before training: ### Summary of...

Please confirm that your issue has been solved. Train your model for much longer, at least for one whole epoch, and confirm that it's getting the accuracy you expect while...

> Could you have a look at this? Will do. > Almost looks like Keras is doing some non-deterministic operations in between epochs. These between-epoch issues are common and there...

Running in colab, with my old copy of your code (with the fixes), I'm now no longer seeing reproducibility on 5 steps in one epoch on GPU. This is very...

Just to recap where we're at and the solutions we have: 1. Using `tf.random.set_seed`, reset TensorFlow's PRNG before initializing trainable variables. 2. Set `TF_DETERMINISTIC_OPS=1` to enable all deterministic ops in...
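The first two items of the recap can be sketched in a few lines. This is a minimal sketch, not the exact code from the thread: the helper name `configure_determinism` and the default seed value are hypothetical, and the import is guarded so the snippet also runs where TensorFlow is not installed. Setting the environment variable before TensorFlow is imported is a conservative choice, assumed here rather than taken from the comments above.

```python
import os


def configure_determinism(seed=42):
    """Hypothetical helper: apply the seed and env-var steps from the recap.

    Returns True if TensorFlow was found and its PRNG was reseeded,
    False if TensorFlow is not installed (only the env var is set then).
    """
    # Recap item 2: request deterministic op implementations.
    # Set before importing TensorFlow so the setting is already in place.
    os.environ["TF_DETERMINISTIC_OPS"] = "1"

    try:
        import tensorflow as tf
    except ImportError:
        return False

    # Recap item 1: reset TensorFlow's PRNG. Call this before the
    # trainable variables are created, as described in the recap.
    tf.random.set_seed(seed)
    return True


if __name__ == "__main__":
    reseeded = configure_determinism(seed=123)
    print("TF_DETERMINISTIC_OPS =", os.environ["TF_DETERMINISTIC_OPS"])
    print("TensorFlow PRNG reseeded:", reseeded)
```

Calling the helper again immediately before re-running the model-building code mirrors the earlier observation that the seed has to be reset each time, instead of restarting the runtime.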