finetune-transformer-lm
About the non-determinism due to GPU ops
Hi,
I understand that there is non-determinism due to GPU ops, and I observed this as well: running the same code twice on the same GPU gave significantly different results. However, I was wondering why the PyTorch re-implementation https://github.com/huggingface/pytorch-openai-transformer-lm actually gives the same results when run twice in a row. Could it be that I am using a "wrong" version of TF? I have tensorflow-gpu 1.4.0, python 3.6, cuda 8.0 and cudnn 6.0. Thanks!
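For context, here is a minimal sketch of the kind of seeding and determinism settings I have in mind for the two frameworks (just the standard knobs, not code from either repo; even with these, some cuDNN/GPU ops may still be non-deterministic):

```python
import random

import numpy as np

SEED = 42  # arbitrary example seed

# PyTorch side: seed all RNGs and ask cuDNN for deterministic algorithms.
import torch

random.seed(SEED)
np.random.seed(SEED)
torch.manual_seed(SEED)
torch.cuda.manual_seed_all(SEED)
torch.backends.cudnn.deterministic = True  # prefer deterministic cuDNN kernels
torch.backends.cudnn.benchmark = False     # disable autotuning, which can vary between runs

# TensorFlow 1.x side: graph-level seed; this does not make every GPU op deterministic.
import tensorflow as tf

tf.set_random_seed(SEED)
```

My understanding is that the PyTorch flags above constrain cuDNN to deterministic kernels, whereas in TF 1.4 the graph-level seed alone does not, which might explain the difference I am seeing, but I would appreciate confirmation.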