
Very slow on GPU

Open EvGe22 opened this issue 7 years ago • 1 comment

I'm trying to train the net on a Tesla K80 and the performance is pretty sad: around 8 seconds per iteration. I've seen the same issue reported in the original Theano implementation repo, where the problem turned out to be either float precision or the BLAS library. I have a BLAS library in place, and I don't think the float precision issue applies here since the TensorFlow code already uses float32 everywhere. What do you think could cause this?
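One thing I could try first is checking whether the ops are actually being placed on the GPU. This is just a minimal, generic TF 1.x sketch (not the repo's actual training script) using device placement logging:

```python
import tensorflow as tf

# Generic TF 1.x sketch: log where each op is placed so we can confirm
# the heavy matmul/RNN ops actually land on /gpu:0 and not on the CPU.
config = tf.ConfigProto(log_device_placement=True)
config.gpu_options.allow_growth = True  # don't grab all GPU memory up front

with tf.Session(config=config) as sess:
    # Build or restore the model graph here, run one training step,
    # and inspect the placement log printed to stderr.
    pass
```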

I've also encountered a problem where training crashes while running on the validation set and/or saving the model, and it's a bit random: it crashed with eval_every=10000 and 5000, but not with eval_every=100. I can't provide any error messages since I didn't capture any. I don't think it's GPU memory, since I'm using two K80s with 24 GB in total, and RAM isn't the issue either; I've got plenty of it. Any ideas?
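To at least capture the error next time, I'm thinking of wrapping the eval/save step in a try/except that logs the traceback to a file. This is only a sketch with hypothetical run_validation/save_model hooks standing in for whatever the training loop calls at each eval_every step:

```python
import logging
import traceback

logging.basicConfig(filename="train_crash.log", level=logging.INFO)

def checkpoint_step(sess, step):
    try:
        run_validation(sess)    # assumed validation routine
        save_model(sess, step)  # assumed checkpointing routine
    except Exception:
        # Persist the full traceback so the "random" crash leaves a trace.
        logging.error("Crash at step %d:\n%s", step, traceback.format_exc())
        raise
```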

Just saving the model without running the validation set at eval_every=1500 works for now; I haven't gotten any errors yet.

The CUDA version is 7.5 and cuDNN is 5.1.3. Should I just update them? The TF version is r1.3, Python is 3.6, and the libraries are from the latest Anaconda3.
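For reference, a quick environment check I can run to confirm the installed TF build was compiled with CUDA support and actually sees both K80s (a generic TF 1.x sketch):

```python
import tensorflow as tf
from tensorflow.python.client import device_lib

# Print the installed build info and the devices TF can see.
print("TF version:", tf.__version__)
print("Built with CUDA:", tf.test.is_built_with_cuda())
print([d.name for d in device_lib.list_local_devices()])
```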

EvGe22 avatar Jul 26 '17 12:07 EvGe22