
What version(s) of CUDA and cuDNN are supported/recommended?

Open 3DTOPO opened this issue 7 years ago • 4 comments

I am trying to get fast-neural-style running on macOS with an NVIDIA GeForce GTX 1060.

I was able to get everything built using CUDA 9.1 and cuDNN 5.1, but I get the following out-of-memory error when trying to run on the GPU:

THCudaCheck FAIL file=/tmp/luarocks_cutorch-scm-1-5631/cutorch/lib/THC/generic/THCStorage.cu line=66 error=2 : out of memory
THNN.lua:110: cuda runtime error (2) : out of memory at /tmp/luarocks_cutorch-scm-1-5631/cutorch/lib/THC/generic/THCStorage.cu:66

Since the luarocks cudnn package seemed to require cuDNN 5.1, I decided to try installing CUDA 8.0 (since it appears cuDNN 5.1 is built for CUDA 7.5 and 8.0). But when I try to build cutorch against CUDA 8.0, the config script hangs. I am guessing 8.0 doesn't support my card?

Anyhow, can anyone clarify which versions of CUDA and cuDNN I should be using? Is there a way to disable cuDNN (after it has been installed) to see if it works without it?

Thanks!

3DTOPO avatar Dec 13 '17 22:12 3DTOPO

P.S. The training run I was attempting when it ran out of memory should have taken around 3GB, and the card has 6GB. I also tried setting batch_size to 1 instead of 4 with a 256px training set, which should have used even less than 3GB.

3DTOPO avatar Dec 13 '17 22:12 3DTOPO

It seems that different combinations of CUDA, cuDNN, Torch7, and possibly OS versions perform differently in ways one might not expect: https://github.com/jcjohnson/neural-style/issues/429

Newer versions of CUDA/cuDNN/Torch7 seem to use more memory than previous versions.

ProGamerGov avatar Jan 09 '18 22:01 ProGamerGov

FYI: I am training with batch_size 2 and style_image_size 256 on a GTX 1060 with 3GB. I use Ubuntu with nvidia-390, nvidia-cuda-dev v7.5.18, libcudnn7 7.0.5.15, and cuda_9.1.85.

flaushi avatar Feb 15 '18 11:02 flaushi

However, I have to say that exactly this configuration has problems with instance normalization! The eccv-models work well, and training without instance normalization also seems to give reasonable results.

flaushi avatar Mar 14 '18 16:03 flaushi
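
For anyone comparing memory use across the CUDA/cuDNN combinations discussed in this thread, here is a minimal Python sketch that reads used and total GPU memory while training is running. It assumes nvidia-smi is installed and on the PATH (likely on a Linux setup like the one above, probably not on macOS), and the function names parse_memory_csv, query_gpu_memory, and main are hypothetical helpers, not part of fast-neural-style.

```python
import subprocess


def parse_memory_csv(text):
    """Parse lines of 'used, total' (MiB) into a list of (used, total) ints.

    Expects the output format of:
      nvidia-smi --query-gpu=memory.used,memory.total --format=csv,noheader,nounits
    Raises ValueError if a field is not an integer (e.g. '[N/A]').
    """
    readings = []
    for line in text.strip().splitlines():
        used, total = (int(field.strip()) for field in line.split(","))
        readings.append((used, total))
    return readings


def query_gpu_memory():
    """Run nvidia-smi (assumed to be on PATH) and return one reading per GPU."""
    output = subprocess.check_output(
        [
            "nvidia-smi",
            "--query-gpu=memory.used,memory.total",
            "--format=csv,noheader,nounits",
        ],
        text=True,
    )
    return parse_memory_csv(output)


def main():
    # Print one line per GPU; run this in a second terminal during training.
    for index, (used, total) in enumerate(query_gpu_memory()):
        print(f"GPU {index}: {used} MiB used of {total} MiB")
```

Calling main() once per batch_size or library combination gives numbers that can be compared directly, which may help tell a genuine memory shortfall apart from a version mismatch.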