Error in CuDNN: CUDNN_STATUS_ALLOC_FAILED

Open ProGamerGov opened this issue 9 years ago • 2 comments

On a fresh install of Neural-Style, I have been trying to use CuDNN, but I am receiving the following error: Error in CuDNN: CUDNN_STATUS_ALLOC_FAILED

The full logs are below:

user@user-XPS-8500:~/neural-style$ th neural_style.lua -image_size 256 -gpu 0 -backend cudnn
[libprotobuf WARNING google/protobuf/io/coded_stream.cc:505] Reading dangerously large protocol message.  If the message turns out to be larger than 1073741824 bytes, parsing will be halted for security reasons.  To increase the limit (or to disable these warnings), see CodedInputStream::SetTotalBytesLimit() in google/protobuf/io/coded_stream.h.
[libprotobuf WARNING google/protobuf/io/coded_stream.cc:78] The total number of bytes read was 574671192
Successfully loaded models/VGG_ILSVRC_19_layers.caffemodel
conv1_1: 64 3 3 3
conv1_2: 64 64 3 3
conv2_1: 128 64 3 3
conv2_2: 128 128 3 3
conv3_1: 256 128 3 3
conv3_2: 256 256 3 3
conv3_3: 256 256 3 3
conv3_4: 256 256 3 3
conv4_1: 512 256 3 3
conv4_2: 512 512 3 3
conv4_3: 512 512 3 3
conv4_4: 512 512 3 3
conv5_1: 512 512 3 3
conv5_2: 512 512 3 3
conv5_3: 512 512 3 3
conv5_4: 512 512 3 3
fc6: 1 1 25088 4096
fc7: 1 1 4096 4096
fc8: 1 1 4096 1000
Setting up style layer      2   :   relu1_1 
Setting up style layer      7   :   relu2_1 
Setting up style layer      12  :   relu3_1 
Setting up style layer      21  :   relu4_1 
THCudaCheck FAIL file=/tmp/luarocks_cutorch-scm-1-2986/cutorch/lib/THC/generic/THCStorage.cu line=40 error=2 : out of memory
/home/user/torch/install/bin/luajit: ...torch/install/share/lua/5.1/cudnn/SpatialConvolution.lua:142: cuda runtime error (2) : out of memory at /tmp/luarocks_cutorch-scm-1-2986/cutorch/lib/THC/generic/THCStorage.cu:40
stack traceback:
    [C]: in function 'resize'
    ...torch/install/share/lua/5.1/cudnn/SpatialConvolution.lua:142: in function 'createIODescriptors'
    ...torch/install/share/lua/5.1/cudnn/SpatialConvolution.lua:364: in function 'updateOutput'
    /home/user/torch/install/share/lua/5.1/nn/Sequential.lua:44: in function 'forward'
    neural_style.lua:204: in function 'main'
    neural_style.lua:500: in main chunk
    [C]: in function 'dofile'
    .../user/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:145: in main chunk
    [C]: at 0x00406670
user@user-XPS-8500:~/neural-style$ 

and

user@user-XPS-8500:~/neural-style$ th neural_style.lua -image_size 256 -gpu 0 -backend cudnn -cudnn_autotune -optimizer lbfgs
[libprotobuf WARNING google/protobuf/io/coded_stream.cc:505] Reading dangerously large protocol message.  If the message turns out to be larger than 1073741824 bytes, parsing will be halted for security reasons.  To increase the limit (or to disable these warnings), see CodedInputStream::SetTotalBytesLimit() in google/protobuf/io/coded_stream.h.
[libprotobuf WARNING google/protobuf/io/coded_stream.cc:78] The total number of bytes read was 574671192
Successfully loaded models/VGG_ILSVRC_19_layers.caffemodel
conv1_1: 64 3 3 3
conv1_2: 64 64 3 3
conv2_1: 128 64 3 3
conv2_2: 128 128 3 3
conv3_1: 256 128 3 3
conv3_2: 256 256 3 3
conv3_3: 256 256 3 3
conv3_4: 256 256 3 3
conv4_1: 512 256 3 3
conv4_2: 512 512 3 3
conv4_3: 512 512 3 3
conv4_4: 512 512 3 3
conv5_1: 512 512 3 3
conv5_2: 512 512 3 3
conv5_3: 512 512 3 3
conv5_4: 512 512 3 3
fc6: 1 1 25088 4096
fc7: 1 1 4096 4096
fc8: 1 1 4096 1000
Setting up style layer      2   :   relu1_1 
Setting up style layer      7   :   relu2_1 
Setting up style layer      12  :   relu3_1 
/home/user/torch/install/bin/luajit: /home/user/torch/install/share/lua/5.1/cudnn/init.lua:58: Error in CuDNN: CUDNN_STATUS_ALLOC_FAILED
stack traceback:
    [C]: in function 'error'
    /home/user/torch/install/share/lua/5.1/cudnn/init.lua:58: in function 'errcheck'
    ...torch/install/share/lua/5.1/cudnn/SpatialConvolution.lua:186: in function 'createIODescriptors'
    ...torch/install/share/lua/5.1/cudnn/SpatialConvolution.lua:364: in function 'updateOutput'
    /home/user/torch/install/share/lua/5.1/nn/Sequential.lua:44: in function 'forward'
    neural_style.lua:204: in function 'main'
    neural_style.lua:500: in main chunk
    [C]: in function 'dofile'
    .../user/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:145: in main chunk
    [C]: at 0x00406670
user@user-XPS-8500:~/neural-style$ 

and

user@user-XPS-8500:~/neural-style$ th neural_style.lua -image_size 256 -gpu 0 -backend cudnn -cudnn_autotune -optimizer adam
[libprotobuf WARNING google/protobuf/io/coded_stream.cc:505] Reading dangerously large protocol message.  If the message turns out to be larger than 1073741824 bytes, parsing will be halted for security reasons.  To increase the limit (or to disable these warnings), see CodedInputStream::SetTotalBytesLimit() in google/protobuf/io/coded_stream.h.
[libprotobuf WARNING google/protobuf/io/coded_stream.cc:78] The total number of bytes read was 574671192
Successfully loaded models/VGG_ILSVRC_19_layers.caffemodel
conv1_1: 64 3 3 3
conv1_2: 64 64 3 3
conv2_1: 128 64 3 3
conv2_2: 128 128 3 3
conv3_1: 256 128 3 3
conv3_2: 256 256 3 3
conv3_3: 256 256 3 3
conv3_4: 256 256 3 3
conv4_1: 512 256 3 3
conv4_2: 512 512 3 3
conv4_3: 512 512 3 3
conv4_4: 512 512 3 3
conv5_1: 512 512 3 3
conv5_2: 512 512 3 3
conv5_3: 512 512 3 3
conv5_4: 512 512 3 3
fc6: 1 1 25088 4096
fc7: 1 1 4096 4096
fc8: 1 1 4096 1000
Setting up style layer      2   :   relu1_1 
Setting up style layer      7   :   relu2_1 
Setting up style layer      12  :   relu3_1 
/home/user/torch/install/bin/luajit: /home/user/torch/install/share/lua/5.1/cudnn/init.lua:58: Error in CuDNN: CUDNN_STATUS_ALLOC_FAILED
stack traceback:
    [C]: in function 'error'
    /home/user/torch/install/share/lua/5.1/cudnn/init.lua:58: in function 'errcheck'
    ...torch/install/share/lua/5.1/cudnn/SpatialConvolution.lua:186: in function 'createIODescriptors'
    ...torch/install/share/lua/5.1/cudnn/SpatialConvolution.lua:364: in function 'updateOutput'
    /home/user/torch/install/share/lua/5.1/nn/Sequential.lua:44: in function 'forward'
    neural_style.lua:204: in function 'main'
    neural_style.lua:500: in main chunk
    [C]: in function 'dofile'
    .../user/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:145: in main chunk
    [C]: at 0x00406670
user@user-XPS-8500:~/neural-style$ 

and finally:

user@user-XPS-8500:~$ nvidia-smi
Thu Mar  3 19:58:45 2016       
+------------------------------------------------------+                       
| NVIDIA-SMI 352.63     Driver Version: 352.63         |                       
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 660     Off  | 0000:01:00.0     N/A |                  N/A |
| 28%   38C    P8    N/A /  N/A |    289MiB /  1532MiB |     N/A      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID  Type  Process name                               Usage      |
|=============================================================================|
|    0                  Not Supported                                         |
+-----------------------------------------------------------------------------+
user@user-XPS-8500:~$ 
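For reference, the first traceback above reports a CUDA out-of-memory error, so it may help to work out how much GPU memory was free according to this nvidia-smi output. The following is only a rough sketch: it parses the memory line copied verbatim from the table above rather than querying the GPU, and the variable names (`line`, `used`, `total`, `free`) are made up for illustration.

```shell
#!/bin/sh
# Memory line copied verbatim from the nvidia-smi table in this issue
# (hard-coded here as an assumption; no GPU is queried).
line='| 28%   38C    P8    N/A /  N/A |    289MiB /  1532MiB |     N/A      Default |'

# Splitting on '|' makes field 3 the "used / total" memory column.
# Within that field, split on '/' and strip everything except digits.
used=$(printf '%s\n' "$line" | awk -F'|' '{ split($3, m, "/"); gsub(/[^0-9]/, "", m[1]); print m[1] }')
total=$(printf '%s\n' "$line" | awk -F'|' '{ split($3, m, "/"); gsub(/[^0-9]/, "", m[2]); print m[2] }')

# Free memory is simply total minus used, in MiB.
free=$((total - used))
echo "used=${used}MiB total=${total}MiB free=${free}MiB"
```

By this reading, 289 MiB of the card's 1532 MiB was already in use before neural-style started, presumably by the display, which leaves roughly 1243 MiB for the model and its intermediate buffers.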

Not sure if my card is the problem?

ProGamerGov, Mar 07 '16 17:03

This issue appears related to issue https://github.com/jcjohnson/neural-style/issues/116

ProGamerGov, Mar 07 '16 17:03

You need to increase your DDR3 or DDR4 RAM to 32GB, 64GB or 128GB. This is not about the memory of the GPU.

jameswan, Nov 26 '18 10:11