neuraltalk2
THCTensorMathPointwise.cu line=40 error=8 : invalid device function
I get the error below when running the following command, so I am not able to run the NeuralTalk2 demo.
th eval.lua -model ~/model_id1-501-1448236541.t7 -image_folder ~/iot/images/ -num_images 10
DataLoaderRaw loading images from folder: /home/.../iot/images/
listing all images in directory /home/.../iot/images/
DataLoaderRaw found 10 images
constructing clones inside the LanguageModel
THCudaCheck FAIL file=/tmp/luarocks_cutorch-scm-1-5816/cutorch/lib/THC/generic/THCTensorMathPointwise.cu line=40 error=8 : invalid device function
/home/ptcuser/torch/install/bin/luajit: ./misc/net_utils.lua:75: cuda runtime error (8) : invalid device function at /tmp/luarocks_cutorch-scm-1-5816/cutorch/lib/THC/generic/THCTensorMathPointwise.cu:40
stack traceback:
    [C]: in function 'add'
    ./misc/net_utils.lua:75: in function 'prepro'
    eval.lua:117: in function 'eval_split'
    eval.lua:173: in main chunk
    [C]: in function 'dofile'
    ...user/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:145: in main chunk
    [C]: at 0x00406670
Thanks in advance!!
Hi, I have the same issue. Is it related to Torch installation, or rather the GPU compute capability?
I'm also hitting a very similar invalid device function error. When I disable cuda() and CudaTensors, some of the problems go away, but that isn't a viable fix, since I need to use CUDA eventually.
I have a similar problem:
THCudaCheck FAIL file=/tmp/luarocks_cutorch-scm-1-5062/cutorch/lib/THC/THCTensorCopy.cu line=205 error=8 : invalid device function
/home/nady/newriver/torch/install/bin/luajit: ./misc/net_utils.lua:31: cuda runtime error (8) : invalid device function at /tmp/luarocks_cutorch-scm-1-5062/cutorch/lib/THC/THCTensorCopy.cu:205
stack traceback:
    [C]: in function 'copy'
    ./misc/net_utils.lua:31: in function 'build_cnn'
    train.lua:122: in main chunk
You should definitely check your NVIDIA drivers and whether your GPUs are compatible with your code. Make sure all the right dependencies (CUDA, etc.) are installed properly. It may also help to reinstall Torch.
Did this ever find a resolution?
I've got 2 GPUs in my machine. One is a GTX 980ti and the other is a newer GTX 1070
If I run it on the 980ti it works, but on the GTX 1070 I get the:
THCTensorMathPointwise.cu error=8 : invalid device function
I'm on Ubuntu 16.04 with CUDA 8, cuDNN 5.1, and driver version 367.57.
So, for anybody who comes across this: the solution was to re-install cutorch and cunn.
It seems that luarocks compiles them only for the cards found in the machine at the time Torch is installed. In my case, I had originally installed Torch with just my GTX 980ti card (which has compute capability 5.2).
The GTX 1070 (and probably the 1060 & 1080) has compute capability 6.1, so no kernels had been built for it.
The problem went away after I did:
luarocks install cutorch
luarocks install cunn
This recompiles them with the appropriate compute capabilities.
Here's some of the verbiage from the compile. Notice how it is 'autodetecting' the various card capabilities.
~~~ snip ~~~
-- MAGMA not found. Compiling without MAGMA support
-- Autodetected CUDA architecture(s): 6.1 3.5 5.2
~~~ snip ~~~
See here for more details
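If you save the build output (for example with `luarocks install cutorch 2>&1 | tee cutorch-build.log`, where the log file name is just my choice), you can check that your card's compute capability made it into the autodetected list. This is only a minimal sketch: `arch_in_build_log` is a hypothetical helper I made up, not part of Torch or luarocks, and it assumes the log contains the "Autodetected CUDA architecture(s)" line in the format shown above.

```shell
# Hypothetical helper (not part of Torch or luarocks).
# Reads a build log on stdin and succeeds if the compute capability
# given as $1 (e.g. 6.1) is listed on the
# "Autodetected CUDA architecture(s): ..." line.
arch_in_build_log() {
    want="$1"
    # Keep only the space-separated list that follows the colon.
    archs=$(sed -n 's/.*Autodetected CUDA architecture(s): *//p')
    for a in $archs; do
        if [ "$a" = "$want" ]; then
            return 0
        fi
    done
    return 1
}

# Example with the line from the compile output above.
printf '%s\n' '-- Autodetected CUDA architecture(s): 6.1 3.5 5.2' \
    | arch_in_build_log 6.1 && echo "6.1 is included"
```

With a saved log, the call would be something like `arch_in_build_log 6.1 < cutorch-build.log`. If it fails for the card you want to use, that card was probably not visible when the package was built.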
@filmo Thanks. I ran into the same issue when I moved my container between VMs. Reinstalling works.
You may run into a compiler error during the reinstall; in that case you need to reinstall torch first:
luarocks install torch