texture_nets

invalid device ordinal

Open wq409813230 opened this issue 7 years ago • 6 comments

I have installed all the dependencies that this repository needs, but something goes wrong when running the command below:

th test.lua -input_image /data/artwork/content/huaban.jpeg -model_t7 data/checkpoints/model.t7 -gpu 0

THCudaCheck FAIL file=/tmp/luarocks_cutorch-scm-1-3022/cutorch/init.c line=734 error=10 : invalid device ordinal
/root/AI/torch/install/bin/luajit: test.lua:26: cuda runtime error (10) : invalid device ordinal at /tmp/luarocks_cutorch-scm-1-3022/cutorch/init.c:734
stack traceback:
[C]: in function 'setDevice'
test.lua:26: in main chunk
[C]: in function 'dofile'
...t/AI/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:150: in main chunk
[C]: at 0x00406670

Below is my GPU info:

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 375.26                 Driver Version: 375.26                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Quadro P5000        Off  | 0000:03:00.0      On |                  Off |
| 26%   39C    P8     8W / 180W |    110MiB / 16264MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID  Type  Process name                               Usage      |
|=============================================================================|
|    0      1387    G   /usr/lib/xorg/Xorg                             108MiB |
+-----------------------------------------------------------------------------+

I really have no idea where the problem is.

wq409813230 avatar Jun 22 '17 02:06 wq409813230

Hi, I think Torch enumerates GPUs starting from 1. If you have only one GPU, you can omit this argument.

-- Best, Dmitry

DmitryUlyanov avatar Jun 22 '17 04:06 DmitryUlyanov

Hi Dmitry, thank you for your reply, but it still fails when I omit the -gpu argument. What confuses me is that chainer-fast-neuralstyle, which is implemented in Python, also has a '-gpu' argument, and it runs fine when I set -gpu 0.

[Screenshot: qq 20170622160836]

wq409813230 avatar Jun 22 '17 08:06 wq409813230

Hi, this issue still persists. Has anyone found a solution for it?

engahmed1190 avatar Oct 19 '17 19:10 engahmed1190

The GPU index starts from 1. Please try the option -gpu 1 instead of -gpu 0.
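The index shift suggested in the replies above can be sketched as a small shell helper. This is only an illustration: `torch_gpu_index` is a hypothetical function, not part of texture_nets, and it assumes the script passes `-gpu` straight to `cutorch.setDevice`, which the cutorch documentation describes as one-based. Later comments in this thread report the error with every GPU id, so this may not be the whole story.

```shell
# Hypothetical helper (not part of texture_nets): map a zero-based
# nvidia-smi GPU index to the one-based index suggested in this thread.
torch_gpu_index() {
  case "$1" in
    ''|*[!0-9]*)
      # Reject empty or non-numeric input instead of silently computing.
      echo "expected a non-negative integer, got: '$1'" >&2
      return 1
      ;;
  esac
  echo $(( $1 + 1 ))
}

# Example: the Quadro P5000 is listed as GPU 0 by nvidia-smi, so this prints 1.
torch_gpu_index 0
```

With this helper, the command from the original report would end in `-gpu "$(torch_gpu_index 0)"` rather than `-gpu 0`.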

gxlcliqi avatar Nov 06 '17 08:11 gxlcliqi

I also get this error, whatever GPU id I input. cuDNN works fine with Chainer.

My setup info: Ubuntu 16.04, Torch7, CUDA 9.2, cuDNN 7.1.4

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 396.26                 Driver Version: 396.26                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 970     Off  | 00000000:01:00.0  On |                  N/A |
|  0%   46C    P8    17W / 163W |    455MiB /  4040MiB |      1%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID  Type  Process name                               Usage      |
|=============================================================================|
|    0       958    G   /usr/lib/xorg/Xorg                             287MiB |
|    0      1897    G   compiz                                         164MiB |
+-----------------------------------------------------------------------------+

I think it might be because Torch7 targets cuDNN R5 by default?

I had to run `git clone https://github.com/soumith/cudnn.torch.git -b R7 && cd cudnn.torch && luarocks make cudnn-scm-1.rockspec` to get cuDNN 7 recognized by Torch, and had to redo `luarocks install cunn` and `luarocks install cutorch` after that, but now I get this same "invalid device ordinal" error.

Maybe there is some sort of version mismatch between cudnn, cunn and cutorch? I don't know where versions of cunn and cutorch compatible with cudnn.torch R7 might be located. Does anyone have any clue?
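One way to narrow down a suspected mismatch like this is to print the version of each layer side by side. The sketch below is an assumption about what is useful to check, not a fix from this thread. `report_gpu_stack` is a hypothetical helper name, and it skips any tool that is not installed.

```shell
# Hypothetical diagnostic helper: print the driver, CUDA toolkit and
# Torch rock versions, skipping any tool that is missing.
report_gpu_stack() {
  echo "== driver and GPUs (nvidia-smi) =="
  if command -v nvidia-smi >/dev/null 2>&1; then
    nvidia-smi --query-gpu=index,name,driver_version --format=csv \
      || echo "nvidia-smi failed"
  else
    echo "nvidia-smi not found"
  fi

  echo "== CUDA toolkit (nvcc) =="
  if command -v nvcc >/dev/null 2>&1; then
    nvcc --version || echo "nvcc failed"
  else
    echo "nvcc not found"
  fi

  echo "== Torch rocks (luarocks) =="
  if command -v luarocks >/dev/null 2>&1; then
    # Each rock name is followed by its installed version on the next line.
    luarocks list | grep -i -A1 -E '^(cutorch|cunn|cudnn)$' \
      || echo "no cutorch/cunn/cudnn rocks listed"
  else
    echo "luarocks not found"
  fi
}

report_gpu_stack
```

Comparing the CUDA version reported by `nvcc` with the one cutorch was built against is the point of the exercise. If the rocks were compiled before a CUDA upgrade, they would need to be rebuilt.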

I'm not used to Ubuntu and Lua :S

psenough avatar Jul 16 '18 05:07 psenough

Found https://github.com/torch/cutorch/issues and yeah, it doesn't look like they support CUDA 9 yet; that's probably the issue here, I think. :/ If anyone has any other insights beyond "try downgrading", I'd appreciate the input.

psenough avatar Jul 16 '18 06:07 psenough