Selecting GPU??

Open AskAlice opened this issue 7 years ago • 4 comments

Hi, I am getting this error:

alice@alice-pc:~/waifu2x$ th waifu2x.lua -force_cudnn 1

/home/alice/torch/install/bin/luajit: /home/alice/torch/install/share/lua/5.1/nn/Container.lua:67: 
In 2 module of nn.Sequential:
/home/alice/torch/install/share/lua/5.1/nn/LeakyReLU.lua:19: attempt to index field 'THNN' (a nil value)
stack traceback:
	/home/alice/torch/install/share/lua/5.1/nn/LeakyReLU.lua:19: in function </home/alice/torch/install/share/lua/5.1/nn/LeakyReLU.lua:18>
	[C]: in function 'xpcall'
	/home/alice/torch/install/share/lua/5.1/nn/Container.lua:63: in function 'rethrowErrors'
	/home/alice/torch/install/share/lua/5.1/nn/Sequential.lua:44: in function 'forward'
	lib/reconstruct.lua:47: in function 'reconstruct_nn'
	lib/reconstruct.lua:167: in function 'scale_rgb'
	lib/reconstruct.lua:207: in function 'scale_f'
	waifu2x.lua:99: in function 'convert_image'
	waifu2x.lua:291: in function 'waifu2x'
	waifu2x.lua:296: in main chunk
	[C]: in function 'dofile'
	...lice/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:145: in main chunk
	[C]: at 0x556172e3b470

WARNING: If you see a stack trace below, it doesn't point to the place where this error occurred. Please use only the one above.
stack traceback:
	[C]: in function 'error'
	/home/alice/torch/install/share/lua/5.1/nn/Container.lua:67: in function 'rethrowErrors'
	/home/alice/torch/install/share/lua/5.1/nn/Sequential.lua:44: in function 'forward'
	lib/reconstruct.lua:47: in function 'reconstruct_nn'
	lib/reconstruct.lua:167: in function 'scale_rgb'
	lib/reconstruct.lua:207: in function 'scale_f'
	waifu2x.lua:99: in function 'convert_image'
	waifu2x.lua:291: in function 'waifu2x'
	waifu2x.lua:296: in main chunk
	[C]: in function 'dofile'
	...lice/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:145: in main chunk
	[C]: at 0x556172e3b470

If I run nvidia-smi, I get this:

alice@alice-pc:~/waifu2x$ nvidia-smi
Thu Mar 30 21:51:35 2017       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 367.57                 Driver Version: 367.57                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 660     Off  | 0000:01:00.0     N/A |                  N/A |
| 37%   53C    P0    N/A /  N/A |   1289MiB /  1990MiB |     N/A      Default |
+-------------------------------+----------------------+----------------------+
|   1  GeForce GTX 1080    Off  | 0000:02:00.0     Off |                  N/A |
|  0%   45C    P0    44W / 180W |      0MiB /  8113MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID  Type  Process name                               Usage      |
|=============================================================================|
|    0                  Not Supported                                         |
+-----------------------------------------------------------------------------+
alice@alice-pc:~/waifu2x$ lspci -k | grep VGA
01:00.0 VGA compatible controller: NVIDIA Corporation GK106 [GeForce GTX 660] (rev a1)
02:00.0 VGA compatible controller: NVIDIA Corporation GP104 [GeForce GTX 1080] (rev a1)

The error seems to be a little different if I don't force cuDNN, but it is still similar. For reference, I am able to run neural-style when I use its arguments to select the GTX 1080.

AskAlice avatar Mar 31 '17 01:03 AskAlice

The -gpu option is supported in the dev branch.

git fetch
git checkout -b dev origin/dev
th waifu2x.lua -gpu 2

nagadomi avatar Mar 31 '17 18:03 nagadomi

The bad news is that I still get that error :laughing: The good news is that the -gpu option seems to work.

alice@alice-pc:~/waifu2x$ th waifu2x.lua -gpu 0 -force_cudnn 1
THCudaCheck FAIL file=/tmp/luarocks_cutorch-scm-1-3035/cutorch/init.c line=722 error=10 : invalid device ordinal
/home/alice/torch/install/bin/luajit: waifu2x.lua:279: cuda runtime error (10) : invalid device ordinal at /tmp/luarocks_cutorch-scm-1-3035/cutorch/init.c:722
stack traceback:
	[C]: in function 'setDevice'
	waifu2x.lua:279: in function 'waifu2x'
	waifu2x.lua:298: in main chunk
	[C]: in function 'dofile'
	...lice/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:145: in main chunk
	[C]: at 0x55a17b5e4470



alice@alice-pc:~/waifu2x$ th waifu2x.lua -gpu 1 -force_cudnn 1

/home/alice/torch/install/bin/luajit: /home/alice/torch/install/share/lua/5.1/nn/Container.lua:67: 
In 2 module of nn.Sequential:
/home/alice/torch/install/share/lua/5.1/nn/LeakyReLU.lua:19: attempt to index field 'THNN' (a nil value)
stack traceback:
	/home/alice/torch/install/share/lua/5.1/nn/LeakyReLU.lua:19: in function </home/alice/torch/install/share/lua/5.1/nn/LeakyReLU.lua:18>
	[C]: in function 'xpcall'
	/home/alice/torch/install/share/lua/5.1/nn/Container.lua:63: in function 'rethrowErrors'
	/home/alice/torch/install/share/lua/5.1/nn/Sequential.lua:44: in function 'forward'
	lib/reconstruct.lua:56: in function 'reconstruct_nn'
	lib/reconstruct.lua:181: in function 'scale_rgb'
	lib/reconstruct.lua:227: in function 'scale_f'
	waifu2x.lua:99: in function 'convert_image'
	waifu2x.lua:293: in function 'waifu2x'
	waifu2x.lua:298: in main chunk
	[C]: in function 'dofile'
	...lice/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:145: in main chunk
	[C]: at 0x559e89a3c470

WARNING: If you see a stack trace below, it doesn't point to the place where this error occurred. Please use only the one above.
stack traceback:
	[C]: in function 'error'
	/home/alice/torch/install/share/lua/5.1/nn/Container.lua:67: in function 'rethrowErrors'
	/home/alice/torch/install/share/lua/5.1/nn/Sequential.lua:44: in function 'forward'
	lib/reconstruct.lua:56: in function 'reconstruct_nn'
	lib/reconstruct.lua:181: in function 'scale_rgb'
	lib/reconstruct.lua:227: in function 'scale_f'
	waifu2x.lua:99: in function 'convert_image'
	waifu2x.lua:293: in function 'waifu2x'
	waifu2x.lua:298: in main chunk
	[C]: in function 'dofile'
	...lice/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:145: in main chunk
	[C]: at 0x559e89a3c470



alice@alice-pc:~/waifu2x$ th waifu2x.lua -gpu 2 -force_cudnn 1

/home/alice/torch/install/bin/luajit: /home/alice/torch/install/share/lua/5.1/nn/Container.lua:67: 
In 2 module of nn.Sequential:
/home/alice/torch/install/share/lua/5.1/nn/LeakyReLU.lua:19: attempt to index field 'THNN' (a nil value)
stack traceback:
	/home/alice/torch/install/share/lua/5.1/nn/LeakyReLU.lua:19: in function </home/alice/torch/install/share/lua/5.1/nn/LeakyReLU.lua:18>
	[C]: in function 'xpcall'
	/home/alice/torch/install/share/lua/5.1/nn/Container.lua:63: in function 'rethrowErrors'
	/home/alice/torch/install/share/lua/5.1/nn/Sequential.lua:44: in function 'forward'
	lib/reconstruct.lua:56: in function 'reconstruct_nn'
	lib/reconstruct.lua:181: in function 'scale_rgb'
	lib/reconstruct.lua:227: in function 'scale_f'
	waifu2x.lua:99: in function 'convert_image'
	waifu2x.lua:293: in function 'waifu2x'
	waifu2x.lua:298: in main chunk
	[C]: in function 'dofile'
	...lice/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:145: in main chunk
	[C]: at 0x561c5ce0d470

WARNING: If you see a stack trace below, it doesn't point to the place where this error occurred. Please use only the one above.
stack traceback:
	[C]: in function 'error'
	/home/alice/torch/install/share/lua/5.1/nn/Container.lua:67: in function 'rethrowErrors'
	/home/alice/torch/install/share/lua/5.1/nn/Sequential.lua:44: in function 'forward'
	lib/reconstruct.lua:56: in function 'reconstruct_nn'
	lib/reconstruct.lua:181: in function 'scale_rgb'
	lib/reconstruct.lua:227: in function 'scale_f'
	waifu2x.lua:99: in function 'convert_image'
	waifu2x.lua:293: in function 'waifu2x'
	waifu2x.lua:298: in main chunk
	[C]: in function 'dofile'
	...lice/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:145: in main chunk
	[C]: at 0x561c5ce0d470
alice@alice-pc:~/waifu2x$ 



alice@alice-pc:~/waifu2x$ th waifu2x.lua -gpu 3 -force_cudnn 1
THCudaCheck FAIL file=/tmp/luarocks_cutorch-scm-1-3035/cutorch/init.c line=722 error=10 : invalid device ordinal
/home/alice/torch/install/bin/luajit: waifu2x.lua:279: cuda runtime error (10) : invalid device ordinal at /tmp/luarocks_cutorch-scm-1-3035/cutorch/init.c:722
stack traceback:
	[C]: in function 'setDevice'
	waifu2x.lua:279: in function 'waifu2x'
	waifu2x.lua:298: in main chunk
	[C]: in function 'dofile'
	...lice/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:145: in main chunk
	[C]: at 0x55a524e95470
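
Reading the four runs above together: `-gpu 0` and `-gpu 3` fail in `setDevice` with "invalid device ordinal", while `-gpu 1` and `-gpu 2` get past device selection and only fail later in `LeakyReLU.lua`. That is the pattern one would expect if `-gpu` is 1-based on this two-GPU machine, whereas `nvidia-smi` numbers the same cards 0 and 1. A minimal sketch of the conversion follows. `smi_to_gpu_flag` is a hypothetical helper, not part of waifu2x, and the mapping assumes that CUDA enumerates the cards in the same order as `nvidia-smi`, which is not guaranteed by default.

```shell
# Hypothetical helper, not part of waifu2x: turn a 0-based nvidia-smi index
# into the 1-based value that the -gpu flag appears to expect.
# Assumption: CUDA lists the GPUs in the same order as nvidia-smi
# (for example because CUDA_DEVICE_ORDER=PCI_BUS_ID is exported).
smi_to_gpu_flag() {
  echo $(( $1 + 1 ))
}

# The GTX 1080 is index 1 in the nvidia-smi output above.
gpu_flag=$(smi_to_gpu_flag 1)

# Print the resulting command instead of running it.
echo "th waifu2x.lua -gpu $gpu_flag"
```

The sketch only prints the command, so it can be tried on a machine without Torch installed.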

AskAlice avatar Mar 31 '17 20:03 AskAlice

To be fair, this doesn't seem like an issue on your end at all. Another post on GitHub mentioned:

I also had this issue and then another one, and both were solved by reinstalling torch from their git. But that alone was not enough: I had to clear all packages in my ~/.luarocks before reinstalling. Check also: #164 (comment)
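
The quoted fix amounts to two steps: clear the per-user LuaRocks tree, then rebuild Torch from its repository. A rough sketch is below; it only prints the commands unless `DRY_RUN=0` is set. It assumes Torch was installed from the torch/distro repository into `~/torch` (the path seen in the traces above) and that the checkout still contains that repository's `clean.sh` and `update.sh` scripts. Moving `~/.luarocks` aside instead of deleting it is an added precaution, not something the quoted post describes, and the `run` helper is hypothetical.

```shell
# Sketch only: prints the commands unless DRY_RUN=0 is set.
# Assumptions: Torch lives in ~/torch and was installed from torch/distro,
# so clean.sh and update.sh are present in that directory.
DRY_RUN="${DRY_RUN:-1}"
TORCH_DIR="${TORCH_DIR:-$HOME/torch}"

# Hypothetical helper: echo the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# 1. Move the per-user LuaRocks tree out of the way (rather than deleting it).
run mv "$HOME/.luarocks" "$HOME/.luarocks.bak"

# 2. Clean and rebuild Torch from its git checkout.
run sh -c "cd '$TORCH_DIR' && ./clean.sh && ./update.sh"
```

Reviewing the printed commands first, then rerunning with `DRY_RUN=0`, keeps the destructive part deliberate.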

AskAlice avatar Mar 31 '17 20:03 AskAlice

Hi there, you can select a GPU with CUDA_VISIBLE_DEVICES=0, where 0 is your device ID.
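
For illustration, here is a sketch of how that could look for the machine in this thread. `CUDA_VISIBLE_DEVICES` is read by the CUDA runtime and hides every other card from the process, so the one remaining card becomes the first device. By default CUDA does not necessarily number devices the way `nvidia-smi` does, so the sketch also sets `CUDA_DEVICE_ORDER=PCI_BUS_ID` to make the ID match the `nvidia-smi` index. `gpu_env` is a hypothetical helper, and the command is only printed, not run.

```shell
# Hypothetical helper: build the environment prefix that exposes exactly one
# GPU, identified by its 0-based nvidia-smi index, to a CUDA program.
gpu_env() {
  printf 'CUDA_DEVICE_ORDER=PCI_BUS_ID CUDA_VISIBLE_DEVICES=%s' "$1"
}

# Expose only the GTX 1080 (index 1 in the nvidia-smi output above).
# Inside the process it is then the only visible CUDA device.
echo "$(gpu_env 1) th waifu2x.lua -force_cudnn 1"
```

Because the variables are set only as a prefix to a single command, other programs on the machine still see both cards.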

TheDeadCode avatar Aug 11 '18 15:08 TheDeadCode