pix2pix
THCudaCheck FAIL: out of memory on train.lua and test.sh
New iMac with a fresh install, and I'm having some problems installing pix2pix / Torch (I had to downgrade the CLT to enable running install-deps and other commands). Just when I think it's going to work on a train command, I get this error, which I'm sure is GPU-related, and it appears in the ./test.sh output as well:
THCudaCheck FAIL file=/Users/ashleyjamesbrown/torch/extra/cutorch/lib/THC/generic/THCStorage.cu line=66 error=2 : out of memory
stack traceback:
[C]: in function 'resize'
...shleyjamesbrown/torch/install/share/lua/5.1/nn/utils.lua:11: in function 'torch_Storage_type'
...shleyjamesbrown/torch/install/share/lua/5.1/nn/utils.lua:57: in function 'recursiveType'
...hleyjamesbrown/torch/install/share/lua/5.1/nn/Module.lua:160: in function 'type'
...mesbrown/torch/install/share/lua/5.1/nngraph/gmodule.lua:258: in function 'cuda'
train.lua:190: in main chunk
[C]: in function 'dofile'
...rown/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:150: in main chunk
[C]: at 0x0106db9330
I'm trying to train on just 10 images on the GPU, so I doubt it's actually running out of memory?
I have recently tried a clean and update in the torch directory.
I installed CUDA, successfully ran make on a few of the samples, and ran them fine, so I'm sure CUDA itself is installed OK.
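For reference, a quick way to see what Torch itself reports for the card (assuming cutorch is installed and the GT 750M is device 1):

```
# Ask cutorch how many CUDA devices Torch sees and how much memory is free on device 1
th -e "require 'cutorch'; print(cutorch.getDeviceCount()); print(cutorch.getMemoryUsage(1))"
```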
System: macOS 10.12.5, 3.1 GHz Core i7, GeForce GT 750M (1024 MB)
Xcode 8.3.3 with CLT installed and switched to 8.2
CUDA 8.0.83, GPU Driver Version: 10.17.5 (355.10.05.45f01)
Installed cuDNN 5.1 for CUDA 8 (cuDNN 6 is on the machine, but it didn't link, so it's in a backup folder)
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2016 NVIDIA Corporation
Built on Tue_Jan_10_13:22:46_CST_2017
Cuda compilation tools, release 8.0, V8.0.61
149 0 0xffffff7f83889000 0x2000 0x2000 com.nvidia.CUDA (1.1.0) DD792765-CA28-395A-8593-D6837F05C4FF <4 1>
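For completeness: the Torch cudnn bindings locate the library through the CUDNN_PATH environment variable, so the 5.1 dylib gets pointed at like this (the exact path below is just an example; use wherever your copy lives):

```
# Point the Torch cudnn bindings at the cuDNN 5.1 dylib (path is an assumption for this machine)
export CUDNN_PATH="/usr/local/cuda/lib/libcudnn.5.dylib"
```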
I've been through a lot of Google searching and tried various things, but it's not coming up trumps.
If this should be in the Torch issues instead, let me know and I'll remove / move it.
./deviceQuery Starting...
CUDA Device Query (Runtime API) version (CUDART static linking)
Detected 1 CUDA Capable device(s)
Device 0: "GeForce GT 750M"
CUDA Driver Version / Runtime Version 8.0 / 8.0
CUDA Capability Major/Minor version number: 3.0
Total amount of global memory: 1024 MBytes (1073283072 bytes)
( 2) Multiprocessors, (192) CUDA Cores/MP: 384 CUDA Cores
GPU Max Clock rate: 926 MHz (0.93 GHz)
Memory Clock rate: 2508 Mhz
Memory Bus Width: 128-bit
L2 Cache Size: 262144 bytes
Maximum Texture Dimension Size (x,y,z) 1D=(65536), 2D=(65536, 65536), 3D=(4096, 4096, 4096)
Maximum Layered 1D Texture Size, (num) layers 1D=(16384), 2048 layers
Maximum Layered 2D Texture Size, (num) layers 2D=(16384, 16384), 2048 layers
Total amount of constant memory: 65536 bytes
Total amount of shared memory per block: 49152 bytes
Total number of registers available per block: 65536
Warp size: 32
Maximum number of threads per multiprocessor: 2048
Maximum number of threads per block: 1024
Max dimension size of a thread block (x,y,z): (1024, 1024, 64)
Max dimension size of a grid size (x,y,z): (2147483647, 65535, 65535)
Maximum memory pitch: 2147483647 bytes
Texture alignment: 512 bytes
Concurrent copy and kernel execution: Yes with 1 copy engine(s)
Run time limit on kernels: Yes
Integrated GPU sharing Host Memory: No
Support host page-locked memory mapping: Yes
Alignment requirement for Surfaces: Yes
Device has ECC support: Disabled
Device supports Unified Addressing (UVA): Yes
Device PCI Domain ID / Bus ID / location ID: 0 / 1 / 0
Compute Mode:
< Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >
deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 8.0, CUDA Runtime Version = 8.0, NumDevs = 1, Device0 = GeForce GT 750M
Result = PASS
[CUDA Bandwidth Test] - Starting...
Running on...
Device 0: GeForce GT 750M
Quick Mode
Host to Device Bandwidth, 1 Device(s)
PINNED Memory Transfers
Transfer Size (Bytes) Bandwidth(MB/s)
33554432 4997.3
Device to Host Bandwidth, 1 Device(s)
PINNED Memory Transfers
Transfer Size (Bytes) Bandwidth(MB/s)
33554432 9886.6
Device to Device Bandwidth, 1 Device(s)
PINNED Memory Transfers
Transfer Size (Bytes) Bandwidth(MB/s)
33554432 45863.0
Result = PASS
NOTE: The CUDA Samples are not meant for performance measurements. Results may vary when GPU Boost is enabled.
I believe that 1 GB of GPU memory might not be enough for training the model. You can try to train a model on low-resolution images (e.g. loadSize=143, fineSize=128).
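Options are passed to train.lua as environment variables, so a low-resolution run would look roughly like this (DATA_ROOT and name below are placeholders for your own dataset and experiment):

```
# Train at reduced resolution to fit in 1 GB of VRAM; dataset path and name are placeholders
DATA_ROOT=./datasets/facades name=facades_lowres loadSize=143 fineSize=128 th train.lua
```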
@junyanz When running ./test.sh from the torch directory, though, I get similar errors on certain models that it tries to test. Yeah, the CUDA examples all run fine, so does Torch just need more GPU memory then?
I'd be pretty sad if the new machine I just bought with an NVIDIA GPU wasn't good enough. I already had a Mac running the CPU commands, but that was taking days, so I purchased another, and it's no better.
I tried your suggestion, but this also failed, even with the size set as low as 10.
I wonder if it's because I have cuDNN 5.1? If I change up to cuDNN 6, I get no errors in ./test.sh, but I do get a binding error instead:
Found Environment variable CUDNN_PATH = /Users/ashleyjamesbrown/cuda6/lib/libcudnn.6.dylib
/Users/ashleyjamesbrown/torch/install/bin/luajit: ...hleyjamesbrown/torch/install/share/lua/5.1/cudnn/ffi.lua:1618: These bindings are for CUDNN 5.x (5005 <= cudnn.version > 6000), while the loaded CuDNN is version: 6021
Are you using an older or newer version of CuDNN?
stack traceback:
[C]: in function 'error'
...hleyjamesbrown/torch/install/share/lua/5.1/cudnn/ffi.lua:1618: in main chunk
[C]: in function 'require'
...leyjamesbrown/torch/install/share/lua/5.1/cudnn/init.lua:4: in main chunk
[C]: at 0x0101aaae10
[C]: at 0x0101a2e330
Thought I would update: I fixed the binding errors with the Torch fix from soumith, using the v6 branch (commands below). I still get GPU errors running the test.sh script inside torch, yet I can compile the CUDA examples and run them fine, and I cannot run the GPU for training. I ran a Geekbench check and it came up with 2 GPUs (Intel Iris Pro and NVIDIA GeForce), so I tried altering the gpu=1 line in case it was trying to select the Iris, but that gave me an error that there was no GPU.
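For anyone else hitting this, the binding fix was, as I understand it, rebuilding the cudnn bindings from the R6 branch of soumith's cudnn.torch, roughly:

```
# Rebuild the Torch cudnn bindings from the R6 branch so they match cuDNN 6
git clone -b R6 https://github.com/soumith/cudnn.torch.git
cd cudnn.torch
luarocks make cudnn-scm-1.rockspec
```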
In the end I stayed with CPU training, which took much, much longer. I'll come back to trying the GPU again in the future. I also just saw that CUDA 9 was released, so, as always with this stuff, things move on; I'll try again in a month or so.
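For reference, the CPU fallback is just the same train command with the GPU and cudnn backends disabled (dataset path and name are placeholders again):

```
# CPU-only training: gpu=0 disables CUDA and cudnn=0 disables the cudnn backend
DATA_ROOT=./datasets/facades name=facades_cpu gpu=0 cudnn=0 th train.lua
```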