3D-R2N2
Out of memory? GTX 780M - 17.04
Hey,
I'm trying to train your cool project with new images.
I tried the training example first, but got these errors.
Error allocating 33554432 bytes of device memory (out of memory). Driver report 17367040 bytes free and 4231200768 bytes total
Wait until the dataprocesses to end
Signal processes
Traceback (most recent call last):
File "/home/quinten/Documents/3D-R2N2/lib/train_net.py", line 21, in func_wrapper
return func(*args, **kwargs)
File "/home/quinten/Documents/3D-R2N2/lib/train_net.py", line 38, in train_net
net = NetClass()
File "/home/quinten/Documents/3D-R2N2/models/net.py", line 37, in __init__
self.setup()
File "/home/quinten/Documents/3D-R2N2/models/net.py", line 40, in setup
self.network_definition()
File "/home/quinten/Documents/3D-R2N2/models/res_gru_net.py", line 70, in network_definition
t_x_s_update = FCConv3DLayer(prev_s, fc7, (n_deconvfilter[0], n_deconvfilter[0], 3, 3, 3))
File "/home/quinten/Documents/3D-R2N2/lib/layers.py", line 478, in __init__
fan_out=self._output_shape[2])
File "/home/quinten/Documents/3D-R2N2/lib/layers.py", line 74, in __init__
self.val = theano.shared(value=self.np_values)
File "/home/quinten/Documents/3D-R2N2/py3/lib/python3.5/site-packages/theano/compile/sharedvalue.py", line 268, in shared
allow_downcast=allow_downcast, **kwargs)
File "/home/quinten/Documents/3D-R2N2/py3/lib/python3.5/site-packages/theano/sandbox/cuda/var.py", line 188, in float32_shared_constructor
deviceval = type_support_filter(value, type.broadcastable, False, None)
MemoryError: ('Error allocating 33554432 bytes of device memory (out of memory).', "you might consider using 'theano.shared(..., borrow=True)'")
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "main.py", line 118, in <module>
main()
File "main.py", line 109, in main
train_net()
File "/home/quinten/Documents/3D-R2N2/lib/train_net.py", line 24, in func_wrapper
kill_processes(train_queue, train_processes)
File "/home/quinten/Documents/3D-R2N2/lib/data_process.py", line 178, in kill_processes
for p in processes:
TypeError: 'NoneType' object is not iterable
[INFO/MainProcess] process shutting down
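A note on the second traceback above: the `TypeError` comes from `kill_processes` iterating over `processes` while it is `None`, presumably because the network constructor failed before any data workers were started. It is a side effect of the `MemoryError`, not a separate problem. A minimal sketch of a guard, assuming a hypothetical helper name `kill_processes_safely` and a stand-in `FakeWorker` class (neither is part of 3D-R2N2):

```python
class FakeWorker:
    """Stand-in for a multiprocessing.Process, used only for this demo."""

    def __init__(self):
        self.terminated = False

    def terminate(self):
        self.terminated = True


def kill_processes_safely(processes):
    """Hypothetical guard: terminate workers, tolerating processes=None.

    Returns the number of workers that were asked to terminate.
    """
    if processes is None:
        # Workers were never created, so there is nothing to clean up.
        return 0
    stopped = 0
    for p in processes:
        p.terminate()
        stopped += 1
    return stopped


if __name__ == "__main__":
    print(kill_processes_safely(None))                          # 0
    print(kill_processes_safely([FakeWorker(), FakeWorker()]))  # 2
```

With a guard like this the cleanup would no longer hide the original exception behind a second one.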
I have a GTX 780M; shouldn't this be enough to train the samples?
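For reference, the numbers in the error message can be converted to MiB with a short script. This is only a sketch; the function name `parse_cuda_oom` is made up for illustration, and the regular expression assumes the message format shown in the log above.

```python
import re

# Message copied from the log above.
MESSAGE = (
    "Error allocating 33554432 bytes of device memory (out of memory). "
    "Driver report 17367040 bytes free and 4231200768 bytes total"
)

MIB = 1024 ** 2


def parse_cuda_oom(message):
    """Return (requested, free, total) in bytes from the message above."""
    match = re.search(
        r"allocating (\d+) bytes.*?(\d+) bytes free and (\d+) bytes total",
        message,
    )
    if match is None:
        raise ValueError("unrecognised message format")
    return tuple(int(group) for group in match.groups())


requested, free, total = parse_cuda_oom(MESSAGE)

if __name__ == "__main__":
    print("requested: {:.1f} MiB".format(requested / MIB))  # requested: 32.0 MiB
    print("free:      {:.1f} MiB".format(free / MIB))       # free:      16.6 MiB
    print("total:     {:.1f} MiB".format(total / MIB))      # total:     4035.2 MiB
```

So the driver reported about 16.6 MiB free out of roughly 4035 MiB when the 32 MiB allocation was attempted, which means nearly all of the card's memory was already in use at that point.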
More of the same.
/home/quinten/Documents/3D-R2N2/lib/layers.py:354: UserWarning: DEPRECATION: the 'ds' parameter is not going to exist anymore as it is going to be replaced by the parameter 'ws'.
padding=self._padding)
/home/quinten/Documents/3D-R2N2/lib/layers.py:354: UserWarning: DEPRECATION: the 'padding' parameter is not going to exist anymore as it is going to be replaced by the parameter 'pad'.
padding=self._padding)
lib/data_io.py: model paths from ./experiments/dataset/shapenet_1000.json
[INFO/ReconstructionDataProcess-1] child process calling self.run()
lib/data_io.py: model paths from ./experiments/dataset/shapenet_1000.json
Set the learning rate to 0.000100.
[INFO/ReconstructionDataProcess-2] child process calling self.run()
Compiling training function
2017-11-19 10:48:51.435496 Iter: 0 Loss: 0.407328
Compiling testing function
Problem occurred during compilation with the command line below:
/usr/bin/g++ -shared -g -O3 -fno-math-errno -Wno-unused-label -Wno-unused-variable -Wno-write-strings -march=haswell -mmmx -mno-3dnow -msse -msse2 -msse3 -mssse3 -mno-sse4a -mcx16 -msahf -mmovbe -maes -mno-sha -mpclmul -mpopcnt -mabm -mno-lwp -mfma -mno-fma4 -mno-xop -mbmi -mbmi2 -mno-tbm -mavx -mavx2 -msse4.2 -msse4.1 -mlzcnt -mno-rtm -mno-hle -mrdrnd -mf16c -mfsgsbase -mno-rdseed -mno-prfchw -mno-adx -mfxsr -mxsave -mxsaveopt -mno-avx512f -mno-avx512er -mno-avx512cd -mno-avx512pf -mno-prefetchwt1 -mno-clflushopt -mno-xsavec -mno-xsaves -mno-avx512dq -mno-avx512bw -mno-avx512vl -mno-avx512ifma -mno-avx512vbmi -mno-clwb -mno-mwaitx -mno-clzero -mno-pku --param l1-cache-size=32 --param l1-cache-line-size=64 --param l2-cache-size=8192 -mtune=haswell -DNPY_NO_DEPRECATED_API=NPY_1_7_API_VERSION -m64 -fPIC -I/home/quinten/Documents/3D-R2N2/py3/lib/python3.5/site-packages/numpy/core/include -I/usr/include/python3.5m -I/home/quinten/Documents/3D-R2N2/py3/include/python3.5m -I/home/quinten/Documents/3D-R2N2/py3/lib/python3.5/site-packages/theano/gof -L/usr/lib -fvisibility=hidden -o /home/quinten/.theano/compiledir_Linux-4.10--generic-x86_64-with-Ubuntu-17.04-zesty-x86_64-3.5.3-64/tmpvcuds4gb/m9124b60ae7623786b4d02a7f8ac06738.so /home/quinten/.theano/compiledir_Linux-4.10--generic-x86_64-with-Ubuntu-17.04-zesty-x86_64-3.5.3-64/tmpvcuds4gb/mod.cpp -lpython3.5m
ERROR (theano.gof.cmodule): [Errno 12] Cannot allocate memory
Wait until the dataprocesses to end
Signal processes
Empty queue
kill processes
Signal processes
Empty queue
kill processes
Traceback (most recent call last):
File "main.py", line 118, in <module>
main()
File "main.py", line 109, in main
train_net()
File "/home/quinten/Documents/3D-R2N2/lib/train_net.py", line 21, in func_wrapper
return func(*args, **kwargs)
File "/home/quinten/Documents/3D-R2N2/lib/train_net.py", line 71, in train_net
solver.train(train_queue, val_queue)
File "/home/quinten/Documents/3D-R2N2/lib/solver.py", line 170, in train
_, val_loss, _ = self.test_output(batch_img, batch_voxel)
File "/home/quinten/Documents/3D-R2N2/lib/solver.py", line 218, in test_output
*self.net.activations])
File "/home/quinten/Documents/3D-R2N2/py3/lib/python3.5/site-packages/theano/compile/function.py", line 326, in function
output_keys=output_keys)
File "/home/quinten/Documents/3D-R2N2/py3/lib/python3.5/site-packages/theano/compile/pfunc.py", line 486, in pfunc
output_keys=output_keys)
File "/home/quinten/Documents/3D-R2N2/py3/lib/python3.5/site-packages/theano/compile/function_module.py", line 1795, in orig_function
defaults)
File "/home/quinten/Documents/3D-R2N2/py3/lib/python3.5/site-packages/theano/compile/function_module.py", line 1661, in create
input_storage=input_storage_lists, storage_map=storage_map)
File "/home/quinten/Documents/3D-R2N2/py3/lib/python3.5/site-packages/theano/gof/link.py", line 699, in make_thunk
storage_map=storage_map)[:3]
File "/home/quinten/Documents/3D-R2N2/py3/lib/python3.5/site-packages/theano/gof/vm.py", line 1047, in make_all
impl=impl))
File "/home/quinten/Documents/3D-R2N2/py3/lib/python3.5/site-packages/theano/gof/op.py", line 935, in make_thunk
no_recycling)
File "/home/quinten/Documents/3D-R2N2/py3/lib/python3.5/site-packages/theano/gof/op.py", line 839, in make_c_thunk
output_storage=node_output_storage)
File "/home/quinten/Documents/3D-R2N2/py3/lib/python3.5/site-packages/theano/gof/cc.py", line 1190, in make_thunk
keep_lock=keep_lock)
File "/home/quinten/Documents/3D-R2N2/py3/lib/python3.5/site-packages/theano/gof/cc.py", line 1131, in __compile__
keep_lock=keep_lock)
File "/home/quinten/Documents/3D-R2N2/py3/lib/python3.5/site-packages/theano/gof/cc.py", line 1586, in cthunk_factory
key=key, lnk=self, keep_lock=keep_lock)
File "/home/quinten/Documents/3D-R2N2/py3/lib/python3.5/site-packages/theano/gof/cmodule.py", line 1159, in module_from_key
module = lnk.compile_cmodule(location)
File "/home/quinten/Documents/3D-R2N2/py3/lib/python3.5/site-packages/theano/gof/cc.py", line 1489, in compile_cmodule
preargs=preargs)
File "/home/quinten/Documents/3D-R2N2/py3/lib/python3.5/site-packages/theano/gof/cmodule.py", line 2294, in compile_str
p_out = output_subprocess_Popen(cmd)
File "/home/quinten/Documents/3D-R2N2/py3/lib/python3.5/site-packages/theano/misc/windows.py", line 77, in output_subprocess_Popen
p = subprocess_Popen(command, **params)
File "/home/quinten/Documents/3D-R2N2/py3/lib/python3.5/site-packages/theano/misc/windows.py", line 43, in subprocess_Popen
proc = subprocess.Popen(command, startupinfo=startupinfo, **params)
File "/usr/lib/python3.5/subprocess.py", line 676, in __init__
restore_signals, start_new_session)
File "/usr/lib/python3.5/subprocess.py", line 1221, in _execute_child
restore_signals, start_new_session, preexec_fn)
OSError: [Errno 12] Cannot allocate memory
[INFO/MainProcess] process shutting down
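Note that this second failure looks different from the first one: `[Errno 12]` is `ENOMEM`, and the traceback shows it raised from `subprocess.Popen` while Theano was launching `g++`, so it appears to concern host RAM (and swap) rather than GPU memory. A minimal sketch for checking host memory on Linux before starting training; the helper names and the sample values are made up for illustration, only the `/proc/meminfo` field names are real.

```python
import os

# Made-up sample in /proc/meminfo format, used when the real file is absent.
SAMPLE = """MemTotal:       16314184 kB
MemAvailable:     412300 kB
SwapTotal:             0 kB
SwapFree:              0 kB
"""


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer kB values."""
    info = {}
    for line in text.splitlines():
        key, _, rest = line.partition(":")
        fields = rest.split()
        if fields and fields[0].isdigit():
            info[key.strip()] = int(fields[0])
    return info


def available_mib(info):
    """Available RAM plus free swap, in MiB."""
    return (info.get("MemAvailable", 0) + info.get("SwapFree", 0)) / 1024


def read_meminfo(path="/proc/meminfo"):
    """Read the real file on Linux; return None on other systems."""
    if not os.path.exists(path):
        return None
    with open(path) as handle:
        return parse_meminfo(handle.read())


if __name__ == "__main__":
    info = read_meminfo() or parse_meminfo(SAMPLE)
    print("available RAM + free swap: {:.0f} MiB".format(available_mib(info)))
```

If this value is small, closing other applications or adding swap may be enough for the compiler subprocess to start, though that is an assumption rather than something confirmed in this thread.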
I've changed the Theano flags for CUDA and shut down all other software (Google Chrome) to run it, and now I'm getting this.
Compiling testing function
2017-11-19 12:03:16.399917 Test loss: 0.665769
param 0 : 0.114151
param 1 : 0.100100
param 2 : 0.222605
param 3 : 0.100100
param 4 : 0.199780
param 5 : 0.100100
param 6 : 0.182799
param 7 : 0.100100
param 8 : 0.579801
param 9 : 0.100100
param 10 : 0.156189
param 11 : 0.100100
param 12 : 0.134776
param 13 : 0.100100
param 14 : 0.475140
param 15 : 0.100100
param 16 : 0.142268
param 17 : 0.100100
param 18 : 0.136335
param 19 : 0.100100
param 20 : 0.132680
param 21 : 0.100100
param 22 : 0.136117
param 23 : 0.100100
param 24 : 0.371115
param 25 : 0.100100
param 26 : 0.139044
param 27 : 0.100100
param 28 : 0.146845
param 29 : 0.100100
param 30 : 0.166665
param 31 : 0.100100
param 32 : 0.110215
param 33 : 0.110708
param 34 : 0.100100
param 35 : 0.117502
param 36 : 0.111097
param 37 : 0.100000
param 38 : 0.111521
param 39 : 0.108941
param 40 : 0.100100
param 41 : 0.112705
param 42 : 0.100100
param 43 : 0.121206
param 44 : 0.100100
param 45 : 0.109536
param 46 : 0.100100
param 47 : 0.120517
param 48 : 0.100100
param 49 : 0.124972
param 50 : 0.100100
param 51 : 0.159994
param 52 : 0.100100
param 53 : 0.730088
param 54 : 0.100100
param 55 : 0.170752
param 56 : 0.100100
param 57 : 0.202829
param 58 : 0.100100
param 59 : 0.194471
param 60 : 0.100100
param 61 : 0.231402
param 62 : 0.100100
Wait until the dataprocesses to end
Signal processes
Empty queue
kill processes
Signal processes
Empty queue
Which I guess means it is working. It now stays 'stuck' at 'Empty queue', though; maybe this just takes a long while.
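For anyone trying to reproduce the flag change mentioned above: the exact flags used are not stated in this thread, so the following is only a sketch of how Theano flags can be set from Python. The values are illustrative assumptions. `device`, `floatX` and `lib.cnmem` are options of the old `theano.sandbox.cuda` backend that appears in the tracebacks, and the variable has to be set before `import theano` because Theano reads it at import time.

```python
import os

# Illustrative values only; the flags actually used in this thread are unknown.
FLAGS = [
    ("device", "gpu"),       # run on the GPU (old sandbox.cuda backend)
    ("floatX", "float32"),   # use 32-bit floats on the GPU
    ("lib.cnmem", "0.8"),    # assumption: let CNMeM reserve 80% of GPU memory
]


def build_theano_flags(pairs):
    """Join (name, value) pairs into a THEANO_FLAGS string."""
    return ",".join("{}={}".format(name, value) for name, value in pairs)


# Must happen before `import theano` anywhere in the process.
os.environ["THEANO_FLAGS"] = build_theano_flags(FLAGS)

if __name__ == "__main__":
    print(os.environ["THEANO_FLAGS"])  # device=gpu,floatX=float32,lib.cnmem=0.8
```

The same string can be passed on the command line as `THEANO_FLAGS=... python main.py`; whether these particular values help with the out-of-memory error is untested here.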
Hi, I also have this problem. Can you tell me how you solved this issue?