caffe-segnet
Out of memory error!!
I am trying to train the CamSeq database using SegNet. I was able to prepare the dataset, but when I run training, I get an out-of-memory error. I am using an NVIDIA GTX 960 with 4 GB. I have even reduced the batch size to 1 in both train and test, but I'm still getting the error:
check failed: error == cudaSuccess (2 vs. 0) out of memory.
Have you followed the tutorial? I think you should be able to train both SegNet and SegNet-Basic with 4GB of memory: http://mi.eng.cam.ac.uk/projects/segnet/tutorial.html
@hoticevijay Try to down-sample the input ...
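Down-sampling the input just means feeding smaller images so every feature map in the network shrinks. A minimal pure-Python sketch of nearest-neighbour subsampling to illustrate the idea (in practice you would resize the actual CamVid images with OpenCV or PIL; this toy version is not from the SegNet scripts):

```python
def downsample(img, factor=2):
    """Naive nearest-neighbour downsample: keep every `factor`-th pixel.

    `img` is a 2-D list of pixel values (rows x cols).
    """
    return [row[::factor] for row in img[::factor]]

# Example: a 4x4 "image" reduced to 2x2.
img = [[r * 4 + c for c in range(4)] for r in range(4)]
small = downsample(img)
print(small)  # [[0, 2], [8, 10]]
```

Halving height and width roughly quarters the activation memory of each layer, which is why this is the first thing to try on a small GPU.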
I'm trying to run ./caffe-segnet/build/tools/caffe train -gpu 0 -solver SegNet-Tutorial/Models/segnet_solver.prototxt
on a GeForce GTX 460. I reduced batch_size to 1.
It seems to be the same error; is it because I'm running out of memory?
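For reference, in the SegNet-Tutorial prototxts the batch size is set in the DenseImageData layer at the top of the net definition; a minimal sketch (the source path here is a placeholder):

```protobuf
layer {
  name: "data"
  type: "DenseImageData"
  top: "data"
  top: "label"
  dense_image_data_param {
    source: "/path/to/train.txt"   # placeholder: list of image/label pairs
    batch_size: 1                  # reduce this to lower memory use
    shuffle: true
  }
}
```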
I0225 23:41:36.637146 11302 net.cpp:247] Network initialization done.
I0225 23:41:36.637159 11302 net.cpp:248] Memory required for data: 1083110452
I0225 23:41:36.637718 11302 solver.cpp:42] Solver scaffolding done.
I0225 23:41:36.637940 11302 solver.cpp:250] Solving VGG_ILSVRC_16_layer
I0225 23:41:36.637953 11302 solver.cpp:251] Learning Rate Policy: step
F0225 23:41:36.745960 11302 syncedmem.cpp:51] Check failed: error == cudaSuccess (2 vs. 0) out of memory
*** Check failure stack trace: ***
@ 0x7f3281712daa (unknown)
@ 0x7f3281712ce4 (unknown)
@ 0x7f32817126e6 (unknown)
@ 0x7f3281715687 (unknown)
@ 0x7f3281b827db caffe::SyncedMemory::mutable_gpu_data()
@ 0x7f3281b6a542 caffe::Blob<>::mutable_gpu_data()
@ 0x7f3281b91364 caffe::ConvolutionLayer<>::Forward_gpu()
@ 0x7f3281a72279 caffe::Net<>::ForwardFromTo()
@ 0x7f3281a726a7 caffe::Net<>::ForwardPrefilled()
@ 0x7f3281b45a55 caffe::Solver<>::Step()
@ 0x7f3281b4638f caffe::Solver<>::Solve()
@ 0x406676 train()
@ 0x404bb1 main
@ 0x7f3280c24ec5 (unknown)
@ 0x40515d (unknown)
@ (nil) (unknown)
Aborted (core dumped)
It's strange, but it fails even in CPU mode:
I0225 23:58:24.054597 12022 net.cpp:247] Network initialization done.
I0225 23:58:24.054610 12022 net.cpp:248] Memory required for data: 1083110452
I0225 23:58:24.055112 12022 solver.cpp:42] Solver scaffolding done.
I0225 23:58:24.055332 12022 solver.cpp:250] Solving VGG_ILSVRC_16_layer
I0225 23:58:24.055346 12022 solver.cpp:251] Learning Rate Policy: step
F0225 23:59:00.052022 12022 upsample_layer.cpp:127] upsample top index 0 out of range - check scale settings match input pooling layer's downsample setup
*** Check failure stack trace: ***
@ 0x7f7a1156cdaa (unknown)
@ 0x7f7a1156cce4 (unknown)
@ 0x7f7a1156c6e6 (unknown)
@ 0x7f7a1156f687 (unknown)
@ 0x7f7a11984169 caffe::UpsampleLayer<>::Backward_cpu()
@ 0x7f7a118cb67d caffe::Net<>::BackwardFromTo()
@ 0x7f7a118cb821 caffe::Net<>::Backward()
@ 0x7f7a1199fa5d caffe::Solver<>::Step()
@ 0x7f7a119a038f caffe::Solver<>::Solve()
@ 0x406676 train()
@ 0x404bb1 main
@ 0x7f7a10a7eec5 (unknown)
@ 0x40515d (unknown)
@ (nil) (unknown)
Aborted (core dumped)
Have you changed anything from the tutorial? Also are you able to test SegNet?
I changed only the batch size to reduce memory usage, as the tutorial suggests.
Where can I find an already trained model to test SegNet?
Also, I tried to downsample images to 240x180, but it seems that it's not so easy.
I0227 11:32:34.148586 13517 net.cpp:90] Creating Layer upsample4
I0227 11:32:34.148597 13517 net.cpp:410] upsample4 <- conv5_1_D
I0227 11:32:34.148608 13517 net.cpp:410] upsample4 <- pool4_mask
I0227 11:32:34.148624 13517 net.cpp:368] upsample4 -> pool4_D
I0227 11:32:34.148638 13517 net.cpp:120] Setting up upsample4
F0227 11:32:34.148663 13517 upsample_layer.cpp:63] Check failed: bottom[0]->height() == bottom[1]->height() (23 vs. 12)
*** Check failure stack trace: ***
@ 0x7f6774aeadaa (unknown)
@ 0x7f6774aeace4 (unknown)
@ 0x7f6774aea6e6 (unknown)
@ 0x7f6774aed687 (unknown)
@ 0x7f6774f01a88 caffe::UpsampleLayer<>::Reshape()
@ 0x7f6774e54502 caffe::Net<>::Init()
@ 0x7f6774e56262 caffe::Net<>::Net()
@ 0x7f6774f19f00 caffe::Solver<>::InitTrainNet()
@ 0x7f6774f1aed3 caffe::Solver<>::Init()
@ 0x7f6774f1b0a6 caffe::Solver<>::Solver()
@ 0x40c5d0 caffe::GetSolver<>()
@ 0x406611 train()
@ 0x404bb1 main
@ 0x7f6773ffcec5 (unknown)
@ 0x40515d (unknown)
@ (nil) (unknown)
It seems related to https://github.com/alexgkendall/caffe-segnet/issues/10
@mrgloom
You will need to change upsample_h in certain layers; the error above says that upsample_h at the upsample4 layer should be 23.
layer {
  name: "upsample4"
  type: "Upsample"
  bottom: "pool4"
  bottom: "pool4_mask"
  top: "upsample4"
  upsample_param {
    upsample_h: 23
  }
}
I tried to change
layer {
name: "upsample4"
type: "Upsample"
bottom: "conv5_1_D"
top: "pool4_D"
bottom: "pool4_mask"
upsample_param {
scale: 2
upsample_w: 30 # 60; depends on image input size
upsample_h: 23 # 45
}
}
but I still get the same error (always 23 vs. 12):
F0303 00:56:11.244488 3618 upsample_layer.cpp:63] Check failed: bottom[0]->height() == bottom[1]->height() (23 vs. 12)
Also, my question is: why do some upsample layers have upsample_w and upsample_h, while others just have scale: 2?
It seems I now understand that I need to modify all upsample layers and specify upsample_w and upsample_h directly.
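The sizes can be derived rather than guessed: each 2x2, stride-2 pooling in Caffe produces ceil(H/2) rows, so for the tutorial's 360x480 input the encoder feature-map heights are 180, 90, 45, 23, 12, and each decoder upsample layer must restore the matching pre-pool size. A sketch of that calculation (assuming kernel 2, stride 2, no padding, as in the tutorial model):

```python
import math

def pooled_sizes(h, w, num_pools=5):
    """Feature-map sizes after each 2x2, stride-2 pooling layer.

    Caffe's pooling rounds up: out = ceil(in / 2) for kernel 2, stride 2,
    which is why odd sizes like 45 pool to 23 rather than 22.
    """
    sizes = []
    for _ in range(num_pools):
        h, w = math.ceil(h / 2), math.ceil(w / 2)
        sizes.append((h, w))
    return sizes

print(pooled_sizes(360, 480))
# [(180, 240), (90, 120), (45, 60), (23, 30), (12, 15)]
```

upsample4 has to produce the size *before* the last pooling, i.e. (23, 30), which is why upsample_h: 23 and upsample_w: 30 are needed there; a different input size (e.g. 240x180) changes every one of these values, so all upsample layers must be updated together.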
Here is my segnet_solver.prototxt:
https://gist.github.com/mrgloom/f0972272938adfc44163
./caffe-segnet/build/tools/caffe train -gpu 0 -solver SegNet-Tutorial/Models/segnet_solver.prototxt
but even with reduced image size it takes too much memory: http://pastebin.com/dkJTgXQu
Also, when I try to train my model on the CPU:
./caffe-segnet/build/tools/caffe train -solver SegNet-Tutorial/Models/segnet_solver.prototxt
http://pastebin.com/ZAANMbfp
I get the same error:
I0303 01:19:28.473911 3856 net.cpp:247] Network initialization done.
I0303 01:19:28.473920 3856 net.cpp:248] Memory required for data: 271680820
I0303 01:19:28.474397 3856 solver.cpp:42] Solver scaffolding done.
I0303 01:19:28.474613 3856 solver.cpp:250] Solving VGG_ILSVRC_16_layer
I0303 01:19:28.474623 3856 solver.cpp:251] Learning Rate Policy: step
F0303 01:19:34.556548 3856 upsample_layer.cpp:127] upsample top index 0 out of range - check scale settings match input pooling layer's downsample setup
Seems the same as here: https://github.com/alexgkendall/caffe-segnet/issues/5
At least I successfully ran the pretrained model segnet_basic_camvid.caffemodel from the model zoo, but only in CPU mode.
What I don't understand is why it consumes about ~5000 MB of RAM, while in the log Caffe says "Memory required for data: 410930228", which as I understand is about ~400 MB?
export PATH=$PATH:/home/myuser/Downloads/SegNet/caffe-segnet/build/tools
export PYTHONPATH=/home/myuser/Downloads/SegNet/caffe-segnet/python:$PYTHONPATH
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/cuda/lib64
python /home/myuser/Downloads/SegNet/SegNet-Tutorial/Scripts/compute_bn_statistics.py /home/myuser/Downloads/SegNet/SegNet-Tutorial/Models/segnet_basic_train.prototxt /home/myuser/Downloads/SegNet/SegNet-Tutorial/Models/Training/segnet_basic_camvid.caffemodel /home/myuser/Downloads/SegNet/Models/Inference/
batch_size: 1
Memory required for data: 410930228
htop RES column ~4230Mb
batch_size: 2
Memory required for data: 821856308
htop RES column ~4683Mb
batch_size: 4
Memory required for data: 1643708468
htop RES column ~5598Mb
batch_size: 8
Memory required for data: 3287412788
htop RES column ~6597Mb (after 1st iteration 7261Mb)
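The "Memory required for data" figure counts only the network's data blobs, which is why it scales almost exactly linearly with batch size; the ~4 GB baseline visible in htop presumably comes from weights, gradient buffers, and the process itself, which do not depend on batch size. A quick sanity check on the numbers above:

```python
# "Memory required for data" values logged for batch sizes 1, 2, 4, 8.
reported = {1: 410930228, 2: 821856308, 4: 1643708468, 8: 3287412788}

per_image = reported[1]
for bs, total in reported.items():
    # Blob memory grows (almost exactly) linearly with batch size; the
    # tiny remainder is from blobs whose size is batch-independent.
    assert abs(total - bs * per_image) / total < 0.01
print("data-blob memory scales linearly with batch size")
```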
Hi,
It is no secret that Caffe is generally not efficient in its memory usage; that is the price paid for speed. cuDNN could slightly improve the situation.
Best
Can you elaborate on "Caffe is generally not efficient in memory usage"? Is it related to the convolution implementation? http://caffe.berkeleyvision.org/tutorial/convolution.html
@mrgloom:
is it related to convolution implementation
It seems so, yes.
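For context: Caffe lowers convolution to a matrix multiply via im2col, which materialises every k x k input patch as a column of a large buffer, so the intermediate memory is roughly k^2 times the input activation size. A toy illustration of that blow-up (sizes chosen to resemble an early SegNet conv layer; this is an estimate of element counts, not of Caffe's exact allocation):

```python
def im2col_size(c, h, w, k):
    """Elements in the im2col buffer for a k x k, stride-1, 'same'-padded conv.

    Each output position gets its own copy of a c*k*k patch.
    """
    return (c * k * k) * (h * w)

c, h, w, k = 64, 360, 480, 3
blowup = im2col_size(c, h, w, k) / (c * h * w)
print(blowup)  # 9.0: the buffer is k^2 times the input feature map
```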
Hi @mrgloom, I also get the Check failed: bottom[0]->height() == bottom[1]->height() (23 vs. 16) error, and I used your suggested segnet_solver.prototxt (https://gist.github.com/mrgloom/f0972272938adfc44163), but the same error occurs. Any help is appreciated. Thanks!
I got the exact same error:
I0225 23:41:36.637146 11302 net.cpp:247] Network initialization done.
I0225 23:41:36.637159 11302 net.cpp:248] Memory required for data: 1083110452
I0225 23:41:36.637718 11302 solver.cpp:42] Solver scaffolding done.
I0225 23:41:36.637940 11302 solver.cpp:250] Solving VGG_ILSVRC_16_layer
I0225 23:41:36.637953 11302 solver.cpp:251] Learning Rate Policy: step
F0225 23:41:36.745960 11302 syncedmem.cpp:51] Check failed: error == cudaSuccess (2 vs. 0) out of memory
*** Check failure stack trace: ***
@ 0x7f3281712daa (unknown)
@ 0x7f3281712ce4 (unknown)
@ 0x7f32817126e6 (unknown)
@ 0x7f3281715687 (unknown)
@ 0x7f3281b827db caffe::SyncedMemory::mutable_gpu_data()
@ 0x7f3281b6a542 caffe::Blob<>::mutable_gpu_data()
@ 0x7f3281b91364 caffe::ConvolutionLayer<>::Forward_gpu()
@ 0x7f3281a72279 caffe::Net<>::ForwardFromTo()
@ 0x7f3281a726a7 caffe::Net<>::ForwardPrefilled()
@ 0x7f3281b45a55 caffe::Solver<>::Step()
@ 0x7f3281b4638f caffe::Solver<>::Solve()
@ 0x406676 train()
@ 0x404bb1 main
@ 0x7f3280c24ec5 (unknown)
@ 0x40515d (unknown)
@ (nil) (unknown)
Aborted (core dumped)
The solution was to install cuDNN v2 and enable it in the Makefile.config file.
Running on a GTX 980 Ti 6GB, Ubuntu 14.04, and CUDA version 7.5.
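For reference, enabling cuDNN in Caffe means uncommenting one flag in Makefile.config and rebuilding; a sketch (the rest of the file is unchanged):

```makefile
# In Makefile.config, uncomment:
USE_CUDNN := 1

# then rebuild, e.g.:
#   make clean && make all -j8 && make pycaffe
```

Note that this fork of caffe-segnet reportedly only works with older cuDNN versions (v2 in the comments here), so the cuDNN installed must match.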
The solution from @Seanberite works. I was running CUDA 8 without cuDNN, which would give the out-of-memory issue. I downgraded to CUDA 7.5 and cuDNN v2, and I can now train the CamVid demo without the memory error on a GTX 960 (4GB, batch size 1).
@mrgloom How did you resolve "upsample top index 0 out of range - check scale settings match input pooling layer's downsample setup" ? Thanks
Hi, in the beginning I got the exact same error reported here for the lack of memory, so I followed the advice I found here and rebuilt caffe-segnet with USE_CUDNN := 1, but now I get this error:
Setting up conv1_1 F0314 06:59:55.986902 10381 cudnn_conv_layer.cpp:53] Check failed: status == CUDNN_STATUS_SUCCESS (4 vs. 0) CUDNN_STATUS_INTERNAL_ERROR
Any idea?
P.S.: I tried with cuDNN v5 and CUDA 8.0 (I needed to use elements related to cuDNN from a newer Caffe repository). I downgraded cuDNN to v2 and get this:
I0314 08:21:15.397732 19896 net.cpp:247] Network initialization done.
I0314 08:21:15.397734 19896 net.cpp:248] Memory required for data: 555925776
I0314 08:21:15.397902 19896 solver.cpp:42] Solver scaffolding done.
I0314 08:21:15.398056 19896 solver.cpp:250] Solving VGG_ILSVRC_16_layer
I0314 08:21:15.398059 19896 solver.cpp:251] Learning Rate Policy: step
F0314 08:21:16.202524 19896 math_functions.cu:123] Check failed: status == CUBLAS_STATUS_SUCCESS (11 vs. 0) CUBLAS_STATUS_MAPPING_ERROR
I'm still on CUDA 8.0, though.
Is there an actual fix for this issue? I'm using Ubuntu 16.04 / NVIDIA GTX 1050 / CUDA 8.0 / no cuDNN.
Hi all, I need help! I get the same error with an NVIDIA GTX 960M with 4 GB of memory. I use CUDA 8.0 and batch_size = 1. When I test webcam_demo.py with the GPU, I get this error:
I1014 19:09:01.359390 5819 net.cpp:247] Network initialization done.
I1014 19:09:01.359392 5819 net.cpp:248] Memory required for data: 1065139200
(' Grabbed camera frame in ', '549.619197845', 'ms')
(' Resized image in ', '66.5330886841', 'ms')
F1014 19:09:03.429714 5819 syncedmem.cpp:51] Check failed: error == cudaSuccess (2 vs. 0) out of memory
*** Check failure stack trace: ***
Aborted (core dumped)
Could you help me, please?
I would recommend using Ubuntu 14.04 as a fix if anyone faces this issue (tell me if it works better for you). Otherwise, you can use ENet for segmentation.