train-CRF-RNN
How to reduce GPU memory usage
Hi all, my graphics card is a GTX 660 with 2 GB of memory, so when I run your example it fails with: Check failed: error == cudaSuccess (2 vs. 0) out of memory. I have reduced the training images to 256 px, but it still does not work. Can you tell me how to reduce the GPU memory usage? Thanks
Hi wuhang,
2 GB isn't really enough to hold the whole network, so you should keep decreasing the image dimensions, though I am not sure how far they would have to be reduced. To the best of my knowledge, there isn't any other easy way to use the network (CRF-RNN) if you don't have enough memory on your GPU.
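As a rough illustration of why shrinking the images helps: the memory needed for one Caffe blob grows with height × width, so halving each side cuts it to about a quarter. The sketch below is only a back-of-the-envelope estimate. The function name `blob_bytes` and the example shape (21 channels, 500 px sides) are assumptions for illustration, and it ignores the weights, temporary buffers and whatever the CRF-RNN layer keeps internally.

```python
def blob_bytes(n, c, h, w, bytes_per_value=4, with_diff=True):
    """Approximate bytes for one float32 blob of shape n x c x h x w.

    with_diff=True also counts a gradient buffer of the same size,
    which is what training typically needs on top of the data buffer.
    """
    values = n * c * h * w
    copies = 2 if with_diff else 1
    return values * bytes_per_value * copies


# Hypothetical example shape: batch of 1, 21 channels, square images.
full = blob_bytes(1, 21, 500, 500)
half = blob_bytes(1, 21, 250, 250)

print("500 px blob: %.1f MiB" % (full / 2.0**20))
print("250 px blob: %.1f MiB" % (half / 2.0**20))
print("ratio: %.1f" % (full / float(half)))
```

A real network holds many such blobs at once, so the total is far larger than a single estimate, but the scaling with image size is the same.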
Cheers,
Martin
@martinkersner thanks, I think I need a Titan X
I got a similar error with a 4 GB GPU, and I wonder if it is due to my Caffe installation. I have the Caffe bittnt version and Cuda v2. When I turned on the GPU, I got the following error, while CPU mode ran without any problem.
[libprotobuf WARNING google/protobuf/io/coded_stream.cc:78] The total number of bytes read was 537968303
I0612 03:49:53.658771 66 upgrade_proto.cpp:620] Attempting to upgrade input file specified using deprecated V1LayerParameter: TVG_CRFRNN_COCO_VOC.caffemodel
I0612 03:49:54.213007 66 upgrade_proto.cpp:628] Successfully upgraded file specified using deprecated V1LayerParameter
F0612 03:49:54.858664 66 syncedmem.cpp:58] Check failed: error == cudaSuccess (2 vs. 0) out of memory
*** Check failure stack trace: ***
Aborted
Any help please?
Hi @martinkersner @baoqiangcao, I also get the same problem. My GPU is a GeForce GTX Titan Black with 6 GB, and it seems not to be sufficient either.
I got an error when I ran "python solve.py 2>&1 | tee train.log":
[libprotobuf ERROR google/protobuf/text_format.cc:274] Error parsing text-format caffe.NetParameter: 5:9: Expected string.
F0908 09:40:32.894371 23828 upgrade_proto.cpp:932] Check failed: ReadProtoFromTextFile(param_file, param) Failed to parse NetParameter file: TVG_CRFRNN_COCO_VOC_TRAIN_3_CLASSES.prototxt
Why? I built my Caffe just last week, so the protobuf should be the newest. Thanks for any help!
Hi @windforever118,
You are using a prototxt file that was made for an old version of CRF as RNN, and because you didn't post the whole error output here, I cannot point you to an exact answer. However, you are not the first to run into this problem, so please take a look at the other issues and you will find a solution.
Cheers,
Martin
I got the same error. I use a Titan X. The DB is made from PASCAL VOC 2012 and consists of 2630 training images and 283 test images with 20 classes.
I used a batch_size of 1 and made solver.prototxt with test_iter = 283 and test_interval = 1333, 2630 or 1000. But the training process always breaks at iteration 1 * test_interval with Check failed: error == cudaSuccess (2 vs. 0) out of memory. How should I set test_iter and test_interval? Thank you for any answer!
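For reference, the usual Caffe convention is that one test pass runs `test_iter` forward passes of the test net, so `test_iter` times the test batch size should cover the test set, while `test_interval` is the number of training iterations between test passes. A minimal sketch of that arithmetic, assuming a test batch size of 1 (the post only states the training batch size) and one test pass per epoch; the helper name `solver_test_settings` is hypothetical:

```python
import math


def solver_test_settings(num_train, num_test, train_batch=1, test_batch=1):
    """Return (test_iter, test_interval) for one full test pass per epoch."""
    # Enough test iterations to see every test image once.
    test_iter = int(math.ceil(num_test / float(test_batch)))
    # Number of training iterations in one pass over the training set.
    test_interval = int(math.ceil(num_train / float(train_batch)))
    return test_iter, test_interval


# Dataset sizes taken from the post above.
test_iter, test_interval = solver_test_settings(2630, 283)
print("test_iter: %d" % test_iter)
print("test_interval: %d" % test_interval)
```

As far as I know, these two values only control when testing happens and how long it runs, not how much memory a single forward pass needs, so changing them alone may not make the out of memory error go away.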