
RuntimeError: CUDA error: out of memory

HLH13297997663 opened this issue 4 years ago · 12 comments

I use just one image for testing, but I get the error below:

File "test.py", line 203, in main(cfg, args.gpu) File "test.py", line 129, in main test(segmentation_module, loader_test, gpu) File "test.py", line 79, in test scores = scores + pred_tmp / len(cfg.DATASET.imgSizes) RuntimeError: CUDA error: out of memory

HLH13297997663 avatar Jun 02 '20 09:06 HLH13297997663

Hi @HLH13297997663, did you manage to solve this error? Could you please help me with it?

GutlapalliNikhil avatar Oct 20 '20 02:10 GutlapalliNikhil

I have the same problem and cannot manage to solve it. (See error below). Can anyone help with this? Many thanks in advance.

samples: 1

    0%|          | 0/1 [00:00<?, ?it/s]
    Traceback (most recent call last):
      File "test.py", line 216, in <module>
        main(cfg, args.gpu)
      File "test.py", line 142, in main
        res = test(segmentation_module, loader_test, gpu, using_in_memory)
      File "test.py", line 70, in test
        scores = async_copy_to(scores, gpu)
      File "data_parallel.py", line 15, in async_copy_to
        v = obj.cuda(dev, non_blocking=True)
    RuntimeError: CUDA out of memory. Tried to allocate 13.41 GiB (GPU 0; 6.00 GiB total capacity; 255.42 MiB already allocated; 4.52 GiB free; 286.00 MiB reserved in total by PyTorch)
    0%|          | 0/1 [00:10<?, ?it/s]

Process finished with exit code 1

mdanner93 avatar Dec 17 '20 09:12 mdanner93

Hi @mdanner93 ,

In the config yaml file, what resolution did you set for testing?

Try reducing the resolution and running the same code again. It should work.
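
For reference, a minimal sketch of the kind of change meant here. The values are just illustrative and the imgMaxSize key is my assumption about this repo's config schema; cfg.DATASET.imgSizes is the field that appears in the traceback above:

    DATASET:
      # fewer / smaller multi-scale test sizes
      imgSizes: (300, 400)
      # lower cap on the longer image side (assumed key)
      imgMaxSize: 600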

GutlapalliNikhil avatar Dec 17 '20 11:12 GutlapalliNikhil

HI @GutlapalliNikhil,

thank you for your reply. That's a good hint, I will definitely try it out. I wanted to try something similar in the beginning, but couldn't believe that a single image would take up that much memory, since it already gets resized on its way through the pipeline. By "resolution you mentioned for test" in the yaml file, you are referring to the resolution set for the Dataset, I guess?

mdanner93 avatar Dec 18 '20 09:12 mdanner93

OK, problem solved, I got it to work. Thanks for your help!

mdanner93 avatar Dec 18 '20 09:12 mdanner93

How did you get it to work?

ghost avatar Jan 06 '21 11:01 ghost

@amitk000, try reducing the batch size, the training resolution, and the test resolution.
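
A hedged sketch of the batch-size side of this (TRAIN.batch_size_per_gpu is my assumption about the repo's config layout; for the resolution fields, see the DATASET sketch earlier in the thread):

    TRAIN:
      # illustrative value; smaller batches need less GPU memory
      batch_size_per_gpu: 1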

GutlapalliNikhil avatar Jan 06 '21 14:01 GutlapalliNikhil

@HLH13297997663 @GutlapalliNikhil @mdanner93 @amitk000 @quantombone

Hi guys,

I have had the same problem. If you don't want to reduce the resolution or the batch size, what I did was keep scores on the CPU and move pred_tmp from GPU to CPU before scores = scores + pred_tmp / len(cfg.DATASET.imgSizes). Here is that part of the code:

    with torch.no_grad():
        scores = torch.zeros(1, cfg.DATASET.num_class, segSize[0], segSize[1])
        # scores = async_copy_to(scores, gpu)  # comment this to avoid cuda out of memory

        for img in img_resized_list:
            feed_dict = batch_data.copy()
            feed_dict['img_data'] = img
            del feed_dict['img_ori']
            del feed_dict['info']
            feed_dict = async_copy_to(feed_dict, gpu)

            # forward pass
            pred_tmp = segmentation_module(feed_dict, segSize=segSize)

            # -- add this to avoid cuda out of memory
            pred_tmp = pred_tmp.cpu()
            # --

            scores = scores + pred_tmp / len(cfg.DATASET.imgSizes)

Since the forward pass is still done with feed_dict on the GPU, I don't think this causes any speed issues.

Regards, AnaVC

avillalbacantero avatar Jan 22 '21 16:01 avillalbacantero

I met the same problem during testing. The following script solved it for me (it runs test.py on each test image separately):

    #!/bin/bash

    # Image and model names
    MODEL_PATH=ade20k-resnet50dilated-ppm_deepsup
    RESULT_PATH=./111/
    #TEST_IMG=./semantic_test_image/

    ENCODER=$MODEL_PATH/encoder_epoch_20.pth
    DECODER=$MODEL_PATH/decoder_epoch_20.pth

    # Download model weights and image
    if [ ! -e $MODEL_PATH ]; then
      mkdir $MODEL_PATH
    fi
    if [ ! -e $ENCODER ]; then
      wget -P $MODEL_PATH http://sceneparsing.csail.mit.edu/model/pytorch/$ENCODER
    fi
    if [ ! -e $DECODER ]; then
      wget -P $MODEL_PATH http://sceneparsing.csail.mit.edu/model/pytorch/$DECODER
    fi
    if [ ! -e $TEST_IMG ]; then
      wget -P $RESULT_PATH http://sceneparsing.csail.mit.edu/data/ADEChallengeData2016/images/validation/$TEST_IMG
    fi

    # Run test.py once per image in the test folder
    dir=`ls ./semantic_test_image/`
    #FDIR=./semantic_test_image/
    for i in $dir
    do
      echo "-------------------------------------------"
      echo $FDIR$i
      python3 -u test.py \
        --imgs 'semantic_test_image/'$FDIR$i \
        --cfg config/ade20k-resnet50dilated-ppm_deepsup.yaml \
        DIR $MODEL_PATH \
        TEST.result ./lsun_seg/ \
        TEST.checkpoint epoch_20.pth
    done

linqinxin-11 avatar Jan 27 '21 15:01 linqinxin-11

Thank you @avillalbacantero. Works like a charm!

pranay-ar avatar Mar 16 '22 14:03 pranay-ar