
The SOLO framework seems to eat a lot of GPU memory

Open marearth opened this issue 3 years ago • 5 comments

When I run "python tools/train.py configs/solo/solo_r50_fpn_8gpu_1x.py" and reduce both images per GPU and workers per GPU to 1, I still get a "CUDA out of memory" error on a single 10.92 GB 1080 Ti GPU. It would be helpful if the GPU memory consumption of the different models were documented before others try to reproduce the experiments. Could you give some references on GPU memory consumption?

marearth avatar Mar 27 '21 07:03 marearth
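When the per-GPU batch size is already 1, the remaining lever is usually the input resolution. A hedged sketch of the relevant config overrides (key names follow mmdetection-style configs like the ones in this repo, but they vary by version, e.g. imgs_per_gpu vs. samples_per_gpu, so treat these as illustrative, not the repo's exact settings):

```python
# Hedged sketch, not copied from the repo's config: the two knobs that
# usually matter for "CUDA out of memory" in mmdetection-style configs.
data = dict(
    imgs_per_gpu=1,     # batch size per GPU (already 1 in the report above)
    workers_per_gpu=1,  # dataloader workers; affects CPU RAM, not GPU memory
)

# Activation memory scales roughly with pixel count, so once the batch is
# already 1, shrinking the training resolution saves far more GPU memory
# than anything else. (1333, 800) is a common default; the value below is
# just an example.
resize_step = dict(type='Resize', img_scale=(666, 400), keep_ratio=True)
```

Note that lowering workers_per_gpu does not reduce GPU memory at all; only the batch size and the image scale do.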

I've met the same problem:

[>>>>>>>>>>>>> ] 20/76, 0.3 task/s, elapsed: 59s, ETA: 165s Traceback (most recent call last): ... RuntimeError: CUDA out of memory. Tried to allocate 3.30 GiB (GPU 0; 8.00 GiB total capacity; 973.14 MiB already allocated; 2.13 GiB free; 3.74 GiB reserved in total by PyTorch)

Then I shrank my test set to 14 images; the same error occurred at [>> ] 2/14.

What should I do with that?

zhuaiyi avatar Jun 22 '21 08:06 zhuaiyi
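The progress bar in that traceback suggests the OOM happens during evaluation, not training. A common cause there is running the forward pass with autograd enabled, so every intermediate activation is kept for a backward pass that never comes. A minimal, hedged sketch in generic PyTorch (not SOLO's actual test loop, whose tools normally handle this already):

```python
import torch

def run_inference(model, images):
    """Run a model over a list of inputs without storing activations
    for backpropagation, which sharply reduces peak GPU memory."""
    model.eval()                 # disable dropout / batch-norm updates
    results = []
    with torch.no_grad():        # autograd keeps no intermediate buffers
        for img in images:
            results.append(model(img))
    return results
```

If memory still grows across iterations, keeping results on the GPU is another frequent culprit; moving each output to the CPU (result.cpu()) inside the loop avoids that.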

I solved the problem by using a GPU with more memory. The minimum memory per GPU seems to be 16 GB.

marearth avatar Jun 22 '21 08:06 marearth

I solved the problem by using a GPU with more memory. The minimum memory per GPU seems to be 16 GB.

Do you mean that you switched to a better GPU? That seems too expensive for me...

zhuaiyi avatar Jun 22 '21 09:06 zhuaiyi

@marearth SOLOv2 is much more GPU memory efficient. Please try SOLOv2 instead.

WXinlong avatar Jun 22 '21 09:06 WXinlong

@WXinlong I am getting the same issue for SOLOv2 with R50_3x as well. RuntimeError: CUDA out of memory. Tried to allocate 1.00 GiB (GPU 0; 11.17 GiB total capacity; 9.88 GiB already allocated; 503.81 MiB free; 10.23 GiB reserved in total by PyTorch)

avinash-asink avatar Sep 02 '21 17:09 avinash-asink
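Reading the numbers in that error message shows why the allocation fails even though PyTorch is holding some reclaimable cache. A small worked example using the figures quoted above (arithmetic only, not a fix):

```python
# Numbers taken from the OOM message above:
# "Tried to allocate 1.00 GiB (GPU 0; 11.17 GiB total capacity;
#  9.88 GiB already allocated; 503.81 MiB free; 10.23 GiB reserved
#  in total by PyTorch)"
total_gib     = 11.17            # physical GPU memory
allocated_gib = 9.88             # live tensors
reserved_gib  = 10.23            # live tensors + allocator cache
free_gib      = 503.81 / 1024    # free memory outside PyTorch's pool

# Cache PyTorch could release via torch.cuda.empty_cache():
cache_gib = reserved_gib - allocated_gib
# Best-case headroom for a new allocation after freeing that cache:
headroom_gib = free_gib + cache_gib

print(f"reclaimable cache: {cache_gib:.2f} GiB, "
      f"best-case headroom: {headroom_gib:.2f} GiB")
# The headroom (~0.84 GiB) is still below the 1.00 GiB request, so the
# allocation fails even after freeing cached blocks; the batch size or
# input resolution has to shrink.
```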