out of memory for fine tuning

Open quanvuong opened this issue 6 years ago • 7 comments

I am reproducing the result using the instruction provided in the README file.

I was able to train the base model and obtain an AP of 0.6862, which matches what the paper reports. However, when I tried to run the fine-tuning command, the process exits with an out-of-memory error during the backward pass.

I am training with four GeForce GTX 1080 Ti GPUs, each with roughly 12 GB of memory. Did you use GPUs with more memory, or is something weird happening?

quanvuong avatar Dec 05 '19 04:12 quanvuong

Adding del loss and torch.cuda.empty_cache() solves this problem.

quanvuong avatar Dec 05 '19 04:12 quanvuong

Actually, using empty_cache() leads to really slow GPU operations (60 hours for the fine-tuning step). Is there another workaround?

If I simply do del loss without emptying the cache, the out-of-memory error still happens.

quanvuong avatar Dec 05 '19 05:12 quanvuong

torch.cuda.empty_cache()

Hi @quanvuong, would you mind elaborating on where to add these? Much appreciated!

shenglih avatar Dec 07 '19 08:12 shenglih

You can add them right after loss.backward() (see the sketch below).

quanvuong avatar Dec 07 '19 22:12 quanvuong
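For context, a minimal sketch of where those two lines would sit, assuming an ordinary PyTorch training loop (train_one_epoch, model, criterion, optimizer, and train_loader are placeholder names, not the repository's):

```python
import torch

def train_one_epoch(model, criterion, optimizer, train_loader):
    for images, targets in train_loader:
        optimizer.zero_grad()
        loss = criterion(model(images), targets)
        loss.backward()
        optimizer.step()

        # Drop the reference to the loss tensor (and the graph it holds), then
        # return cached blocks to the CUDA allocator. As noted above, calling
        # empty_cache() every iteration can slow training considerably.
        del loss
        torch.cuda.empty_cache()
```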

Hi @quanvuong,

Have you solved this problem? I ran into a similar one when evaluating the baseline model, which caused CUDA error: out of memory because data accumulates every iteration. I am using torch 0.4.1. I already tried empty_cache() as well as del metax, mask, but it doesn't help.

thsunkid avatar Feb 28 '20 06:02 thsunkid

Following up on my own question: I was using torch v0.4.1 instead of v0.3.1, which the author used. I solved the problem by adding with torch.no_grad() during validation, because in 0.4+ the volatile flag on Variable no longer disables gradient tracking, so memory kept accumulating on the GPU.

thsunkid avatar Mar 02 '20 05:03 thsunkid
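For anyone on torch >= 0.4 hitting the same thing, here is a minimal sketch of the no_grad fix described above (evaluate and val_loader are placeholder names, not the repository's):

```python
import torch

def evaluate(model, val_loader):
    model.eval()
    # torch.no_grad() replaces the old volatile=True flag from torch 0.3:
    # no autograd graph is built, so activations are freed right away instead
    # of accumulating on the GPU across iterations.
    with torch.no_grad():
        for images, targets in val_loader:
            output = model(images)
            # ... compute detection metrics on `output` here ...
```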

Based on my understanding, there are two reasons for the out-of-memory error during tuning:

  1. During the tuning phase, 20 classes instead of 15 are fed into the re-weighting net, which uses more GPU memory.
  2. During the tuning phase, multi-scale training can produce input images as large as 600+ pixels, which leads to dynamic memory usage.

The solutions could be to (1) decrease the batch size a little and (2) cap the input image size carefully (a rough sketch is below).

Fangyi-Chen avatar Jul 30 '20 21:07 Fangyi-Chen
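A minimal sketch of those two mitigations, with illustrative values (BATCH_SIZE, MAX_DIM, and the 32-pixel step are assumptions, not the repository's settings):

```python
import random

BATCH_SIZE = 32   # hypothetical: lower this if the backward pass runs out of memory
MAX_DIM = 512     # hypothetical cap; the default multi-scale schedule can exceed 600

def sample_train_dim(low=320, high=MAX_DIM, step=32):
    """Pick a YOLO-style training resolution that is a multiple of `step`, capped at `high`."""
    return random.choice(range(low, high + 1, step))
```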