Yet-Another-EfficientDet-Pytorch
Inference GPU memory usage is weird
Hi, I have a problem with GPU memory increasing.
I ran your test code (inference) and GPU memory usage increased by 8 GB (with D3). I can't use your bigger models because of this.
I found the statement that causes this situation.

In your model.py -> EfficientNet(nn.Module), the variable x holds GPU memory, and its size goes up in the for loop:
// the statement: x = block(x, drop_connect_rate=drop_connect_rate)
(Maybe "stacking x" makes the held GPU memory grow.)
torch.cuda.empty_cache() can't clear the GPU memory, because the variable is still holding it.
The console print results confirmed this.
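To make this reproducible, here is a minimal, runnable sketch of the kind of forward loop being described, with a memory probe after each block. This is a paraphrase, not the repo's actual model.py: TinyBlock is a hypothetical stand-in for the real MBConv blocks. Note that torch.cuda.empty_cache() only releases cached-but-unused memory; it can never free memory held by a live tensor like x.

```python
import torch
import torch.nn as nn

class TinyBlock(nn.Module):
    """Hypothetical stand-in for the repo's MBConv block, just to make the sketch run."""
    def __init__(self, channels):
        super().__init__()
        self.conv = nn.Conv2d(channels, channels, 3, padding=1)

    def forward(self, x, drop_connect_rate=None):
        return torch.relu(self.conv(x))

class BackboneSketch(nn.Module):
    def __init__(self, num_blocks=8, channels=64):
        super().__init__()
        self.blocks = nn.ModuleList(TinyBlock(channels) for _ in range(num_blocks))

    def forward(self, x):
        for idx, block in enumerate(self.blocks):
            # x is re-assigned every iteration, so the previous activation should be
            # freed once nothing references it; on torch 1.4 the usage kept growing.
            x = block(x, drop_connect_rate=0.2)
            if x.is_cuda:
                allocated = torch.cuda.memory_allocated() / 2**20
                print(f"block {idx}: {allocated:.0f} MiB allocated")
        return x

if __name__ == "__main__":
    device = "cuda" if torch.cuda.is_available() else "cpu"
    model = BackboneSketch().to(device).eval()
    with torch.no_grad():
        model(torch.randn(1, 64, 128, 128, device=device))
```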

I have been suffering from this for a very long time.
My env:
torch == 1.4.0, torchvision == 0.5.0, python == 3.6, CUDA 10.2 with cuDNN
I probably can't change my Python/CUDA versions because of my co-workers, but the other parts match the environment you listed in this GitHub repo.
Please, I need your help.
It works fine for me. I'm now using torch 1.8.1+cu111. At the very least, I can run inference with D7. I'm guessing it's a bug in pytorch or cuda.

```
0   N/A  N/A    428708    C    /usr/bin/python3.8    4153MiB
```
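For anyone comparing environments, a quick way to print the versions that matter here, using standard torch/torchvision attributes:

```python
import torch
import torchvision

print("torch:", torch.__version__)               # e.g. 1.8.1+cu111
print("torchvision:", torchvision.__version__)   # e.g. 0.9.1
print("CUDA build:", torch.version.cuda)
print("cuDNN:", torch.backends.cudnn.version())
```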
Ohh... I caught it. Your opinion is right!
My env was:
torch == 1.4.1, torchvision == 0.5.0
When I upgraded the env to:
torch == 1.8.1, torchvision == 0.9.1
that situation never occurred again!
I think you should update your README.
Thank you for your fast reply. I have never seen such a polite author on GitHub. Thanks!
I have two more questions.
Q1.
When I train the D1 model with batch_size=4 on 4 GPUs, the GPU memory usage is 3200 MB.

4 GPUs with batch size 4 means one image per GPU, and even so I can't train the D6 model with batch size 4 on 4 RTX TITANs.
Is this normal memory usage for training?
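Not an answer, but an easy way to check what a training step actually peaks at, using standard torch.cuda memory stats; the model, loss, and data names below are placeholders, not the repo's actual training code:

```python
import torch

def profiled_train_step(model, loss_fn, optimizer, images, targets):
    """Run one training step and report peak GPU memory allocated during it."""
    torch.cuda.reset_peak_memory_stats()
    optimizer.zero_grad()
    loss = loss_fn(model(images), targets)
    loss.backward()
    optimizer.step()
    peak_mib = torch.cuda.max_memory_allocated() / 2**20
    print(f"peak allocated this step: {peak_mib:.0f} MiB")
    return loss
```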
Q2. After I upgraded my torch to 1.8.1, I get a message during the validation epoch. I never got this message when my torch version was 1.4.1:
[W accumulate_grad.h:184] Warning: grad and param do not obey the gradient layout contract. This is not an error, but may impair performance.

Is it a critical bug?
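As the message itself says, it is a performance warning, not a correctness bug: the accumulated gradient's memory layout doesn't match the parameter's. A workaround often suggested on the PyTorch forums (unofficial, so treat it as an assumption) is to force the parameters into the default contiguous layout before training:

```python
import torch

def make_params_contiguous(model: torch.nn.Module):
    # Rewrite each parameter in the default contiguous layout so that the
    # grads autograd accumulates match it and the warning goes away.
    with torch.no_grad():
        for p in model.parameters():
            p.data = p.data.contiguous()
```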
I have the same env, but I can't get 32 FPS when I use efficient_test.py; I only get 16 FPS, and I don't know the reason.
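One thing worth ruling out first is the timing method itself: CUDA execution is asynchronous, so measuring without synchronization can misreport FPS. A minimal sketch of a synchronized benchmark (the model and input shape are placeholders, not the script's actual values):

```python
import time
import torch

@torch.no_grad()
def measure_fps(model, input_shape=(1, 3, 512, 512), warmup=10, iters=100):
    model = model.eval().cuda()
    x = torch.randn(*input_shape, device="cuda")
    for _ in range(warmup):       # warm up kernels and cuDNN autotuning
        model(x)
    torch.cuda.synchronize()      # drain queued GPU work before starting the clock
    start = time.perf_counter()
    for _ in range(iters):
        model(x)
    torch.cuda.synchronize()      # make sure all iterations actually finished
    return iters / (time.perf_counter() - start)
```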