CheXNet
RuntimeError: CUDA Error: out of memory
Please help me resolve this issue
Try a smaller batch size.
My batch size is 64, the input size is 256, and the output size is 242. By how much should I reduce it?
Try a batch size of 8, 16, or 32 and see if it works.
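A minimal sketch of that trial-and-error, in case it helps: it tries each batch size from largest to smallest and stops at the first one that does not raise an out-of-memory error. `run_with_batch_backoff` and `run_epoch` are hypothetical names, not part of the CheXNet code; `run_epoch` stands for whatever function builds the DataLoader and runs one pass with the given batch size.

```python
def run_with_batch_backoff(run_epoch, batch_sizes=(64, 32, 16, 8)):
    """Try each batch size, largest first, and return the first one that fits."""
    for batch_size in batch_sizes:
        try:
            run_epoch(batch_size)  # hypothetical: one training/testing pass
            return batch_size
        except RuntimeError as err:
            # The error in this thread is a RuntimeError whose message
            # contains "out of memory"; anything else is re-raised untouched.
            if "out of memory" not in str(err).lower():
                raise
            print(f"batch size {batch_size} ran out of memory, trying a smaller one")
    raise RuntimeError("out of memory even at the smallest batch size tried")


# Stand-in for a real pass: pretend anything above 16 does not fit on the GPU.
def fake_epoch(batch_size):
    if batch_size > 16:
        raise RuntimeError("CUDA Error: out of memory")


chosen = run_with_batch_backoff(fake_epoch)
print("usable batch size:", chosen)
```

With a real model it may also be worth calling `torch.cuda.empty_cache()` between attempts so that cached blocks from the failed run are released before the next try.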
It is still showing me this error:
Traceback (most recent call last):
File "C:\Users\Nasir Isa\Documents\1Research\algortihm\CheXNet-master\CheXNet-master\m3.py", line 149, in
@omrfrkmfy Were you ever able to figure out a solution to the problem? I'm dealing with the same issue
The issue is that your graphics card does not have enough memory. You need to find one with more memory.
With 4 worker cores on an NVIDIA P100, I had to use a batch size of 12. However, the AUROC is 49%, maybe due to the small batch size.
Maybe you can try this idea https://blog.csdn.net/xijuezhu8128/article/details/86594478
I encountered the same issue and solved it by disabling gradient tracking when using model.eval():
with torch.no_grad():
    for i, (data, label) in enumerate(test_loader):
        ...
(Remember to indent the loop with a tab.)
This stops the model from saving intermediate results, so temporary memory is freed after each batch.
Hope it helps.
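For anyone who wants a fuller picture, here is a small self-contained sketch of an evaluation loop wrapped in `torch.no_grad()`. Note that `model.eval()` on its own only changes the behaviour of layers such as dropout and batch norm; it does not stop autograd from recording the graph. The `predict` helper, the tiny `nn.Linear` model, and the random `test_loader` are stand-ins I made up for illustration, not the CheXNet code.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset


def predict(model, loader, device):
    model.eval()  # inference behaviour for dropout / batch norm
    outputs = []
    with torch.no_grad():  # do not record the autograd graph
        for data, _label in loader:
            out = model(data.to(device))
            outputs.append(out.cpu())  # keep results on the CPU, not the GPU
    return torch.cat(outputs)


# Hypothetical stand-ins for the real model and test_loader.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = nn.Linear(10, 14).to(device)
dataset = TensorDataset(torch.randn(32, 10), torch.zeros(32, 14))
test_loader = DataLoader(dataset, batch_size=8)

preds = predict(model, test_loader, device)
print(preds.shape, preds.requires_grad)  # torch.Size([32, 14]) False
```

Because nothing inside the `with` block requires gradients, the activations from each batch can be released as soon as the batch is done instead of accumulating.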
With 4 worker cores on an NVIDIA P100, I had to use a batch size of 12. However, the AUROC is 49%, maybe due to the small batch size.
I am dealing with the same issue, and when I run it multiple times I get different results. Did you solve it or find out why?
Hello, I know this is very late, and it seems the owner has not maintained the code for years. Still, if any of you run into this issue, try my solution: https://github.com/arnoweng/CheXNet/pull/39. I just started learning PyTorch today and I'm not a PyTorch pro, so it is possible that my changes introduce logic flaws. If that is the case, please tell me :)