Out Of Memory Error when training any model
I'm getting an OOM (out-of-memory) error when training any model. It's probably not caused by the batch size, because the error only appears after a number of iterations.
Script for training:

```
cd src
python train.py mot --exp_id crowdhuman_dla34 --gpus 0 --batch_size 4 \
    --load_model '../models/ctdet_coco_dla_2x.pth' \
    --num_epochs 60 --lr_step '50' \
    --data_cfg '../src/lib/cfg/crowdhuman.json' \
    --arch=hrnet_18
cd ..
```
System information:
- Ubuntu 16.04.6 LTS
- CUDA 10.2.89
- PyTorch 1.7.1
I have the same problem now. I added torch.cuda.empty_cache(), but the results are the same.
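Since torch.cuda.empty_cache() doesn't help here, the memory may be held by live tensors rather than the caching allocator. A common cause of OOM that only shows up after many iterations is accumulating the loss tensor itself (e.g. for logging) instead of a plain Python float, which keeps every iteration's computation graph alive. This is a guess about the cause, not something confirmed from the training code; a minimal sketch on CPU with a toy model:

```python
import torch

# Toy model and optimizer (hypothetical, just to illustrate the pattern).
model = torch.nn.Linear(10, 1)
opt = torch.optim.SGD(model.parameters(), lr=0.01)

losses = []
for _ in range(5):
    x = torch.randn(8, 10)
    loss = (model(x) ** 2).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()
    # Leak: losses.append(loss) would retain the whole graph every iteration,
    # so GPU memory grows until it runs out.
    # Fix: convert to a plain float so the graph can be freed.
    losses.append(loss.item())
```

If the training loop stores running losses or metrics, it may be worth checking that they are appended with .item() (or .detach()) rather than as raw tensors.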
Has anyone solved this problem?
I also have the same problem; it happens after a number of epochs.
I'm hitting the same problem after a number of epochs. Has anyone solved it?