
out of memory when running

Open zhenyezi opened this issue 6 years ago • 3 comments

Hello, when I run this code it runs out of memory partway through training. I would like to know the reason. Thank you!

zhenyezi avatar Oct 24 '18 13:10 zhenyezi

If the images in your dataset do not all have the same size, the memory required changes from batch to batch. You can reduce the input size in dataloader.py or use a smaller batch size.
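To see why variable input sizes make memory fluctuate, here is a rough back-of-the-envelope estimate of the anchors-by-ground-truth IoU matrix for a RetinaNet-style detector. The strides, anchors-per-cell count, and function name are illustrative assumptions, not values taken from this repo:

```python
# Rough size of the (num_anchors, num_gt) IoU matrix for a square input.
# Strides and anchors_per_cell are illustrative, not this repo's config.
def iou_matrix_bytes(side, num_gt, strides=(8, 16, 32, 64, 128),
                     anchors_per_cell=9, bytes_per_float=4):
    # Each pyramid level contributes (side/stride)^2 cells of anchors.
    num_anchors = sum((side // s) ** 2 * anchors_per_cell for s in strides)
    return num_anchors * num_gt * bytes_per_float

# Doubling the input side quadruples the anchor count, and with it
# the memory needed for any tensor shaped (num_anchors, num_gt).
small = iou_matrix_bytes(side=512, num_gt=50)
large = iou_matrix_bytes(side=1024, num_gt=50)
print(f"{small / 2**20:.1f} MiB vs {large / 2**20:.1f} MiB")
```

Because every per-level term scales with the square of the input side, a batch containing one large image can spike memory well above the steady-state usage, which is why capping the resize target (or the batch size) smooths the peaks.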

rainofmine avatar Oct 31 '18 14:10 rainofmine

I also hit this issue. GPU memory usage increases during training.

nihaoxiaoli avatar Dec 28 '18 13:12 nihaoxiaoli

The calc_iou method in losses.py sometimes creates huge intermediate tensors. Computing the IoU matrix in chunks keeps the peak memory bounded:

def calc_iou(a, b):
    """IoU matrix between boxes `a` and `b` (x1, y1, x2, y2 format),
    computed `step` columns of `b` at a time so the intermediate
    tensors are only (len(a), step) instead of (len(a), len(b))."""
    step = 20
    IoU = torch.zeros((len(a), len(b))).cuda()
    step_count = len(b) // step
    if len(b) % step != 0:
        step_count += 1

    area = (b[:, 2] - b[:, 0]) * (b[:, 3] - b[:, 1])

    for i in range(step_count):
        # Intersection width and height for this chunk, built in place.
        iw = torch.min(torch.unsqueeze(a[:, 2], dim=1), b[i * step:(i + 1) * step, 2])
        iw.sub_(torch.max(torch.unsqueeze(a[:, 0], dim=1), b[i * step:(i + 1) * step, 0]))

        ih = torch.min(torch.unsqueeze(a[:, 3], dim=1), b[i * step:(i + 1) * step, 3])
        ih.sub_(torch.max(torch.unsqueeze(a[:, 1], dim=1), b[i * step:(i + 1) * step, 1]))

        iw.clamp_(min=0)
        ih.clamp_(min=0)

        iw.mul_(ih)  # iw now holds the intersection area
        del ih

        # Union = area(a) + area(b) - intersection, clamped to avoid div-by-zero.
        ua = torch.unsqueeze((a[:, 2] - a[:, 0]) * (a[:, 3] - a[:, 1]), dim=1) + area[i * step:(i + 1) * step] - iw
        ua = torch.clamp(ua, min=1e-8)
        iw.div_(ua)  # iw now holds the IoU for this chunk
        del ua

        IoU[:, i * step:(i + 1) * step] = iw

    return IoU
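A CPU sketch of the same chunking idea, in NumPy rather than this repo's torch code, checking that the chunked result matches a naive full-matrix IoU (box format x1, y1, x2, y2; all helper names here are mine):

```python
import numpy as np

def iou_full(a, b):
    # Naive version: materializes all (len(a), len(b)) intermediates at once.
    iw = np.minimum(a[:, None, 2], b[None, :, 2]) - np.maximum(a[:, None, 0], b[None, :, 0])
    ih = np.minimum(a[:, None, 3], b[None, :, 3]) - np.maximum(a[:, None, 1], b[None, :, 1])
    inter = np.clip(iw, 0, None) * np.clip(ih, 0, None)
    area_a = (a[:, 2] - a[:, 0]) * (a[:, 3] - a[:, 1])
    area_b = (b[:, 2] - b[:, 0]) * (b[:, 3] - b[:, 1])
    return inter / np.maximum(area_a[:, None] + area_b[None, :] - inter, 1e-8)

def iou_chunked(a, b, step=20):
    # Same result, but intermediates are only (len(a), step) at a time.
    out = np.zeros((len(a), len(b)))
    for i in range(0, len(b), step):
        out[:, i:i + step] = iou_full(a, b[i:i + step])
    return out

rng = np.random.default_rng(0)

def rand_boxes(n):
    xy = rng.uniform(0, 100, (n, 2))
    wh = rng.uniform(1, 50, (n, 2))
    return np.hstack([xy, xy + wh])

a, b = rand_boxes(300), rand_boxes(57)  # 57 deliberately not a multiple of step
assert np.allclose(iou_full(a, b), iou_chunked(a, b))
```

The trade-off is one Python-level loop over chunks in exchange for a bounded peak: only the final IoU matrix is ever held in full, never the width/height/union intermediates.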

ehp avatar Aug 16 '19 12:08 ehp