
When the batch size is greater than 1, the total time does not decrease

Open czy36mengfei opened this issue 2 years ago • 7 comments

I use demo.py for model inference. When the batch size is greater than 1, the total time does not decrease. How can I solve this problem?

czy36mengfei avatar Jan 14 '22 12:01 czy36mengfei

Why should it decrease?

AliaksandrSiarohin avatar Jan 14 '22 13:01 AliaksandrSiarohin

Thanks. When GPU memory is sufficient, a single inference pass should take roughly the same time regardless of batch size. With a larger batch size there are fewer passes, so the total time should go down. However, I found that when the batch size increases, the time of a single pass grows linearly with it, so the total inference time over all the data does not change.

czy36mengfei avatar Jan 14 '22 13:01 czy36mengfei
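The expectation described above can be sketched with simple arithmetic (the latency numbers below are hypothetical, for illustration only):

```python
import math

def total_time(num_frames, batch_size, per_batch_latency):
    """Total inference time = number of passes * latency per pass."""
    num_passes = math.ceil(num_frames / batch_size)
    return num_passes * per_batch_latency

frames = 256
# Expected case: per-pass latency independent of batch size (GPU has headroom),
# so the total time shrinks as the batch size grows.
constant = [total_time(frames, bs, 0.05) for bs in (1, 8, 16)]
# Observed case: per-pass latency grows linearly with batch size (GPU saturated),
# so the total time stays flat.
linear = [total_time(frames, bs, 0.05 * bs) for bs in (1, 8, 16)]

print(constant)  # total time drops with larger batches
print(linear)    # total time unchanged
```

This is exactly the distinction debated later in the thread: whichever regime the GPU is in determines whether batching helps.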

How do you change the batch size?

AliaksandrSiarohin avatar Jan 14 '22 15:01 AliaksandrSiarohin

I made the following modifications to support different batch sizes.

```python
import time

import numpy as np
import torch
from tqdm import tqdm

from animate import normalize_kp  # same helper demo.py uses


def make_animation_batch(source_image, driving_video, generator, kp_detector,
                         relative=True, adapt_movement_scale=True, cpu=False):
    with torch.no_grad():
        predictions = []
        source = torch.tensor(
            source_image[np.newaxis].astype(np.float32)).permute(0, 3, 1, 2)
        if not cpu:
            source = source.cuda()
        driving = torch.tensor(
            np.array(driving_video).astype(np.float32)).permute(0, 3, 1, 2)
        kp_source = kp_detector(source)
        # Move the first frame to the same device as the model before detection.
        first_frame = driving[0:1] if cpu else driving[0:1].cuda()
        kp_driving_initial = kp_detector(first_frame)
        batch_size = 16
        # Repeat the single source frame and its keypoints along the batch axis.
        # (Assumes the model was trained with estimate_jacobian=True.)
        source = source.repeat([batch_size, 1, 1, 1])
        kp_source_batch = {
            'value': kp_source['value'].repeat([batch_size, 1, 1]),
            'jacobian': kp_source['jacobian'].repeat([batch_size, 1, 1, 1])
        }
        data_len = driving.shape[0]
        for frame_idx in tqdm(range(0, data_len, batch_size)):
            left = frame_idx
            right = min(frame_idx + batch_size, data_len)
            if right - left < batch_size:
                # Last (partial) batch: shrink the repeated tensors to match.
                now_batch = right - left
                source = source[:now_batch]
                for key in kp_source_batch:
                    kp_source_batch[key] = kp_source_batch[key][:now_batch]

            driving_frame = driving[left:right]
            if not cpu:
                driving_frame = driving_frame.cuda()
            kp_driving = kp_detector(driving_frame)

            kp_norm = normalize_kp(kp_source=kp_source, kp_driving=kp_driving,
                                   kp_driving_initial=kp_driving_initial,
                                   use_relative_movement=relative,
                                   use_relative_jacobian=relative,
                                   adapt_movement_scale=adapt_movement_scale)

            # CUDA kernels launch asynchronously; synchronize so the measured
            # time reflects the actual GPU work, not just the kernel launch.
            if not cpu:
                torch.cuda.synchronize()
            the_time = time.time()
            out = generator(source, kp_source=kp_source_batch, kp_driving=kp_norm)
            if not cpu:
                torch.cuda.synchronize()
            if left == 0:
                print('generator : ' + str(time.time() - the_time))
            predictions.extend(
                list(np.transpose(out['prediction'].data.cpu().numpy(),
                                  [0, 2, 3, 1])))
    return predictions
```
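The batch-index logic in the loop above (full batches followed by a shrunken final batch) can be checked in isolation with plain Python, using hypothetical frame counts and no torch dependency:

```python
def batch_ranges(data_len, batch_size):
    """Yield (left, right) index pairs covering data_len frames in batches."""
    for left in range(0, data_len, batch_size):
        yield left, min(left + batch_size, data_len)

# 37 frames with batch size 16 -> two full batches and one partial batch of 5.
print(list(batch_ranges(37, 16)))  # [(0, 16), (16, 32), (32, 37)]
```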

czy36mengfei avatar Jan 17 '22 08:01 czy36mengfei

Probably your GPU is already fully utilized.

AliaksandrSiarohin avatar Jan 17 '22 10:01 AliaksandrSiarohin

My GPU is adequate. After running, I can see it still has several GB of memory free.

czy36mengfei avatar Jan 18 '22 01:01 czy36mengfei

What is your GPU utilization?

AliaksandrSiarohin avatar Jan 18 '22 10:01 AliaksandrSiarohin