
What exactly does the image resize do during bottom-up testing? I found that the input image is not resized to the size I set in data_cfg in the config file

calmiLovesAI opened this issue · 5 comments

This is my setting in the config file:

data_cfg = dict(
    image_size=256,
    base_size=128,
    base_sigma=2,
    heatmap_size=[64, 128],
    num_joints=channel_cfg['dataset_joints'],
    dataset_channel=channel_cfg['dataset_channel'],
    inference_channel=channel_cfg['inference_channel'],
    num_scales=2,
    scale_aware_sigma=False,
)

I understand that base_size here is used for multi-scale testing, but I set scale_factor=[1] in test_cfg, i.e. there is only one scale, the original one. To check, I printed the data from the data_loader during testing by modifying the code as follows:

import mmcv
import torch


def single_gpu_test(model, data_loader):
    """Test model with a single gpu.

    This method tests the model with a single GPU and displays a test
    progress bar.

    Args:
        model (nn.Module): Model to be tested.
        data_loader (torch.utils.data.DataLoader): PyTorch data loader.

    Returns:
        list: The prediction results.
    """

    model.eval()
    results = []
    dataset = data_loader.dataset
    prog_bar = mmcv.ProgressBar(len(dataset))
    for data in data_loader:
        print("data:", data)  # added: inspect the batch fed to the model
        with torch.no_grad():
            result = model(return_loss=False, **data)
        results.append(result)

        # use the first key as main key to calculate the batch size
        batch_size = len(next(iter(data.values())))
        for _ in range(batch_size):
            prog_bar.update()
    return results

and got the following output:

load checkpoint from local path: outputs/epoch_5.pth
[                                                  ] 0/8000, elapsed: 0s, ETA:data: {'img': tensor([[[[ 49,  60,  64],
          [ 51,  58,  64],
          ...,
          [194, 192, 197],
          [196, 194, 199]]]], dtype=torch.uint8), 'img_metas': DataContainer([[{'image_file': 'data/crowdpose/images/106848.jpg', 'aug_data': [tensor([[[[-2.1179, -2.1179,  ..., -2.1179],
          ...],
         [[-2.0357, -2.0357,  ..., -2.0357],
          ...],
         [[-1.8044, -1.8044,  ..., -1.8044],
          ...]]]])], 'test_scale_factor': [1], 'base_size': (448, 256), 'center': array([320, 212]), 'scale': array([3.71875, 2.125  ]), 'flip_index': [1, 0, 3, 2, 5, 4, 7, 6, 9, 8, 11, 10, 12, 13]}]])}

I found that the actual image size fed to the network is 256 x 448, not the 256 x 256 I want. How should I modify the config so that the input has the size I expect? I am not using UDP, and the dataset I test on is CrowdPose.

calmiLovesAI · May 01 '22 17:05

Please use English or English & Chinese for issues so that we can have a broader discussion.

mm-assistant[bot] · May 01 '22 17:05

During training, the image is resized to 256x256. During testing, however, the short side is resized to 256 while the aspect ratio is kept fixed, so you get an image of shape 256x448.

Please refer to https://github.com/open-mmlab/mmpose/blob/2dc9c31952a794ee545926e705050528e0762220/mmpose/datasets/pipelines/bottom_up_transform.py#L16
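For illustration, here is a simplified sketch of the single-scale, non-UDP size computation in that file. The function and helper names below are invented for the sketch, but the short-side resize and the 64-pixel alignment mirror the linked _get_multi_scale_size:

import numpy as np

def _ceil_to_multiple(x, divisor=64):
    # Round x up to the nearest multiple of `divisor`; mirrors the
    # 64-pixel alignment done in the linked bottom_up_transform.py.
    return int(np.ceil(x / divisor)) * divisor

def get_test_input_size(img_w, img_h, input_size=256):
    # Resize the short side to `input_size`, scale the long side by
    # the same factor, and round it up to a multiple of 64, so the
    # aspect ratio is (approximately) preserved.
    if img_w < img_h:
        return input_size, _ceil_to_multiple(input_size / img_w * img_h)
    return _ceil_to_multiple(input_size / img_h * img_w), input_size

# A roughly 640x424 image (consistent with 'center': [320, 212] in the
# log above) maps to (448, 256), matching 'base_size': (448, 256).
print(get_test_input_size(640, 424))  # -> (448, 256)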

jin-s13 · May 02 '22 09:05

If you like, you can use the training pipeline.

jin-s13 avatar May 02 '22 09:05 jin-s13
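For concreteness, a hypothetical sketch of what that could look like in the config. The transform names follow mmpose 0.x bottom-up configs, but the exact arguments and meta keys should be verified against the installed version:

# Hypothetical test pipeline using a training-style fixed-size affine
# warp, with all randomness disabled, instead of the default
# aspect-ratio-preserving test-time resize.
test_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(
        type='BottomUpRandomAffine',
        rot_factor=0,          # disable random rotation
        scale_factor=[1, 1],   # disable random scaling
        scale_type='short',
        trans_factor=0),       # disable random translation
    dict(type='ToTensor'),
    dict(
        type='NormalizeTensor',
        mean=[0.485, 0.456, 0.406],
        std=[0.229, 0.224, 0.225]),
    dict(type='Collect', keys=['img'], meta_keys=['image_file']),
]

Note that the bottom-up test path also expects entries such as aug_data and test_scale_factor in img_metas (see the printed data above), so swapping the pipeline wholesale may require matching changes on the model's test branch.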

> If you like, you can use the training pipeline.

Thanks for your reply! If I directly modify the resize code in _get_multi_scale_size so that width and height are both resized to a fixed length, will that affect the final validation accuracy?

calmiLovesAI · May 02 '22 09:05

Sorry for the late reply. It should have some impact.

jin-s13 · Jun 28 '22 08:06
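The impact presumably comes from aspect-ratio distortion: the change discussed above would amount to something like the following hypothetical variant of the earlier sketch, and unless the image is already square it stretches people non-uniformly relative to the aspect-ratio-preserving warps seen in training:

def get_fixed_input_size(img_w, img_h, input_size=256):
    # Hypothetical fixed-size variant: force both sides to
    # `input_size` regardless of the original aspect ratio. Unless
    # img_w == img_h this distorts the image, which can degrade
    # keypoint accuracy relative to the short-side resize above.
    return input_size, input_size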