human-pose-estimation.pytorch

What is pixel_std for?

Open PaTricksStar opened this issue 6 years ago • 5 comments

https://github.com/Microsoft/human-pose-estimation.pytorch/blob/c3a30c0e1f83e73b3038b1a443becf6b4a19cf1f/lib/dataset/JointsDataset.py#L31 I reviewed the code, and it looks like pixel_std represents the std of the human bbox area, right? But why do we need to normalize the bbox scale, and why is it set to 200?

PaTricksStar, Feb 19 '19 06:02

@PaTricksStar, I have the same question. Also, what is the purpose of scale = scale * 1.25 in this function?

def _xywh2cs(self, x, y, w, h):
    # Box center from the top-left corner and size.
    center = np.zeros((2), dtype=np.float32)
    center[0] = x + w * 0.5
    center[1] = y + h * 0.5

    # Grow the shorter side so the box matches the target aspect ratio.
    if w > self.aspect_ratio * h:
        h = w * 1.0 / self.aspect_ratio
    elif w < self.aspect_ratio * h:
        w = h * self.aspect_ratio
    # Express the box size in units of pixel_std.
    scale = np.array(
        [w * 1.0 / self.pixel_std, h * 1.0 / self.pixel_std],
        dtype=np.float32)
    if center[0] != -1:
        scale = scale * 1.25

    return center, scale

rafikg, May 29 '19 16:05

@Gouiaa This is also what confuses me. @leoxiaobin Could you please answer our questions?

wanghao14, Jul 15 '19 05:07

@PaTricksStar @leoxiaobin @wanghao14 @rafikg Have you solved this? I am confused about it.

annopackage, May 08 '20 08:05

I think it is just a hyperparameter representing the default w/h of the bounding box. Just leave it alone, or you can try emailing the author to verify.

PaTricksStar, May 22 '20 04:05

I think it is just the way they store the bbox h and w. They divide h and w by 200, and then get h and w back in get_affine_transform by multiplying scale by 200. It is just a hyperparameter, and you could choose another number.
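
A minimal sketch of that round trip, assuming NumPy. The helpers box_to_scale and scale_to_box are hypothetical names for illustration, not functions from the repository, and the 1.25 padding is left out to keep the arithmetic visible:

```python
import numpy as np


def box_to_scale(w, h, pixel_std=200.0):
    # Hypothetical helper: store the box size in units of pixel_std.
    return np.array([w / pixel_std, h / pixel_std], dtype=np.float32)


def scale_to_box(scale, pixel_std=200.0):
    # Hypothetical helper: recover the box size in pixels.
    return scale * pixel_std


w, h = 192.0, 256.0
for std in (100.0, 200.0, 400.0):
    scale = box_to_scale(w, h, std)
    recovered = scale_to_box(scale, std)
    # The stored scale depends on std; the recovered size does not.
    print(std, scale, recovered)
```

As long as the same constant is used for the division and the multiplication, it cancels out, and only the numeric range of the stored scale changes. If the two sides used different constants, the recovered box size would change accordingly.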

@rafikg As I said above, scale is just another representation of the bbox h and w. I think they multiply scale by 1.25 to expand the bbox, in case the bbox fits the human body too tightly, which would lead to information loss.
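
To illustrate the padding idea, here is a small sketch, assuming NumPy and boxes given as (x, y, w, h) with a top-left corner. The helper padded_box is a hypothetical name, not part of the repository; it grows a box around its center by the same 1.25 factor:

```python
import numpy as np


def padded_box(x, y, w, h, pad=1.25):
    # Hypothetical helper: enlarge a box around its center by `pad`.
    cx, cy = x + w * 0.5, y + h * 0.5
    new_w, new_h = w * pad, h * pad
    return np.array(
        [cx - new_w * 0.5, cy - new_h * 0.5, new_w, new_h],
        dtype=np.float32)


# A tight box around a person: top-left (100, 50), size 80 x 160.
tight = (100.0, 50.0, 80.0, 160.0)
print(padded_box(*tight))
```

The center stays where it was, while each side gains 12.5% of the original extent, so limbs that touch the edge of a tight box are less likely to be cut off by the crop.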

lqduc, Oct 01 '20 08:10