human-pose-estimation.pytorch

question about flip test

Open Liz66666 opened this issue 6 years ago • 6 comments

Hi, thanks for your code! But I have a question about the flip test. I don't understand why you apply a shift to output_flipped here. The comment says "feature is not aligned, shift flipped heatmap for higher accuracy" — what is the meaning of this operation? https://github.com/Microsoft/human-pose-estimation.pytorch/blob/8ed745798439f247c85c57392428320d4c553654/lib/core/function.py#L121-L125

Liz66666 avatar Dec 19 '18 10:12 Liz66666

@YoungZiyu "To stabilize the predictions, we evaluate both the original image and its flipped version, and average their output heatmaps." (This sentence is from "Self Adversarial Training for Human Pose Estimation".) It is common to evaluate both the original image and its flipped version for higher accuracy, because the predictions are not stable enough; specifically, the prediction for the same point may not land at the same position across the two runs. Averaging the output heatmaps is also used in CPN, Self Adversarial Training for Human Pose Estimation, and Deep High-Resolution Representation Learning for Human Pose Estimation. You can visit this page for more details.
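A minimal sketch of the averaging step described above (the function name is ours, not from the repo; the repo's `flip_back` additionally does the same left/right channel swap via `flip_pairs`, which we include here, but this sketch omits the one-pixel shift that this issue is about):

```python
import numpy as np

def flip_test_average(output, output_from_flipped, flip_pairs):
    """Average heatmaps from the original and horizontally flipped image.

    output, output_from_flipped: (batch, joints, H, W) heatmap arrays,
    where output_from_flipped was predicted on the flipped input.
    flip_pairs: list of (left_joint_idx, right_joint_idx) tuples.
    """
    # Undo the horizontal flip along the width axis.
    flipped_back = output_from_flipped[..., ::-1].copy()
    # A flipped image swaps left and right body parts, so swap the
    # corresponding joint channels back.
    for a, b in flip_pairs:
        flipped_back[:, [a, b]] = flipped_back[:, [b, a]]
    return (output + flipped_back) * 0.5
```

If the model were perfectly equivariant to flipping, the flipped-back heatmaps would match the originals exactly and averaging would change nothing; in practice they differ slightly, and the average is more stable.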

738654805 avatar Mar 10 '19 08:03 738654805

But why is the one-pixel shift on the flipped output performed?

FrancescoPiemontese avatar Apr 16 '19 11:04 FrancescoPiemontese

Consider a 1-D situation. Assume the input image size is 16, and annotate some "important" pixels with the digits 1 to 8, with `-` for the unimportant pixels, so the input image looks like:

`[1, -, -, 2, 3, -, -, 4, 5, -, -, 6, 7, -, -, 8]`

Then, due to the padding mechanism in strided conv layers, on the stride-4 feature map (of size 16 // 4 = 4) the pixel centers projected onto the input image are:

- original feature map: `[1, 3, 5, 7]`
- flipped feature map: `[8, 6, 4, 2]`
- after flipping back, without shift: `[2, 4, 6, 8]`
- with shift: `[2, 2, 4, 6]`

Note the distances in the input image: [2,3], [4,5] and [6,7] are neighboring pixels, while [1,2], [3,4], [5,6] and [7,8] are not — they are stride - 1 apart. So after the shift, the flipped-back map `[2, 2, 4, 6]` lines up with the original `[1, 3, 5, 7]` at neighboring input pixels, and the two averaged heatmaps peak at nearly the same locations.
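The 1-D walkthrough above can be reproduced numerically (a toy sketch with our own variable names; `0` stands in for the unimportant `-` pixels, and feature-map cell `i` is taken to project onto input pixel `i * stride`, the padding convention assumed in the explanation):

```python
import numpy as np

# 16-pixel input with "important" pixels labeled 1..8.
labels = np.array([1, 0, 0, 2, 3, 0, 0, 4, 5, 0, 0, 6, 7, 0, 0, 8])
stride = 4

# Projected pixel centers on the stride-4 feature map.
feat = labels[::stride]                # original: [1, 3, 5, 7]
feat_flipped = labels[::-1][::stride]  # from the flipped input: [8, 6, 4, 2]
flipped_back = feat_flipped[::-1]      # flipped back, no shift: [2, 4, 6, 8]

# One-pixel shift, analogous to the repo's SHIFT_HEATMAP branch
# (output_flipped[..., 1:] = output_flipped[..., 0:-1]).
shifted = flipped_back.copy()
shifted[1:] = flipped_back[:-1]        # with shift: [2, 2, 4, 6]
```

Without the shift, the flipped-back map pairs `[1, 3, 5, 7]` with `[2, 4, 6, 8]`, positions that are stride - 1 = 3 input pixels apart; with the shift, `[3, 5, 7]` pair with `[2, 4, 6]`, which are adjacent input pixels.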

gen-ko avatar Feb 04 '20 23:02 gen-ko

hi @gen-ko, how do you work out the correct shift for a very complex model?

zimenglan-sysu-512 avatar Nov 09 '20 06:11 zimenglan-sysu-512

@zimenglan-sysu-512 try this "The Devil is in the Details: Delving into Unbiased Data Processing for Human Pose Estimation"

DHCZ avatar Nov 09 '20 07:11 DHCZ

hi @DHCZ, thanks

zimenglan-sysu-512 avatar Nov 09 '20 10:11 zimenglan-sysu-512