mxnet_Realtime_Multi-Person_Pose_Estimation

Implement the heatmap and part affinity field (PAF) on GPU, and discuss the training process.

Open li-haoran opened this issue 7 years ago • 12 comments

I implemented heatmap and PAF generation as operators (https://github.com/li-haoran/mxnet-cmu_pose), along with some image data augmentation methods, and trained with different lr_mult values. So far I have only trained on MPII, and the training error does not seem to converge. I hope you can share your training logs.

li-haoran avatar Nov 28 '17 14:11 li-haoran

Good implementation. Maybe I could borrow some code from you. My training error seems to converge; for example, the losses for epochs 20 to 30 are listed below. But my mAP does not improve after 20~30 epochs and stays at only 0.2. I think my data processing may have some problems. Did you check the PAF and heatmap labels? When testing my label generation, I found that some PAFs are wider than they should be. You seem to have renamed "thre" from the original code to beam_width? I wonder whether BEAM_WIDTH should change according to the human scale (my code also uses the same thre).

cdef int min_x = max(int(round(min(centerA_x, centerB_x)) - thre), 0)
cdef int max_x = min(int(round(max(centerA_x, centerB_x))) + thre, grid_x)
cdef int min_y = max(int(round(min(centerA_y, centerB_y) - thre)), 0)
cdef int max_y = min(int(round(max(centerA_y, centerB_y) + thre)), grid_y)
paf level 1
198.691179194 193.14761207 189.601093894 186.31120778 183.617719643 180.939454209 178.889112464 176.328989767 173.550670097 171.171435292
heat map level 1
55.3489264697 53.3072111939 52.1945885241 51.1635073535 50.3665172849 49.5429850052 48.9504259197 48.1725841253 47.3175634113 46.6088191207

dragonfly90 avatar Nov 28 '17 17:11 dragonfly90
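
For readers following the thre discussion, here is a minimal pure-Python sketch of how a threshold like thre can control PAF width. It assumes the label is built the way the bounding-box lines above suggest: clip a box around the two joints, then keep grid cells whose perpendicular distance to the limb line is at most thre. The function name paf_for_limb and its dictionary return type are hypothetical and only for illustration; this is not the repository's operator.

```python
import math

def paf_for_limb(ax, ay, bx, by, grid_x, grid_y, thre=1.0):
    """Return {(x, y): (ux, uy)} for grid cells close to the limb A -> B.

    Hypothetical helper: a sketch of thresholded PAF labelling, not the
    actual Cython/GPU implementation discussed in this thread.
    """
    dx, dy = bx - ax, by - ay
    norm = math.hypot(dx, dy)
    if norm == 0:
        return {}  # degenerate limb: both joints at the same position
    ux, uy = dx / norm, dy / norm  # unit vector pointing from A to B

    # Bounding box around the limb, padded by thre and clipped to the grid.
    min_x = max(int(round(min(ax, bx) - thre)), 0)
    max_x = min(int(round(max(ax, bx) + thre)), grid_x)
    min_y = max(int(round(min(ay, by) - thre)), 0)
    max_y = min(int(round(max(ay, by) + thre)), grid_y)

    field = {}
    # range() excludes max_x / max_y, so the box is half-open.
    for y in range(min_y, max_y):
        for x in range(min_x, max_x):
            # Perpendicular distance from the cell to the line through A and B.
            dist = abs((x - ax) * uy - (y - ay) * ux)
            if dist <= thre:
                field[(x, y)] = (ux, uy)
    return field

narrow = paf_for_limb(2, 5, 8, 5, grid_x=20, grid_y=20, thre=1.0)
wide = paf_for_limb(2, 5, 8, 5, grid_x=20, grid_y=20, thre=3.0)
print(len(narrow), len(wide))
```

With a fixed thre the band has the same absolute width for a small, distant person and a large, close one, which is one way a PAF can end up wider than expected relative to the limb.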

Thanks for sharing your results. Mine seems to be heading toward a bad result.

I also noticed this human scale issue, so I set beam_width as a ratio of the part's width to its length, so that it changes with the length of each joint pair. In the original code thre is 1, so I'm not sure whether using a ratio is a good approach.

li-haoran avatar Nov 29 '17 01:11 li-haoran
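
The ratio idea above can be sketched in a few lines. This assumes the width is simply ratio times the distance between the two joints; the function name beam_width_for_limb and the min_width floor (set to the original thre of 1) are hypothetical additions for illustration, not taken from the repository.

```python
import math

def beam_width_for_limb(ax, ay, bx, by, ratio=0.2, min_width=1.0):
    """Scale the PAF half-width with limb length (hypothetical sketch).

    ratio     -- width as a fraction of limb length (0.2 is the value
                 mentioned in this thread)
    min_width -- assumed floor so very short limbs still get some width
    """
    limb_length = math.hypot(bx - ax, by - ay)
    return max(ratio * limb_length, min_width)

print(beam_width_for_limb(0, 0, 10, 0))  # long limb: width follows the ratio
print(beam_width_for_limb(0, 0, 3, 0))   # short limb: falls back to min_width
```

A floor like min_width is one possible guard against near-empty PAF labels on small people; whether it helps training is not something this thread establishes.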

I trained for 100 epochs; it actually converged at around 50~60 epochs. The loss in the plot is quite small. The heatmap is perfect, but the PAF is not good. I think my ratio of 0.2 is too small, and the PAF's range of (-1, 1) also makes it difficult to regress.

li-haoran avatar Dec 04 '17 01:12 li-haoran

I tried another network, DeepLab, which converged quickly. I think that if you use the original OpenPose network, you must start from a pretrained model, or you may find that it can hardly converge.

kohillyang avatar Dec 05 '17 08:12 kohillyang

After I fixed some bugs, the heatmap seems to converge well. I posted some samples and a loss figure; both show that the results are good.

li-haoran avatar Dec 05 '17 10:12 li-haoran

Hi @li-haoran @dragonfly90 @kohillyang @dongzhuoyao @qingxiaoli, how should I understand why the number of PAF channels is 38, and how should I understand the code?

I'm confused about limbSeq and mapIdx. I want to know why the author set the order of limbSeq and mapIdx as in the code below:

# find connection in the specified sequence, center 29 is in the position 15
limbSeq = [[2,3], [2,6], [3,4], [4,5], [6,7], [7,8], [2,9], [9,10],
           [10,11], [2,12], [12,13], [13,14], [2,1], [1,15], [15,17],
           [1,16], [16,18], [3,17], [6,18]]
# the middle joints heatmap correspondence
mapIdx = [[31,32], [39,40], [33,34], [35,36], [41,42], [43,44], [19,20], [21,22],
          [23,24], [25,26], [27,28], [29,30], [47,48], [49,50], [53,54], [51,52],
          [55,56], [37,38], [45,46]]

What is the author's ordering based on? I'm a newbie to multi-person pose estimation. I hope you can give me some advice or information. Thanks.

Ai-is-light avatar Jan 26 '18 02:01 Ai-is-light

38 = 2 x 19: there are 19 limbs (the joint pairs in limbSeq), and each limb has a 2D vector field (an x channel and a y channel) as its PAF.

dongzhuoyao avatar Jan 26 '18 02:01 dongzhuoyao
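
To make the 38 = 2 x 19 counting concrete, the sketch below pairs each entry of limbSeq with its entry in mapIdx. It assumes, as I recall from the original demo code but have not verified here, that the network output stacks 19 heatmap channels first, so subtracting 19 from a mapIdx value gives a 0-based PAF channel. The constant NUM_HEATMAP_CHANNELS and the helper paf_channels are hypothetical names.

```python
# Lists copied from the question above (joint numbers are 1-based).
limbSeq = [[2,3], [2,6], [3,4], [4,5], [6,7], [7,8], [2,9], [9,10],
           [10,11], [2,12], [12,13], [13,14], [2,1], [1,15], [15,17],
           [1,16], [16,18], [3,17], [6,18]]
mapIdx = [[31,32], [39,40], [33,34], [35,36], [41,42], [43,44], [19,20], [21,22],
          [23,24], [25,26], [27,28], [29,30], [47,48], [49,50], [53,54], [51,52],
          [55,56], [37,38], [45,46]]

# Assumption: 19 heatmap channels precede the PAF channels in the output.
NUM_HEATMAP_CHANNELS = 19

def paf_channels(limb_index):
    """Return the (x, y) PAF channel indices for limbSeq[limb_index]."""
    x_idx, y_idx = mapIdx[limb_index]
    return x_idx - NUM_HEATMAP_CHANNELS, y_idx - NUM_HEATMAP_CHANNELS

for k, (joint_a, joint_b) in enumerate(limbSeq):
    cx, cy = paf_channels(k)
    print(f"limb {k:2d}: joint {joint_a:2d} -> joint {joint_b:2d}, PAF channels ({cx}, {cy})")

# Every PAF channel is used by exactly one limb.
used = sorted(c for k in range(len(limbSeq)) for c in paf_channels(k))
print(len(limbSeq), len(used))
```

Under that assumption, the order of limbSeq only fixes the order in which limbs are processed, while mapIdx records where each limb's two channels sit in the network output, which is why the two lists are not sorted the same way.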

@kohillyang would you mind sharing some tips about your other network for faster training?

Ai-is-light avatar Jan 30 '18 02:01 Ai-is-light

Well, I just replaced the network with a DeepLab network. All the code can be found in the deeplab folder: https://github.com/dragonfly90/mxnet_Realtime_Multi-Person_Pose_Estimation/tree/master/deeplab. There is also a file named readme.md which describes the training process.

kohillyang avatar Jan 30 '18 17:01 kohillyang

Pretty good, @kohillyang. Thanks.

Ai-is-light avatar Jan 31 '18 02:01 Ai-is-light

@kohillyang But anyway, I'm confused about your loss. Why is it so small? I have some doubts about it.

Ai-is-light avatar Jan 31 '18 02:01 Ai-is-light