deep-high-resolution-net.pytorch

How to get person detection?

Open FrancescoPiemontese opened this issue 5 years ago • 15 comments

First of all, thank you for your excellent work. I have a question regarding person detection. In your paper it is mentioned that you use a person detector before feeding its output to HRNet. Am I supposed to download this separately and then feed its output to HRNet? If so, what do the dataloaders in train.py and test.py do? Could you tell me which person detector was used?

FrancescoPiemontese avatar Apr 15 '19 11:04 FrancescoPiemontese

I think the author used the detection information from the dataset (the 'center' and 'scale' fields in the MPII dataset JSON file).

njustczr avatar Apr 16 '19 03:04 njustczr
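
For readers wondering how a detector's box gets consumed by a top-down model like this: the box is typically converted to the center/scale pair mentioned above before cropping. A minimal sketch of that conversion, modeled on what the repo's dataset code does (the function name, the pixel_std of 200, and the 1.25 padding factor are assumptions here; check lib/dataset for the exact version):

```python
import numpy as np

def box_to_center_scale(box, aspect_ratio=192.0 / 256.0, pixel_std=200.0):
    """Convert an (x, y, w, h) person box into the center/scale pair
    that top-down pose estimators such as HRNet expect.

    aspect_ratio is input_width / input_height (e.g. 192x256);
    pixel_std = 200 follows the MPII/COCO convention used in this repo.
    """
    x, y, w, h = box
    center = np.array([x + w * 0.5, y + h * 0.5], dtype=np.float32)

    # Pad the box so it matches the network's input aspect ratio.
    if w > aspect_ratio * h:
        h = w / aspect_ratio
    elif w < aspect_ratio * h:
        w = h * aspect_ratio

    scale = np.array([w / pixel_std, h / pixel_std], dtype=np.float32)
    scale *= 1.25  # enlarge slightly so the whole person stays in the crop
    return center, scale
```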

@FrancescoPiemontese maybe you can refer to my hrnet repo, where I integrated YOLO human detection.

lxy5513 avatar Apr 18 '19 02:04 lxy5513

Thank you! I will try

FrancescoPiemontese avatar Apr 18 '19 10:04 FrancescoPiemontese

@lxy5513, would you consider making a PR to this repo?

leoxiaobin avatar Apr 19 '19 06:04 leoxiaobin

@leoxiaobin yes, soon. I will add several human detectors, such as R-FCN and RetinaNet, then open a PR along with a speed comparison.

lxy5513 avatar Apr 19 '19 06:04 lxy5513

@leoxiaobin I plan to implement this tracking following the description in your Simple Baselines paper:

For processing frames in videos, the boxes from a human detector and boxes generated by propagating joints from previous frames using optical flow are unified using a bounding box Non-Maximum Suppression (NMS) operation.

I have two groups of boxes, but I don't know how to do the NMS, because the boxes generated by FlowNet2-S have no confidence scores. Can I simply reuse the scores of the corresponding boxes from the previous frame? Could you advise me on this problem? Thank you in advance.

lxy5513 avatar Apr 19 '19 06:04 lxy5513

We actually use the OKS score for NMS.

leoxiaobin avatar Apr 19 '19 15:04 leoxiaobin
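
For readers trying to reproduce this, a rough sketch of greedy NMS driven by Object Keypoint Similarity (OKS) is shown below. This is not the repository's exact code (it ships its own implementation), and the sigmas, threshold, and helper names here are illustrative assumptions:

```python
import numpy as np

def compute_oks(g, d, area, sigmas):
    """OKS between a reference pose g and a candidate pose d,
    each shaped (K, 3) as (x, y, visibility/score)."""
    vars_ = (sigmas * 2) ** 2
    dx = d[:, 0] - g[:, 0]
    dy = d[:, 1] - g[:, 1]
    e = (dx ** 2 + dy ** 2) / vars_ / (area + np.spacing(1)) / 2.0
    mask = g[:, 2] > 0
    return np.exp(-e)[mask].mean() if mask.any() else 0.0

def oks_nms(poses, scores, areas, sigmas, thresh=0.9):
    """Greedy NMS: keep the highest-scoring pose, drop poses whose OKS
    with it exceeds `thresh`, repeat on the remainder."""
    order = np.argsort(scores)[::-1]
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(i)
        oks = np.array([compute_oks(poses[i], poses[j], areas[i], sigmas)
                        for j in order[1:]])
        order = order[1:][oks <= thresh]
    return keep
```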

Thanks

lxy5513 avatar Apr 23 '19 01:04 lxy5513

@leoxiaobin Hi, I made a PR for YOLOv3 + HRNet, however something is weird. I tested it in two ways.


ONE: I get dt_boxes from YOLO, then run python tools/test.py TEST.USE_GT_BBOX False TEST.FLIP_TEST False with OKS NMS removed, and get the following result:

| Arch | AP | AP .5 | AP .75 | AP (M) | AP (L) | AR | AR .5 | AR .75 | AR (M) | AR (L) |
|---|---|---|---|---|---|---|---|---|---|---|
| pose_hrnet | 0.702 | 0.859 | 0.770 | 0.653 | 0.779 | 0.736 | 0.878 | 0.794 | 0.683 | 0.813 |

TWO: I run the two models end-to-end (the same models as in ONE), get the keypoints, save them to a JSON file, and finally evaluate with the official cocoEval.evaluate(), with the following result:

| Arch | AP | AP .5 | AP .75 | AP (M) | AP (L) | AR | AR .5 | AR .75 | AR (M) | AR (L) |
|---|---|---|---|---|---|---|---|---|---|---|
| hrnet | 0.594 | 0.811 | 0.656 | 0.564 | 0.651 | 0.647 | 0.834 | 0.704 | 0.601 | 0.713 |

Could you please tell me why the two results are so different? Thank you in advance.

lxy5513 avatar Apr 25 '19 11:04 lxy5513
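
For comparison, evaluating a saved keypoints JSON with the official COCO API looks roughly like this (the file paths are placeholders; the annotation file must match the validation split that was run):

```python
from pycocotools.coco import COCO
from pycocotools.cocoeval import COCOeval

# Ground-truth annotations and the detections JSON produced by the pose model.
coco_gt = COCO('annotations/person_keypoints_val2017.json')  # placeholder path
coco_dt = coco_gt.loadRes('hrnet_keypoint_results.json')     # placeholder path

coco_eval = COCOeval(coco_gt, coco_dt, iouType='keypoints')
coco_eval.evaluate()
coco_eval.accumulate()
coco_eval.summarize()  # prints an AP/AR table like the ones above
```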

This is my script that generates the keypoints JSON file: https://github.com/lxy5513/hrnet/blob/master/tools/eval.py. By the way, my YOLOv3 threshold is 0.1.

lxy5513 avatar Apr 25 '19 11:04 lxy5513

I had a very quick look through your code and have two questions.

  1. It seems that you do not convert the image channels to RGB. OpenCV reads images as BGR, while our models are trained on RGB, so you first need to convert your image data to RGB, as in line 131 of https://github.com/leoxiaobin/deep-high-resolution-net.pytorch/blob/master/lib/dataset/JointsDataset.py#L131.

  2. Are the thresholds the same for both methods?

leoxiaobin avatar Apr 25 '19 16:04 leoxiaobin
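
The channel conversion being described is a one-liner with OpenCV; where exactly it belongs depends on the preprocessing pipeline, and the file name below is a placeholder:

```python
import cv2

# OpenCV loads images as BGR; the HRNet models are trained on RGB input,
# so convert before normalization and the forward pass.
image_bgr = cv2.imread('person_crop.jpg')  # placeholder path
image_rgb = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2RGB)
```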

I greatly appreciate your attention. This is my channel-conversion code: https://github.com/lxy5513/hrnet/blob/master/tools/eval.py#L142.

This is the relevant threshold code; it is the same for both methods: https://github.com/lxy5513/hrnet/blob/master/tools/eval.py#L159

lxy5513 avatar Apr 26 '19 01:04 lxy5513

By the way, when I use YOLOv3 with the Simple Baselines pose model to test the PR, the result looks normal:

| Arch | AP | AP .5 | AP .75 | AP (M) | AP (L) | AR | AR .5 | AR .75 | AR (M) | AR (L) |
|---|---|---|---|---|---|---|---|---|---|---|
| Simple-baseline | 0.648 | 0.856 | 0.708 | 0.617 | 0.706 | 0.697 | 0.880 | 0.750 | 0.652 | 0.763 |

lxy5513 avatar Apr 26 '19 09:04 lxy5513

I would say this issue can be closed now that #161 has been merged.

alex9311 avatar Feb 03 '20 21:02 alex9311

> (quoting the YOLOv3 + HRNet results from lxy5513's comment above: pose_hrnet 0.702 AP with detected boxes)

I also get 0.702, but the reported result for w32_256x192 is 0.744. Why? I just ran the provided code with the trained model pose_hrnet_w32_256x192.pth. Can you help me?

zhanghao5201 avatar Mar 24 '21 14:03 zhanghao5201