HumanDepth: Difference from the previous paper?

Hello. Thank you for your great work!!

Compared to the ICCV 2019 paper by Moon et al., did you only replace the RootNet part with your approach, HDNet, which estimates (Xr, Yr, Zr)?

Are the detection part (Mask R-CNN), the root-relative 3D pose estimation (PoseNet), and the 3D pose visualization all the same as in the ICCV 2019 paper?

Aug 26 '20 16:08 YangJae96

@YangJae96,

Yes. The bounding box detector and the root-relative 3D pose estimator are the same as in the ICCV paper.

Aug 27 '20 04:08 jiahaoLjh

Thanks!!

Is there demo code to obtain roots from my custom images?

I want to see the roots like this! Should I change the code?!

Aug 27 '20 13:08 YangJae96

@YangJae96,

For inference on custom images, you will have to edit the data loader in data/dataset.py so that it provides both the image and the bounding box. See #2.

Both the 2D pose and the root joint depth are produced as outputs of the model (https://github.com/jiahaoLjh/HumanDepth/blob/fba1c6669d09418b1a4bd648a9f4021821ca4037/test.py#L99-L100), which you may consider visualizing with your own code.

Aug 28 '20 03:08 jiahaoLjh

@jiahaoLjh ,

Sorry, I don't understand why dataset.py needs to be changed.

Isn't dataset.py only for Human36M preprocessing? If I want to run inference on my own image (where I already have the human bounding boxes from Detectron2), could I just feed my image and bounding box into the model and get the 2D joints and the root joint depth as outputs?

I checked the model input part, but it is difficult to build these inputs in general.

Is bbox_masks the bounding box coordinates (x, y) of one person in an image?
I can see that coord_map is for "Normalized image coordinates with focal length fx, fy divided from original image coordinates". But if I don't know the focal length of an image, is there a way to build the input to the model?

Thanks in advance!!

Aug 30 '20 01:08 YangJae96

@YangJae96

dataset.py prepares the input data for the model. You could edit this file to replace the Human36M data with your own image samples. To do that, you need to provide both the image and one bounding box at a time (in the multi-person case, one sample per person). bbox_masks is simply a binary mask indicating the region of a bounding box.
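As a rough illustration of a binary bounding-box mask, here is a minimal NumPy sketch. It is not the code from dataset.py: the function name `make_bbox_mask`, the `(x_min, y_min, x_max, y_max)` pixel box format, and the float32 dtype are all assumptions made for this example, so check dataset.py for the layout the model actually expects.

```python
import numpy as np


def make_bbox_mask(height, width, bbox):
    """Return an (height, width) mask that is 1 inside the box and 0 elsewhere.

    `bbox` is assumed to be (x_min, y_min, x_max, y_max) in pixels; this
    format is an assumption for illustration, not taken from the repository.
    """
    x_min, y_min, x_max, y_max = bbox
    # Clamp the box to the image so the slice below never goes out of range.
    x0 = int(np.clip(np.floor(x_min), 0, width))
    y0 = int(np.clip(np.floor(y_min), 0, height))
    x1 = int(np.clip(np.ceil(x_max), 0, width))
    y1 = int(np.clip(np.ceil(y_max), 0, height))

    mask = np.zeros((height, width), dtype=np.float32)
    mask[y0:y1, x0:x1] = 1.0
    return mask


# One mask per detected person, e.g. one per box from a detector.
mask = make_bbox_mask(4, 6, (1, 1, 4, 3))
```

With several people in one image, you would call this once per bounding box and run the model once per resulting sample.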

If you don't know the focal length, you can simply set a reasonable value yourself. The coord_map is taken care of by dataset.py, so you don't have to prepare it yourself.
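To make the role of the focal length concrete, here is a hedged sketch of how a map of normalized image coordinates could be built from a guessed focal length. It is not the implementation in dataset.py: the function name `make_coord_map`, the channel order, and in particular the subtraction of a principal point (defaulting to the image center, as in the usual pinhole convention) are assumptions for illustration.

```python
import numpy as np


def make_coord_map(height, width, fx, fy, cx=None, cy=None):
    """Return a (2, height, width) array of normalized pixel coordinates.

    Assumptions (not confirmed by the repository): coordinates are centered
    on a principal point (cx, cy), which defaults to the image center, and
    are then divided by the focal lengths fx and fy given in pixels.
    """
    if cx is None:
        cx = width / 2.0
    if cy is None:
        cy = height / 2.0

    # u runs along image columns (x), v along image rows (y).
    u, v = np.meshgrid(
        np.arange(width, dtype=np.float32),
        np.arange(height, dtype=np.float32),
    )
    x_norm = (u - cx) / fx
    y_norm = (v - cy) / fy
    return np.stack([x_norm, y_norm], axis=0)


# The focal length here is an arbitrary value chosen only for the example.
coord_map = make_coord_map(4, 6, fx=2.0, fy=2.0)
```

Because the coordinates are divided by the focal length, the guessed value changes the scale of this map, so the estimated root depth should be read relative to that guess.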

Sep 07 '20 03:09 jiahaoLjh
