
Use own images

Open bylowerik opened this issue 6 years ago • 34 comments

Hi, I am interested in testing your network on my own images.

According to your answers to other questions here, one should use Stacked Hourglass to get 2D predictions. I have done that and saved them in .h5 format.

I have also changed your code a bit so that I do not need the 3D ground truth in order to evaluate the loss.

The code now runs and uses my saved h5-files from Stacked Hourglass. However, the output 3D poses are completely messed up.

I have unNormalized the predicted 3D pose using the training data by calling:

    _, _, data_mean_3d, data_std_3d, dim_to_ignore_3d, dim_to_use_3d, train_root_positions, test_root_positions = \
        data_utils.read_3d_data(actions, FLAGS.data_dir, FLAGS.camera_frame, rcams, FLAGS.predict_14)
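For context, the statistics loaded by that call are later used to invert the z-score normalization of the network output. Here is a minimal sketch of that inversion (my own simplified version, not the repo's exact unNormalizeData, which keys off dim_to_ignore instead):

```python
import numpy as np

def un_normalize(normalized, data_mean, data_std, dim_to_use):
    """Map normalized network output back to the original coordinate scale.

    The network predicts only the dimensions in dim_to_use, so we scatter
    them back into a full-size vector and invert the z-score normalization.
    """
    n = normalized.shape[0]
    full = np.zeros((n, data_mean.shape[0]))
    full[:, dim_to_use] = normalized
    return full * data_std + data_mean

# Toy example: 2 frames, 4 total dims, 2 of which the model predicts.
mean = np.array([10.0, 0.0, 20.0, 0.0])
std = np.array([2.0, 1.0, 4.0, 1.0])
pred = np.array([[1.0, -1.0], [0.5, 0.5]])
restored = un_normalize(pred, mean, std, dim_to_use=[0, 2])
print(restored[0])  # [12.  0. 16.  0.]
```

If the mean/std used here do not match the ones used at training time, the restored coordinates will be systematically off, which is one way to get "messed up" 3D poses.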

I attach images showing what the 2D predictions look like and the resulting 3D pose: figure_1, screenshot_20171220_083110

bylowerik avatar Nov 22 '17 07:11 bylowerik

Hi @bylowerik,

It's hard for me to tell what is wrong from this information alone, but it looks like the joints have a strange permutation. Are you passing the --use_sh flag? That should trigger a reordering of the detections before the 3d predictions are produced, so that it matches the ordering used during training.
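That reordering amounts to an index permutation over the joints. As an illustration with made-up indices (the real SH-to-H3.6M mapping lives in the repo's data_utils, not here):

```python
import numpy as np

# Hypothetical permutation: target joint i is taken from source joint perm[i].
SH_TO_H36M = [3, 2, 1, 0]  # illustrative only, NOT the real mapping

def reorder_joints(poses_2d, perm):
    """poses_2d: (n_frames, n_joints, 2) array in the source joint order."""
    return poses_2d[:, perm, :]

src = np.arange(8, dtype=float).reshape(1, 4, 2)  # one frame, 4 joints
dst = reorder_joints(src, SH_TO_H36M)
print(dst[0, 0])  # joint 0 now holds what was joint 3: [6. 7.]
```

If this permutation is skipped (or applied twice), every joint lands in the wrong slot and the 3D output looks scrambled, which matches the symptom described above.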

una-dinosauria avatar Dec 21 '17 05:12 una-dinosauria

Yes, of course it is difficult but maybe we can figure it out. I do pass the --use_sh flag. There are several stages where it can go wrong, so maybe I should post different parts of the code. However, I am on vacation until New Year so I won't be able to use my computer at work.

bylowerik avatar Dec 23 '17 12:12 bylowerik

Sure, please post an MCVE and we can take it from there. Happy new year!

una-dinosauria avatar Dec 23 '17 17:12 una-dinosauria

Hello, I'm interested in the same question. Is your code available somewhere @bylowerik? I just started playing with the code and have managed to obtain joint predictions from the Stacked Hourglass. However, I don't know how to use the result for the 3D prediction. I would like to take a look at your code and help try to find the problem.

sel3a avatar Dec 23 '17 19:12 sel3a

Hello. Working with own images would be great. @bylowerik, is your code available? I have some ideas to test main application, but I still need help with using my own images. Thank you.

prestal13 avatar Jan 02 '18 14:01 prestal13

Say I have 2D joint locations; how can I get the 3D locations for them? In other words, what should I change in the code to get the 3D locations?

hossam-96 avatar Jan 28 '18 15:01 hossam-96

If you can read Japanese here's a tutorial on how to use this code with OpenPose: http://akasuku.blog.jp/archives/73745862.html

una-dinosauria avatar Feb 08 '18 05:02 una-dinosauria

@una-dinosauria, thanks for the link. I have followed his instructions with a happy outcome. The sandbox is in my forked repo; please check out the output, thx.

ArashHosseini avatar Feb 17 '18 22:02 ArashHosseini

@bylowerik

Can you share the code you used to save the SH predictions as .h5 please?

larrypm avatar Mar 14 '18 11:03 larrypm

Hi,

my results were very strange, so I am not sure you want that?

Regards Erik


bylowerik avatar Mar 14 '18 12:03 bylowerik

Hi Erik,

Sure, I'd still like it anyway as I can modify from there. I am mostly interested in the flow for saving the SH into .h5 files. I'd really appreciate if you can share that.

Best, Larry
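For anyone landing here with the same question, a minimal sketch of writing SH-style 2D predictions to .h5 with h5py might look like this. The 'poses' dataset name and the (n_frames, n_joints, 2) shape are my assumptions from reading the repo's data_utils, so verify them against the loader you use:

```python
import h5py
import numpy as np

def save_sh_predictions(poses_2d, out_path):
    """Save 2D joint predictions in an .h5 layout the baseline can read.

    Assumed layout: a single 'poses' dataset of shape (n_frames, n_joints, 2).
    """
    with h5py.File(out_path, "w") as f:
        f.create_dataset("poses", data=np.asarray(poses_2d, dtype=np.float32))

# Example: one frame of 16 joints.
poses = np.random.rand(1, 16, 2)
save_sh_predictions(poses, "sh_pred.h5")
with h5py.File("sh_pred.h5", "r") as f:
    print(f["poses"].shape)  # (1, 16, 2)
```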

larrypm avatar Mar 19 '18 06:03 larrypm

@prestal13 Hello, I want to work with my own images, but I am not familiar with h5. Have you fixed the problem? If so, may I ask for some help?

guofuzheng avatar Jun 05 '18 01:06 guofuzheng

Hi, sorry. My code is gone since I have changed OS and formatted the hard drive.


bylowerik avatar Jun 05 '18 05:06 bylowerik

@ArashHosseini Great work! I've tried your code, but the output was a little different from what I expected.

I have a question: Openpose now outputs 25 body keypoints, while this paper used a dataset with 16 body keypoints per person, and the keypoint ordering differs between these two methods. How did you solve this mismatch in your work? (Sorry, I didn't read your code very carefully line by line.)

I'll appreciate it if you can explain it to me.

liuyu666-thu avatar Jul 04 '18 07:07 liuyu666-thu

Hey @quinnliu4real thx

but the output was a little different from what I expected.

can you please explain that more precisely? It would help me too. Also, can you tell me where the 25 body keypoints are coming from? I tried to keep the docs explicit ... maybe the docs are out of date!?

ArashHosseini avatar Jul 04 '18 08:07 ArashHosseini

I am not sure about my output so allow me to explain it later.

Let me explain the 25-keypoint thing. When I call the openpose API, the BODY_25 model is used, so the output JSON looks like: {"people": [{"pose_keypoints_2d": [...LIST...]}]}. That LIST contains 75 numbers, each group of 3 representing one keypoint, so 25 keypoints in total.

Am I right? If I misunderstand something, please tell me.

I am referring to the same page you linked, the 'Keypoint Ordering' section above the keypoint pictures.
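The layout described above can be parsed with a few lines. The JSON string here is a hand-made two-keypoint miniature, not real OpenPose output:

```python
import json

def parse_openpose_frame(json_str):
    """Group OpenPose's flat pose_keypoints_2d list into (x, y, conf) triples."""
    frame = json.loads(json_str)
    people = []
    for person in frame["people"]:
        kp = person["pose_keypoints_2d"]
        # Every 3 consecutive numbers are one keypoint: x, y, confidence.
        people.append([(kp[i], kp[i + 1], kp[i + 2]) for i in range(0, len(kp), 3)])
    return people

# Minimal 2-keypoint example (a real BODY_25 frame has 25, i.e. 75 numbers).
raw = '{"people": [{"pose_keypoints_2d": [10.0, 20.0, 0.9, 30.0, 40.0, 0.8]}]}'
people = parse_openpose_frame(raw)
print(people[0])  # [(10.0, 20.0, 0.9), (30.0, 40.0, 0.8)]
```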

liuyu666-thu avatar Jul 04 '18 10:07 liuyu666-thu

@quinnliu4real I see, thx. Unfortunately, so far I have only used the COCO_18 model; I do not think the BODY_25 model is suitable here. It has joints that are not significant for this task, like "REye" or "REar", etc.
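Dropping the extra BODY_25 joints to get a COCO-18-style set can be sketched as follows. The index mapping is my assumption based on the OpenPose keypoint-ordering docs, so double-check it against your OpenPose version:

```python
# Assumed BODY_25 -> COCO-18 index mapping (verify against your OpenPose docs).
# COCO-18 has no MidHip (BODY_25 index 8) and no foot keypoints (19-24).
BODY25_TO_COCO18 = [0, 1, 2, 3, 4, 5, 6, 7,   # nose, neck, right/left arm
                    9, 10, 11, 12, 13, 14,    # right/left leg (skip MidHip = 8)
                    15, 16, 17, 18]           # eyes and ears

def body25_to_coco18(keypoints_25):
    """keypoints_25: list of 25 (x, y, confidence) tuples."""
    return [keypoints_25[i] for i in BODY25_TO_COCO18]

pts = [(float(i), float(i), 1.0) for i in range(25)]
mapped = body25_to_coco18(pts)
print(len(mapped))  # 18
```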

ArashHosseini avatar Jul 04 '18 14:07 ArashHosseini

@ArashHosseini Thanks for your comment. I think that is probably the reason why my output skeleton faces left while actually the man in my video faces right.

I'll give a try on COCO_18 model. Hope it is not too complicated to change a model.

liuyu666-thu avatar Jul 04 '18 14:07 liuyu666-thu

@quinnliu4real sounds good, please let me know how it ends. thx

ArashHosseini avatar Jul 04 '18 14:07 ArashHosseini

@ArashHosseini Hey dude, I am here to tell you I achieved a pretty good result with the COCO_18 model!

However, there is still one thing I'm not satisfied with: pose_frame_000000000003. This skeleton seems to be lowering its head while the man in the video is actually standing upright. I guess there is still a mismatch between the keypoint ordering we get from openpose and the ordering of the keypoints in the dataset this paper used. I'm going to check the Stacked Hourglass paper.

Anyway thanks again, my work can go on thanks to you.

liuyu666-thu avatar Jul 05 '18 08:07 liuyu666-thu

@quinnliu4real, that's great dude, if you have further questions or suggestions, please let me know

ArashHosseini avatar Jul 05 '18 10:07 ArashHosseini

@ArashHosseini

Thanks for your helpful work. I have one question.

I do not understand lines 279-282 of openpose_3dpose_sandbox.py of your code.

    enc_in = enc_in[:, dim_to_use_2d]
    mu = data_mean_2d[dim_to_use_2d]
    stddev = data_std_2d[dim_to_use_2d]
    enc_in = np.divide((enc_in - mu), stddev)

The input of the baseline network is enc_in, and this input tensor should be standard-normalized.

Actually, in order to do standard normalization, we should use statistics (i.e., mean, std) obtained from images similar to our query image (which can have a different resolution from the Human3.6M dataset images).

However, the code uses statistics (i.e., mean, std) obtained from the 2D points extracted from the Human3.6M dataset by the stacked hourglass model.

In my experiment, the predicted 3D pose and its visualization are weird.

I think this is because of the Human3.6M statistics that were used.
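One workaround people try (a sketch, not the repo's code) is to rescale their 2D coordinates into the Human3.6M image resolution before applying the H3.6M statistics. The 1000x1002 figure below is the usual H3.6M image size, but verify it for your camera:

```python
import numpy as np

def normalize(enc_in, mu, stddev):
    """z-score normalization, as in the sandbox snippet above."""
    return np.divide(enc_in - mu, stddev)

def rescale_to_h36m(points_2d, src_w, src_h, dst_w=1000, dst_h=1002):
    """Rescale flattened (x0, y0, x1, y1, ...) rows from the source image
    resolution into the assumed Human3.6M resolution."""
    scaled = points_2d.copy()
    scaled[:, 0::2] *= dst_w / src_w   # x coordinates
    scaled[:, 1::2] *= dst_h / src_h   # y coordinates
    return scaled

pts = np.array([[640.0, 360.0]])        # one joint detected on a 1280x720 frame
print(rescale_to_h36m(pts, 1280, 720))  # [[500. 501.]]
```

After rescaling, the H3.6M mean/std are at least in the right coordinate range, though a true fix would be recomputing statistics on data like yours.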

Would you give me any suggestion or comment?

Thanks

csehong avatar Jul 24 '18 11:07 csehong

@ArashHosseini Hi, thanks for your code. But when I used it, I wondered whether there are some errors. (1) In the smoothing part https://github.com/ArashHosseini/3d-pose-baseline/blob/9fbdc7ed3aa9a870fb6a77688bd016372c0c87e0/src/openpose_3dpose_sandbox.py#L145-L155 why do we use smoothed rather than cache? smoothed is still blank at that point, right?

(2) https://github.com/ArashHosseini/3d-pose-baseline/blob/9fbdc7ed3aa9a870fb6a77688bd016372c0c87e0/src/openpose_3dpose_sandbox.py#L320-L321 what is before_pose here? 'before_pose' gets referenced before assignment.

KevinQian97 avatar Sep 08 '18 22:09 KevinQian97

@KevinQian97 @ArashHosseini for (2), I think the assumption is that the first pose is always well-recovered. before_pose is the last predicted 3d pose. Essentially, if the predicted pose looks way off, we can just reuse the last predicted 3d pose as an estimate for the current frame, assuming some temporal consistency.
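That fallback can be sketched like this (a simplified toy version; max_jump is an arbitrary threshold, not the sandbox's actual criterion):

```python
import numpy as np

def filter_pose(pred_3d, before_pose, max_jump=100.0):
    """Reuse the previous frame's 3D pose when the new one jumps too far.

    before_pose is None on the first frame, so the first prediction is
    always accepted (matching the assumption described above).
    """
    if before_pose is None:
        return pred_3d
    if np.max(np.abs(pred_3d - before_pose)) > max_jump:
        return before_pose  # temporal-consistency fallback
    return pred_3d

prev = np.zeros(3)
good = np.array([10.0, 5.0, -3.0])
bad = np.array([500.0, 0.0, 0.0])
print(filter_pose(good, prev))  # [10.  5. -3.]
print(filter_pose(bad, prev))   # [0. 0. 0.]
```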

ajdroid avatar Sep 12 '18 00:09 ajdroid

@ArashHosseini Thanks for your comment. I think that is probably the reason why my output skeleton faces left while actually the man in my video faces right.

I'll give a try on COCO_18 model. Hope it is not too complicated to change a model.

Hello, sir. Could you please tell me how to change the model?

LiangZhengIQka avatar Mar 04 '19 14:03 LiangZhengIQka

@LiangZhengIQka, you don't have to do anything; it will map the data automatically: https://github.com/ArashHosseini/3d-pose-baseline/blob/96a500b472a55f8915f4d26e200fdef6e65fc179/src/openpose_3dpose_sandbox.py#L74

ArashHosseini avatar Mar 04 '19 15:03 ArashHosseini

Hello, sir. Could you tell me how to validate 3d-pose-baseline using the datasets from human3.6m? I compared the coordinates from the human3.6m ground truth with the result generated by 3d-pose-baseline, and they are totally different.

Best regards, Zheng Liang


LiangZhengIQka avatar Mar 17 '19 14:03 LiangZhengIQka

Hi, @ArashHosseini @liuyu666-thu I wanted to test my own videos with both openpose and the stacked hourglass network, if possible, on videos stored locally on my PC. Can you guys guide me through it? I am quite new to this topic.

ns3new avatar Mar 21 '19 18:03 ns3new

Hi, @ArashHosseini @liuyu666-thu I wanted to test my own videos with both openpose and the stacked hourglass network, if possible, on videos stored locally on my PC. Can you guys guide me through it? I am quite new to this topic.

You can try Arash's great repo https://github.com/ArashHosseini/3d-pose-baseline. First use openpose to get 2D keypoints and write the results out in JSON format, then run Arash's code. For a detailed guide, you may refer to his README.

liuyu666-thu avatar Mar 22 '19 01:03 liuyu666-thu

@una-dinosauria, thanks for the link. I have followed his instructions with a happy outcome. The sandbox is in my forked repo; please check out the output, thx.

Thank you so much for this implementation!

In estimator.py (inside the tf-pose folder), what is this code for? Why take x and y, multiply them by the image width and height, then add 0.5? Is this some sort of necessary preprocessing step before feeding 3d pose baseline? If so, why this particular operation?

Thank you!

    body_part = human.body_parts[i]
    center = (int(body_part.x * image_w + 0.5), int(body_part.y * image_h + 0.5))
    centers[i] = center
    # add x
    flat[i*2] = center[0]
    # add y
    flat[i*2+1] = center[1]
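For what it's worth: as far as I can tell, tf-pose-estimation stores body_part.x and body_part.y normalized to [0, 1], so multiplying by the image width/height maps them back to pixel coordinates, and adding 0.5 before int() rounds to the nearest pixel instead of truncating toward zero:

```python
# int() truncates, so int(v + 0.5) is the classic round-to-nearest trick
# for non-negative v.
image_w, image_h = 640, 480
x_norm, y_norm = 0.4993, 0.7505   # example normalized coordinates

center = (int(x_norm * image_w + 0.5), int(y_norm * image_h + 0.5))
print(center)                      # (320, 360)

# Without the +0.5, 0.4993 * 640 = 319.552 would truncate down to 319.
print(int(x_norm * image_w))       # 319
```

So it is not specific to 3d-pose-baseline; it just converts normalized detections to integer pixel positions as accurately as possible.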

ironllamagirl avatar Apr 24 '19 10:04 ironllamagirl