
How to process the output of pose detection and pose landmark models in a standalone C++ project

Open UtsaChattopadhyay opened this issue 2 years ago • 1 comments

I am building a standalone C++ project for pose detection. The project uses two of the MediaPipe TFLite models (pose_detection and pose_landmark); their output dimensions are attached below.

The pose detection model has two outputs, (2254, 12) and (2254, 1). What do these values correspond to, and how should they be post-processed? The MediaPipe webpage says that the output of the pose detector is similar to that of the face detector, plus human body center, radius, and rotation.

The pose landmark model has 5 outputs: (1,195), (1,1), (1,256,256,1), (1,64,54,39), and (1,117). As we understand it, (1,1) is a classifier and (1,256,256,1) is a segmentation mask, but the other 3 outputs are not clear. Here, it says that the pose landmark model detects 33 landmarks in pixel and world space, where each landmark has 4 values (x, y, z, visibility). I assumed the shape would correspond to a total of 4 (x, y, z, visibility) * 33 (landmarks) * 2 (pixel and world space).

Can you please let me know how to make sense of these two model outputs, and what post-processing they need?

Pose Detection Output

pose_detection

Pose Landmark Output

pose_landmark
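
For the (2254, 12) and (2254, 1) outputs, one plausible reading is an SSD-style decode: each of the 2254 rows belongs to one anchor, the 12 values are a box (center x, center y, width, height) plus 4 keypoints with (x, y) each, and the single value in the second tensor is a raw score. The sketch below decodes one row under those assumptions. The 4 + 4*2 layout, the 224 input size, the sigmoid on the score, and all type and function names are assumptions for illustration, not something confirmed in this thread.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>

// All names here are hypothetical; the 4 + 4*2 split of the 12 values is assumed.
struct Anchor { float cx, cy; };            // anchor center, normalized to [0, 1]
struct Point2 { float x, y; };
struct Detection {
    float score;                             // sigmoid of the raw (2254, 1) value
    float cx, cy, w, h;                      // box, normalized to [0, 1]
    std::array<Point2, 4> keypoints;         // normalized to [0, 1]
};

constexpr float kDetectorInputSize = 224.0f; // assumed model input resolution
// Arithmetic check only: 28*28*2 + 14*14*2 + 7*7*6 = 2254, which is what a 224
// input with strides 8, 16, 32 would give. Treat this as a hint, not a fact.

inline float ScoreSigmoid(float v) { return 1.0f / (1.0f + std::exp(-v)); }

// row points at the 12 raw values of one anchor; raw_score is the matching
// value from the (2254, 1) tensor.
Detection DecodeRow(const float* row, float raw_score, const Anchor& a) {
    assert(row != nullptr);
    Detection d{};
    d.score = ScoreSigmoid(raw_score);
    // Offsets are assumed to be in input pixels, relative to the anchor center.
    d.cx = row[0] / kDetectorInputSize + a.cx;
    d.cy = row[1] / kDetectorInputSize + a.cy;
    d.w = row[2] / kDetectorInputSize;
    d.h = row[3] / kDetectorInputSize;
    for (std::size_t k = 0; k < 4; ++k) {
        d.keypoints[k].x = row[4 + 2 * k] / kDetectorInputSize + a.cx;
        d.keypoints[k].y = row[4 + 2 * k + 1] / kDetectorInputSize + a.cy;
    }
    return d;
}
```

A full pipeline would still need to generate the anchor centers, drop rows below a score threshold, and run non-maximum suppression over the remaining boxes; none of that is shown here.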

UtsaChattopadhyay avatar Jul 20 '22 03:07 UtsaChattopadhyay

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you.

google-ml-butler[bot] avatar Aug 09 '22 14:08 google-ml-butler[bot]

Closing as stale. Please reopen if you'd like to work on this further.

google-ml-butler[bot] avatar Aug 16 '22 14:08 google-ml-butler[bot]

Closing as stale. Please reopen if you'd like to work on this further.

google-ml-butler[bot] avatar Aug 27 '22 14:08 google-ml-butler[bot]

Hi @UtsaChattopadhyay , did you figure this out?

JesperStenberg avatar Sep 22 '22 19:09 JesperStenberg

Yes we did

UtsaChattopadhyay avatar Sep 23 '22 06:09 UtsaChattopadhyay

Do you mind sharing what you found for the pose_landmark model? We get a [1, 195] output from "Identity". I've come to understand that this represents 39 arrays of [x, y, z, visibility, presence] (33 points on the body + 6 extra points for next-frame tracking).
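
A minimal sketch of that layout, splitting the 195 floats into 39 landmarks of 5 values each. Two things here are assumptions rather than facts from this thread: that x, y and z are expressed in pixels of a 256x256 input (guessed from the 256x256 segmentation output), and that visibility and presence are raw logits that need a sigmoid. The names `Landmark`, `ParseLandmarks` and `kLandmarkInputSize` are made up for the example.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

struct Landmark { float x, y, z, visibility, presence; };

constexpr std::size_t kNumLandmarks = 39;      // 33 body + 6 auxiliary points
constexpr std::size_t kValuesPerLandmark = 5;  // x, y, z, visibility, presence
constexpr float kLandmarkInputSize = 256.0f;   // assumed model input resolution

inline float Logistic(float v) { return 1.0f / (1.0f + std::exp(-v)); }

// raw is the flattened [1, 195] "Identity" tensor.
std::vector<Landmark> ParseLandmarks(const std::vector<float>& raw) {
    assert(raw.size() == kNumLandmarks * kValuesPerLandmark);
    std::vector<Landmark> out;
    out.reserve(kNumLandmarks);
    for (std::size_t i = 0; i < kNumLandmarks; ++i) {
        const float* v = raw.data() + i * kValuesPerLandmark;
        Landmark lm{};
        // Assumed: coordinates are in input pixels, so divide to get [0, 1].
        lm.x = v[0] / kLandmarkInputSize;
        lm.y = v[1] / kLandmarkInputSize;
        lm.z = v[2] / kLandmarkInputSize;
        // Assumed: these two are logits, mapped to (0, 1) with a sigmoid.
        lm.visibility = Logistic(v[3]);
        lm.presence = Logistic(v[4]);
        out.push_back(lm);
    }
    return out;
}
```

By the same arithmetic, the (1,117) output would fit 39 * 3 values, which may be x, y, z in world space, but that is a guess.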

This works very well when the input scene is easy: x and y track the image perfectly. But if the person is partially off screen, the tracking fails completely, which doesn't happen in the MediaPipe examples.

Do you have any insights?

JesperStenberg avatar Sep 23 '22 10:09 JesperStenberg

@JesperStenberg or @UtsaChattopadhyay , did one of you figure it out for the pose_landmark model? Where did you find the information about 33 + 6 landmarks? Are those 6 at the end or the beginning of the array?

EinePriseCode avatar Jan 21 '23 18:01 EinePriseCode

It was a while ago and I don't have it available, but I'm pretty sure that those 6 are at the end of the array. The thing that threw me off was that the person needs to be centred in the image for the model to work.
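
One way to keep the person centred is to crop a square region around the body before running the landmark model. The sketch below builds such a region from a body-center point and a second point on the enclosing circle, which is how the "center, radius, and rotation" description earlier in this issue could be read. The choice of inputs, the 1.25 margin, and the names `Roi` and `RoiFromKeypoints` are all assumptions for illustration.

```cpp
#include <cassert>
#include <cmath>

// Square crop region in image pixels; rotation in radians.
struct Roi { float cx, cy, size, rotation; };

// (cx, cy): assumed body center. (px, py): assumed point on the enclosing
// circle. margin enlarges the crop a little; 1.25 is an illustrative value.
Roi RoiFromKeypoints(float cx, float cy, float px, float py,
                     float margin = 1.25f) {
    const float kPi = 3.14159265358979f;
    const float dx = px - cx;
    const float dy = py - cy;
    const float radius = std::hypot(dx, dy);
    assert(radius > 0.0f);
    // Rotation that would bring the center-to-point vector to "straight up".
    // Image y grows downward, hence the sign flip on dy.
    float rot = kPi / 2.0f - std::atan2(-dy, dx);
    // Wrap into [-pi, pi).
    rot -= 2.0f * kPi * std::floor((rot + kPi) / (2.0f * kPi));
    return Roi{cx, cy, 2.0f * radius * margin, rot};
}
```

The crop described by the returned `Roi` would then be rotated, resized to the landmark model's input size, and the predicted landmarks mapped back through the inverse transform.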

If you haven't checked this link it might have some good info.

JesperStenberg avatar Jan 21 '23 19:01 JesperStenberg

Thanks @JesperStenberg, that was an important hint. Unfortunately, I can't find any doc that explains the output in more detail, which makes the implementation harder and less clean.

EinePriseCode avatar Jan 21 '23 20:01 EinePriseCode

If we have a high enough frame rate, could we find the acceleration and velocity of not only the center but also the limbs? Random 3 am idea; hoping to get some feedback, and a minimum frame rate for usable results.
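
In principle this is just finite differencing of each landmark's position over time. A minimal sketch, assuming one landmark's positions are stored per frame at a constant frame rate; `Vec2`, `VelocityAt` and `AccelerationAt` are hypothetical names. It does not answer the minimum frame rate question.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Vec2 { float x, y; };

// Central-difference velocity at frame i, in position units per second.
// p holds one landmark's position per frame; fps is assumed constant.
Vec2 VelocityAt(const std::vector<Vec2>& p, std::size_t i, float fps) {
    assert(i >= 1 && i + 1 < p.size());
    const float k = fps / 2.0f;
    return Vec2{(p[i + 1].x - p[i - 1].x) * k,
                (p[i + 1].y - p[i - 1].y) * k};
}

// Second-difference acceleration at frame i, in position units per second^2.
Vec2 AccelerationAt(const std::vector<Vec2>& p, std::size_t i, float fps) {
    assert(i >= 1 && i + 1 < p.size());
    const float k = fps * fps;
    return Vec2{(p[i + 1].x - 2.0f * p[i].x + p[i - 1].x) * k,
                (p[i + 1].y - 2.0f * p[i].y + p[i - 1].y) * k};
}
```

Differencing amplifies per-frame landmark jitter, and the second difference amplifies it more than the first, so the positions usually need smoothing before the acceleration is usable.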

Lakshyadevelops avatar Feb 09 '23 21:02 Lakshyadevelops

Hello @UtsaChattopadhyay, we are upgrading the MediaPipe Legacy Solutions to the new MediaPipe solutions. However, the libraries, documentation, and source code for all the MediaPipe Legacy Solutions will continue to be available in our GitHub repository and through library distribution services, such as Maven and NPM.

You can continue to use those legacy solutions in your applications if you choose. However, we would ask you to check out the new MediaPipe solutions, which can help you more easily build and customize ML solutions for your applications. These new solutions will provide a superset of the capabilities available in the legacy solutions. Thank you.

kuaashish avatar May 05 '23 10:05 kuaashish

This issue has been marked stale because it has no recent activity since 7 days. It will be closed if no further activity occurs. Thank you.

github-actions[bot] avatar May 13 '23 01:05 github-actions[bot]

This issue was closed due to lack of activity after being marked stale for past 7 days.

github-actions[bot] avatar May 20 '23 01:05 github-actions[bot]
