mediapipe Pose accuracy

I'm using the example iOS app (Target name: PoseTrackingGPU) in the mediapipe repo. iOS version 15.5. Code is running on an iPhone 12 Pro Max. Mediapipe version is 0.8.10.

Overall, this framework is incredible. However, I'm observing that in certain body positions, such as at the top of a golf swing (see attached screenshots), the shoulder positions are not correctly identified in 2D, which in turn makes the 3D measurements incorrect. I suspect the root cause is that one of the shoulder joints isn't fully visible, so the pose detection model is guessing where it might be. For example, in the first screenshot the two shoulder joints are correctly identified (see the white line connecting the two predicted shoulder positions), and all is good. In screenshots 2 and 3, I have drawn a dotted red line to indicate where the line connecting the shoulders should be. Notice how the solid white lines, i.e., the predictions, are off by quite a bit.

I'm looking for advice from the Google team and this community on how to improve accuracy in situations where some joints may not be fully visible for a short period of time. Is there a way to train or extend the underlying model with additional images? Any other tips or ideas on how we might improve the accuracy?

Many thanks.

Jun 15 '22 20:06 nrkrishna

Hi @nrkrishna, we recently updated the released model to produce an additional 3D world-landmarks output with its origin at the center of the hips. More details can be found in the API documentation: https://google.github.io/mediapipe/solutions/pose#pose_world_landmarks and in the blog post: https://blog.tensorflow.org/2021/08/3d-pose-detection-with-mediapipe-blazepose-ghum-tfjs.html

If you want to calculate relative depth between landmarks, this output should be enough for you. But if you want to calculate distance to the camera, extra work needs to be done. Having both 2D and 3D landmarks and knowing camera parameters you can apply some optimization (e.g. Procrustes analysis) to determine translation in 3D space.
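As a rough illustration of the translation step, here is a minimal Python sketch. It is not MediaPipe code and it is not Procrustes analysis: it swaps in a plain linear least-squares fit, which is enough when only the translation is unknown. It assumes an undistorted pinhole camera with known intrinsics (`fx`, `fy`, `cx`, `cy`), world landmarks in metres whose axes already line up with the camera frame, and 2D landmarks already converted from normalized coordinates to pixels. The function names `solve_3x3` and `estimate_translation` are hypothetical.

```python
from typing import List, Sequence, Tuple

Vec3 = Tuple[float, float, float]
Vec2 = Tuple[float, float]


def solve_3x3(a: Sequence[Sequence[float]], b: Sequence[float]) -> Vec3:
    """Solve a 3x3 linear system with Gauss-Jordan elimination and partial pivoting."""
    m: List[List[float]] = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("degenerate landmark configuration")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(3):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return (m[0][3] / m[0][0], m[1][3] / m[1][1], m[2][3] / m[2][2])


def estimate_translation(
    world: Sequence[Vec3],   # hip-centred 3D landmarks (assumed metres, camera-aligned axes)
    pixels: Sequence[Vec2],  # matching 2D landmarks in pixel coordinates
    fx: float, fy: float, cx: float, cy: float,  # assumed pinhole intrinsics
) -> Vec3:
    """Least-squares translation t such that projecting (X + t) matches the pixels.

    From u = fx * (x + tx) / (z + tz) + cx it follows that
    fx * tx - (u - cx) * tz = (u - cx) * z - fx * x, which is linear in t
    (and likewise for v). Each landmark therefore contributes two rows.
    """
    ata = [[0.0] * 3 for _ in range(3)]  # accumulates A^T A
    atb = [0.0] * 3                      # accumulates A^T b
    for (x, y, z), (u, v) in zip(world, pixels):
        du, dv = u - cx, v - cy
        rows = (
            ((fx, 0.0, -du), du * z - fx * x),
            ((0.0, fy, -dv), dv * z - fy * y),
        )
        for row, rhs in rows:
            for i in range(3):
                atb[i] += row[i] * rhs
                for j in range(3):
                    ata[i][j] += row[i] * row[j]
    return solve_3x3(ata, atb)
```

Adding the returned translation to each world landmark gives camera-frame positions, so a landmark's distance to the camera is the norm of that sum. In practice it may help to pass only landmarks the model reports as clearly visible, since a badly placed 2D landmark pulls the fit.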

Jun 20 '22 07:06 sureshdagooglecom

hi @sureshdagooglecom - thanks for getting back to me. All that makes sense to me and is working as you have outlined. I'm in fact getting 3D landmarks on iOS.

My question is different. The pose detection is inaccurate in certain positions, and I'm asking for ways to improve its accuracy. Could you please review my original question again?

Thanks

Jun 20 '22 14:06 nrkrishna

hi @NikolayChirkov - wanted to follow up on the above question/request.

Thanks

Jul 12 '22 02:07 nrkrishna

@NikolayChirkov hello, I would like to have your input on this too

Nov 14 '22 21:11 SuryaTKoppula

@nrkrishna, are you still looking for a resolution to this issue?

Jan 06 '23 06:01 kuaashish

Hello, yes, I would like to know how to solve this problem.

Jan 06 '23 06:01 SuryaTKoppula

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you.

Jan 13 '23 07:01 google-ml-butler[bot]

Closing as stale. Please reopen if you'd like to work on this further.

Jan 20 '23 08:01 google-ml-butler[bot]

Hi @SuryaTKoppula, could you please raise a new issue for the problems you are facing and reference this issue in it?

Jan 20 '23 10:01 kuaashish

Hello,

Yes, I have raised a new issue that references this one.

Jan 20 '23 16:01 SuryaTKoppula
