mediapipe
Pose accuracy
I'm using the example iOS app (Target name: PoseTrackingGPU) in the mediapipe repo. iOS version 15.5. Code is running on an iPhone 12 Pro Max. Mediapipe version is 0.8.10.
Overall, this framework is incredible. I'm observing that in certain body positions, such as at the top of a golf swing (see attached screenshots), the shoulder positions are not correctly identified in 2D, which in turn makes the 3D measurements incorrect. I suspect the root cause is that one of the shoulder joints isn't fully visible, so the pose detection model is guessing where it might be. For example, in the first screenshot, the two shoulder joints are correctly identified (see the white line connecting the two predicted shoulder positions), and all is good. In screenshots 2 and 3, I have drawn a dotted red line to indicate where the line connecting the shoulders should be. Notice how the solid white lines, i.e., the predictions, are off by quite a bit.
I'm looking for some advice from the Google team and this community on how to improve the accuracy in such situations where some set of joints may not be fully visible for a short period of time. Is there a way to train/extend the underlying model with additional images that could help improve the model? Any other tips/ideas on how we might be able to improve the accuracy?
Many thanks.
Hi @nrkrishna, we recently updated the released model to produce an additional world 3D landmarks output with its origin at the center of the hips. More details can be found in the API documentation: https://google.github.io/mediapipe/solutions/pose#pose_world_landmarks and in the blog post: https://blog.tensorflow.org/2021/08/3d-pose-detection-with-mediapipe-blazepose-ghum-tfjs.html
If you want to calculate relative depth between landmarks, this output should be enough for you. But if you want to calculate distance to the camera, extra work needs to be done. Having both 2D and 3D landmarks and knowing camera parameters you can apply some optimization (e.g. Procrustes analysis) to determine translation in 3D space.
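As a rough illustration of the "determine translation in 3D space" step, here is a minimal sketch. It is not Procrustes analysis itself but a simpler linear least-squares fit under a pinhole camera model, and it rests on assumptions that are not confirmed in this thread: the world landmark axes are aligned with the camera axes (no rotation), the 2D landmarks have already been converted from normalized coordinates to pixels, and the intrinsics `fx`, `fy`, `cx`, `cy` are known. The function name `estimate_translation` is hypothetical and not part of MediaPipe.

```python
def estimate_translation(world_pts, image_pts, fx, fy, cx, cy):
    """Least-squares camera-space translation of hip-centered landmarks.

    world_pts: list of (X, Y, Z) in meters, origin at the hips center.
    image_pts: list of (u, v) in pixels, matching world_pts by index.
    fx, fy, cx, cy: pinhole intrinsics in pixels (assumed known).
    Assumes no rotation between world-landmark axes and camera axes.
    """
    n = len(world_pts)
    if n != len(image_pts) or n < 2:
        raise ValueError("need at least two matching 2D/3D landmarks")

    # Each landmark gives two linear equations in (tx, ty, tz):
    #   tx - a * tz = a * Z - X,   with a = (u - cx) / fx
    #   ty - b * tz = b * Z - Y,   with b = (v - cy) / fy
    sa = sb = saa = sbb = 0.0
    srx = sry = sarx = sbry = 0.0
    for (X, Y, Z), (u, v) in zip(world_pts, image_pts):
        a = (u - cx) / fx
        b = (v - cy) / fy
        rx = a * Z - X
        ry = b * Z - Y
        sa += a
        sb += b
        saa += a * a
        sbb += b * b
        srx += rx
        sry += ry
        sarx += a * rx
        sbry += b * ry

    # Solve the 3x3 normal equations by eliminating tx and ty first.
    den = (saa + sbb) - (sa * sa + sb * sb) / n
    if abs(den) < 1e-12:
        raise ValueError("degenerate input: landmarks project to one pixel")
    tz = ((sa * srx + sb * sry) / n - (sarx + sbry)) / den
    tx = (srx + sa * tz) / n
    ty = (sry + sb * tz) / n
    return tx, ty, tz
```

The returned `tz` is then the estimated distance of the hips center from the camera along the optical axis. Landmarks with low visibility could be left out of the fit so that guessed joints do not skew the result.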
hi @sureshdagooglecom - thanks for getting back to me. All that makes sense to me and is working as you have outlined. I'm in fact getting 3D landmarks on iOS.
My question is different. The pose detection is inaccurate in certain positions, and I'm asking for ways to improve its accuracy. Could you please review my original question again?
Thanks
hi @NikolayChirkov - wanted to follow up on the above question/request.
Thanks
@NikolayChirkov hello, I would like to have your input on this too
@nrkrishna, are you still looking for a resolution to this issue?
Hello, Yes, I would like to know how to solve this problem
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you.
Closing as stale. Please reopen if you'd like to work on this further.
Hi @SuryaTKoppula, Could you please raise a new issue marking this issue as reference for problems you are facing?
Hello,
Yes, I have raised a new issue that references this one.