tapnet
Use Case: Temporally Coherent Pose Estimation
Frameworks such as Mediapipe or OpenPose are used to extract skeletal keypoints from images. Unfortunately, the results are inconsistent and somewhat jittery when trying to extract poses from consecutive frames.
I propose a use case supported by tapir:
- Extract poses for an initial frame using mediapipe. Perhaps even for the whole video.
- Track the keypoints across frames. Prefer `tapir`'s tracking. If `tapir` and `mediapipe` diverge, fall back to the mediapipe pose and continue tracking from there.
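A minimal sketch of the fallback loop described above, assuming the detector output is already available as an array. `fuse_tracks`, `track_step`, and `max_dist` are hypothetical names used for illustration; `track_step` is a stand-in for whatever call propagates points with tapir and is not part of the tapnet or mediapipe APIs.

```python
import numpy as np

def fuse_tracks(detected, track_step, max_dist=10.0):
    """Prefer tracked keypoints; fall back to the detector when they diverge.

    detected:   (T, K, 2) per-frame detector keypoints (e.g. from mediapipe).
    track_step: hypothetical callback, track_step(prev_points, t) -> (K, 2),
                propagating points from frame t-1 to frame t.
    max_dist:   assumed divergence threshold in pixels (mean over keypoints).
    Returns the fused (T, K, 2) poses and the frames where tracking was reset.
    """
    detected = np.asarray(detected, dtype=float)
    fused = np.empty_like(detected)
    fused[0] = detected[0]          # the first frame always comes from the detector
    resets = [0]
    for t in range(1, len(detected)):
        tracked = track_step(fused[t - 1], t)
        dist = np.linalg.norm(tracked - detected[t], axis=-1)  # shape (K,)
        if dist.mean() > max_dist:
            fused[t] = detected[t]  # diverged: restart tracking from the detector pose
            resets.append(t)
        else:
            fused[t] = tracked      # consistent: keep the smoother tracked pose
    return fused, resets
```

Because each step tracks from `fused[t - 1]`, a reset automatically becomes the starting point for the frames that follow it. The mean-distance test is only one possible divergence criterion.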
This idea is similar to how video compression (as used in MP4 files) works: the `mediapipe` poses act as I-frames (keyframes, treated as gold), and the poses tracked by `tapir` act as P-frames, which are used for as long as they stay consistent. When the data in a P-frame is no longer consistent, introduce another I-frame. (This can also be done per-frame, per-keypoint.)
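The per-frame, per-keypoint variant could look roughly like the sketch below. It is an offline simplification: it assumes both the detector and tracker outputs already exist for the whole video, and it does not re-track from the re-anchored points. `keyframe_mask` and `max_dist` are hypothetical names, not part of any library.

```python
import numpy as np

def keyframe_mask(detected, tracked, max_dist=10.0):
    """Decide per frame and per keypoint whether to re-anchor on the detector.

    detected, tracked: (T, K, 2) arrays of keypoints from the detector and tracker.
    max_dist:          assumed per-keypoint divergence threshold in pixels.
    Returns the fused (T, K, 2) keypoints and a (T, K) boolean mask that is
    True wherever the detector value was used.
    """
    detected = np.asarray(detected, dtype=float)
    tracked = np.asarray(tracked, dtype=float)
    dist = np.linalg.norm(tracked - detected, axis=-1)  # shape (T, K)
    reset = dist > max_dist
    reset[0] = True  # the first frame is always taken from the detector
    fused = np.where(reset[..., None], detected, tracked)
    return fused, reset
```

The mask makes it easy to inspect how often each keypoint had to be re-anchored, which may hint at where the tracker and the detector disagree most.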
Related issue: https://github.com/qianqianwang68/omnimotion/issues/5
Have you made any progress on this? I'm considering using mediapipe keypoints as well
I haven't yet attempted an implementation. I think it would be really cool, once I (or someone) has the time to play with it
Closing due to inactivity. We don't currently have any work in this specific direction.