Pose landmarker detect_for_video runs extremely slow

Open tomershukhman opened this issue 11 months ago • 1 comments

Have I written custom code (as opposed to using a stock example script provided in MediaPipe)

Yes

OS Platform and Distribution

macOS Sonoma

MediaPipe Tasks SDK version

No response

Task name (e.g. Image classification, Gesture recognition etc.)

Pose Landmarker task

Programming Language and version (e.g. C++, Python, Java)

Python

Describe the actual behavior

I am trying to implement code that detects poses in a video using the Tasks API. However, when I run my code, playback looks as if it were in slow motion. I tested the code from https://github.com/nicknochnack/MediaPipePoseEstimation/blob/main/Media%20Pipe%20Pose%20Tutorial.ipynb with the minor modification of cap = cv2.VideoCapture('tennis.mp4'), to make sure I am comparing apples to apples (their code is for streaming), and it runs perfectly.

Describe the expected behaviour

For the pose detection to keep up with the video

Standalone code/steps you may have used to try to get what you need


import cv2
import numpy as np
import mediapipe as mp
from mediapipe import solutions
from mediapipe.framework.formats import landmark_pb2

model_path = "pose_landmarker.task"
BaseOptions = mp.tasks.BaseOptions
PoseLandmarker = mp.tasks.vision.PoseLandmarker
PoseLandmarkerOptions = mp.tasks.vision.PoseLandmarkerOptions
PoseLandmarkerResult = mp.tasks.vision.PoseLandmarkerResult
VisionRunningMode = mp.tasks.vision.RunningMode

options = PoseLandmarkerOptions(
    base_options=BaseOptions(model_asset_path=model_path),
    running_mode=VisionRunningMode.VIDEO,
)


def draw_landmarks_on_image(image, detection_result):
    """Return a copy of the image with every detected pose drawn on it."""
    annotated_image = np.copy(image)

    # Loop through the detected poses to visualize.
    for pose_landmarks in detection_result.pose_landmarks:
        # Convert the task result into the proto type drawing_utils expects.
        pose_landmarks_proto = landmark_pb2.NormalizedLandmarkList()
        pose_landmarks_proto.landmark.extend([
            landmark_pb2.NormalizedLandmark(x=landmark.x, y=landmark.y, z=landmark.z)
            for landmark in pose_landmarks
        ])
        solutions.drawing_utils.draw_landmarks(
            annotated_image,
            pose_landmarks_proto,
            solutions.pose.POSE_CONNECTIONS,
            solutions.drawing_styles.get_default_pose_landmarks_style())
    return annotated_image


cap = cv2.VideoCapture("tennis.mp4")

with PoseLandmarker.create_from_options(options) as landmarker:
    while True:
        success, frame = cap.read()
        # Stop when no more frames can be read; frame is None in that case.
        if not success:
            break
        frame_timestamp_ms = int(cap.get(cv2.CAP_PROP_POS_MSEC))

        # OpenCV decodes frames as BGR, while ImageFormat.SRGB expects RGB.
        rgb_frame = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
        mp_image = mp.Image(image_format=mp.ImageFormat.SRGB, data=rgb_frame)
        detection_result = landmarker.detect_for_video(mp_image, frame_timestamp_ms)

        # Draw on the original BGR frame, since cv2.imshow expects BGR.
        annotated_image = draw_landmarks_on_image(frame, detection_result)
        cv2.imshow('frame', annotated_image)

        if cv2.waitKey(1) & 0xFF == ord('q'):
            break

cap.release()
cv2.destroyAllWindows()
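
To check whether the per-frame work is what holds playback back, the time spent on each frame can be compared with the frame interval of the video. The following is only a minimal sketch: `measure_latency_ms` and `keeps_up` are hypothetical helpers, not part of MediaPipe or OpenCV, and `process_frame` stands in for whatever is being timed (for example, a function that wraps the `detect_for_video` call).

```python
import time


def measure_latency_ms(process_frame, frames):
    """Return the wall-clock time, in milliseconds, spent on each frame."""
    latencies = []
    for frame in frames:
        start = time.perf_counter()
        process_frame(frame)
        latencies.append((time.perf_counter() - start) * 1000.0)
    return latencies


def keeps_up(latencies_ms, fps):
    """True if the mean latency fits inside one frame interval at `fps`."""
    frame_interval_ms = 1000.0 / fps
    mean_latency_ms = sum(latencies_ms) / len(latencies_ms)
    return mean_latency_ms <= frame_interval_ms
```

For a 30 fps video the budget is about 33 ms per frame, so a mean latency above that would be enough to make playback look slowed down. The video's frame rate can be read with `cap.get(cv2.CAP_PROP_FPS)`.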

Other info / Complete Logs

No response

tomershukhman avatar Mar 05 '24 00:03 tomershukhman

Hi @tomershukhman,

Could you please confirm whether the issue has been resolved on your end, or if you still require assistance from our end?

Thank you!!

kuaashish avatar May 09 '24 07:05 kuaashish

This issue has been marked stale because it has had no activity in the last 7 days. It will be closed if no further activity occurs. Thank you.

github-actions[bot] avatar May 17 '24 01:05 github-actions[bot]

This issue was closed due to lack of activity after being marked stale for the past 7 days.

github-actions[bot] avatar May 24 '24 01:05 github-actions[bot]

I ran into the same problem. I tried logging all the landmarks, and it seems like the landmark smoothing is kinda too strong. In other words, each output landmark stays much closer to the previous landmark locations than to the actual correct location. For example, if my nose position over the last five frames was 0.5, 0.5, 0.55, 0.6, 0.6, the output would be something like 0.502 instead of something closer to 0.6. I'm able to work around this by setting the smooth_landmarks side packet to false, but that leads to a lot of jitter. An initial inspection suggests that reducing the bucket for smoothing the landmarks could solve the problem, but I will need to dig deeper to confirm that.
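
The lag in the nose example above is what a heavily weighted exponential moving average (EMA) would produce. A minimal sketch, assuming a fixed smoothing factor `alpha = 0.01`; that value is an illustrative assumption rather than something read from MediaPipe, and `ema` is a hypothetical helper:

```python
def ema(samples, alpha):
    """Exponential moving average: each output moves `alpha` of the way to the new sample."""
    smoothed = []
    previous = samples[0]
    for sample in samples:
        previous = previous + alpha * (sample - previous)
        smoothed.append(previous)
    return smoothed


nose_x = [0.5, 0.5, 0.55, 0.6, 0.6]
# With alpha = 0.01 the last output is roughly 0.5025, far from the true 0.6.
print(round(ema(nose_x, alpha=0.01)[-1], 4))
```

With a larger alpha the output tracks the input more closely, at the cost of passing more jitter through.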

Froxcey avatar Aug 20 '24 15:08 Froxcey

Okay, I dug a bit deeper and found a halfway decent workaround for this problem. Go to line 115 of mediapipe/modules/pose_landmark/pose_landmark_filtering.pbtxt, where you will find:

one_euro_filter {
  # Min cutoff 0.1 results into ~0.01 alpha in landmark EMA filter
  # when landmark is static.
  min_cutoff: 0.05
  # Beta 80.0 in combintation with min_cutoff 0.05 results into
  # ~0.94 alpha in landmark EMA filter when landmark is moving fast.
  beta: 80.0
  # Derivative cutoff 1.0 results into ~0.17 alpha in landmark
  # velocity EMA filter.
  derivate_cutoff: 1.0
}

After some trial and error, I found the values below to work best for my use case. Tweak them a bit to find the combination you like.

one_euro_filter {
  # Min cutoff 0.1 results into ~0.01 alpha in landmark EMA filter
  # when landmark is static.
  min_cutoff: 0.01
  # Beta 80.0 in combintation with min_cutoff 0.05 results into
  # ~0.94 alpha in landmark EMA filter when landmark is moving fast.
  beta: 1000000.0
  # Derivative cutoff 1.0 results into ~0.17 alpha in landmark
  # velocity EMA filter.
  derivate_cutoff: 5.0
}
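
To get a feel for what these three parameters do, here is a minimal sketch of the textbook One Euro filter. It assumes a fixed 30 Hz frame rate, and the `OneEuroFilter` class and `smoothing_alpha` helper are hypothetical names written for illustration. This is not MediaPipe's calculator code, which may differ in details such as how velocity is scaled.

```python
import math


def smoothing_alpha(cutoff_hz, rate_hz):
    """EMA smoothing factor for a low-pass filter with the given cutoff."""
    tau = 1.0 / (2.0 * math.pi * cutoff_hz)
    sample_period = 1.0 / rate_hz
    return 1.0 / (1.0 + tau / sample_period)


class OneEuroFilter:
    """Textbook One Euro filter for a single scalar signal (illustrative only)."""

    def __init__(self, rate_hz, min_cutoff, beta, derivate_cutoff):
        self.rate_hz = rate_hz
        self.min_cutoff = min_cutoff
        self.beta = beta
        self.derivate_cutoff = derivate_cutoff
        self.prev_x = None
        self.prev_dx = 0.0

    def __call__(self, x):
        if self.prev_x is None:
            # The first sample passes through unchanged.
            self.prev_x = x
            return x
        # Smooth the velocity estimate with its own fixed cutoff.
        dx = (x - self.prev_x) * self.rate_hz
        alpha_d = smoothing_alpha(self.derivate_cutoff, self.rate_hz)
        dx_hat = alpha_d * dx + (1.0 - alpha_d) * self.prev_dx
        # Faster motion raises the cutoff, which raises alpha and reduces lag.
        cutoff = self.min_cutoff + self.beta * abs(dx_hat)
        alpha = smoothing_alpha(cutoff, self.rate_hz)
        x_hat = alpha * x + (1.0 - alpha) * self.prev_x
        self.prev_x = x_hat
        self.prev_dx = dx_hat
        return x_hat


# At 30 Hz, a cutoff of 0.05 gives alpha of about 0.01 and a cutoff of 1.0
# gives about 0.17, in line with the comments in the pbtxt file.
print(round(smoothing_alpha(0.05, 30.0), 3), round(smoothing_alpha(1.0, 30.0), 3))
```

In this formulation, `min_cutoff` controls how much smoothing is applied when the landmark is nearly static, `beta` controls how quickly smoothing is relaxed as the landmark speeds up, and `derivate_cutoff` controls how responsive the velocity estimate itself is.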

Froxcey avatar Aug 22 '24 10:08 Froxcey