Pose landmarker detect_for_video runs extremely slow
Have I written custom code (as opposed to using a stock example script provided in MediaPipe)
Yes
OS Platform and Distribution
macOS Sonoma
MediaPipe Tasks SDK version
No response
Task name (e.g. Image classification, Gesture recognition etc.)
Pose Landmarker task
Programming Language and version (e.g. C++, Python, Java)
Python
Describe the actual behavior
I am trying to implement code that detects poses in a video using the Tasks API.
However, when running my code, it looks as if the video is playing in slow motion.
To make sure I'm comparing apples to apples (their code is for streaming), I tested the code from https://github.com/nicknochnack/MediaPipePoseEstimation/blob/main/Media%20Pipe%20Pose%20Tutorial.ipynb with the minor modification of cap = cv2.VideoCapture('tennis.mp4'), and it runs perfectly.
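For reference, the legacy-solution loop from that notebook looks roughly like this (a minimal sketch of the public solutions.pose API, not the notebook's exact code):

import cv2
import mediapipe as mp

mp_drawing = mp.solutions.drawing_utils
mp_pose = mp.solutions.pose

cap = cv2.VideoCapture('tennis.mp4')
with mp_pose.Pose(min_detection_confidence=0.5, min_tracking_confidence=0.5) as pose:
    while cap.isOpened():
        ok, frame = cap.read()
        if not ok:
            break
        # The legacy API takes an RGB numpy array directly.
        results = pose.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
        if results.pose_landmarks:
            mp_drawing.draw_landmarks(frame, results.pose_landmarks,
                                      mp_pose.POSE_CONNECTIONS)
        cv2.imshow('frame', frame)
        if cv2.waitKey(1) & 0xFF == ord('q'):
            break
cap.release()
cv2.destroyAllWindows()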
Describe the expected behaviour
For the pose detection to keep up with the video.
Standalone code/steps you may have used to try to get what you need
import cv2
import numpy as np
import mediapipe as mp
from mediapipe import solutions
from mediapipe.framework.formats import landmark_pb2

model_path = "pose_landmarker.task"

BaseOptions = mp.tasks.BaseOptions
PoseLandmarker = mp.tasks.vision.PoseLandmarker
PoseLandmarkerOptions = mp.tasks.vision.PoseLandmarkerOptions
PoseLandmarkerResult = mp.tasks.vision.PoseLandmarkerResult
VisionRunningMode = mp.tasks.vision.RunningMode

options = PoseLandmarkerOptions(
    base_options=BaseOptions(model_asset_path=model_path),
    running_mode=VisionRunningMode.VIDEO,
)

def draw_landmarks_on_image(rgb_image, detection_result):
    pose_landmarks_list = detection_result.pose_landmarks
    annotated_image = np.copy(rgb_image)
    # Loop through the detected poses to visualize.
    for pose_landmarks in pose_landmarks_list:
        # Draw the pose landmarks.
        pose_landmarks_proto = landmark_pb2.NormalizedLandmarkList()
        pose_landmarks_proto.landmark.extend([
            landmark_pb2.NormalizedLandmark(x=landmark.x, y=landmark.y, z=landmark.z)
            for landmark in pose_landmarks
        ])
        solutions.drawing_utils.draw_landmarks(
            annotated_image,
            pose_landmarks_proto,
            solutions.pose.POSE_CONNECTIONS,
            solutions.drawing_styles.get_default_pose_landmarks_style())
    return annotated_image

cap = cv2.VideoCapture("tennis.mp4")

with PoseLandmarker.create_from_options(options) as landmarker:
    while cap.isOpened():
        success, frame = cap.read()
        if not success:
            break
        # Timestamp of the current frame in the video, in milliseconds.
        frame_timestamp_ms = int(cap.get(cv2.CAP_PROP_POS_MSEC))
        # OpenCV decodes frames as BGR; convert to RGB before wrapping in mp.Image.
        mp_image = mp.Image(image_format=mp.ImageFormat.SRGB,
                            data=cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
        detection_result = landmarker.detect_for_video(mp_image, frame_timestamp_ms)
        annotated_image = draw_landmarks_on_image(frame, detection_result)
        cv2.imshow('frame', annotated_image)
        if cv2.waitKey(1) & 0xFF == ord('q'):
            break

cap.release()
cv2.destroyAllWindows()
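To put a number on the slowdown, the detect call in the loop above can be timed per frame (a hypothetical instrumentation snippet, not part of my original script; it replaces the existing detect_for_video line inside the loop):

import time

# Hypothetical instrumentation: log how long each detect_for_video call takes.
start = time.perf_counter()
detection_result = landmarker.detect_for_video(mp_image, frame_timestamp_ms)
print(f"detect_for_video took {(time.perf_counter() - start) * 1000:.1f} ms "
      f"at timestamp {frame_timestamp_ms} ms")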
Other info / Complete Logs
No response
Hi @tomershukhman,
Could you please confirm whether the issue has been resolved on your end, or if you still require assistance from our end?
Thank you!!
This issue has been marked stale because it has had no recent activity for 7 days. It will be closed if no further activity occurs. Thank you.
This issue was closed due to lack of activity after being marked stale for the past 7 days.
I ran into the same problem.
I logged all the landmarks, and it seems the landmark smoothing is too strong. In other words, each landmark ends up much closer to its previous location than to its actual current location.
For example, if my nose position over the last five frames was 0.5, 0.5, 0.55, 0.6, 0.6, the output would be something like 0.502 instead of something closer to 0.6.
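That behaviour is consistent with an exponential moving average whose alpha is very small (roughly the ~0.01 alpha the filter comments below describe for a static landmark). A quick illustration of the arithmetic, not MediaPipe's actual filter:

# Illustration only: an EMA with a tiny alpha barely tracks the input.
alpha = 0.01
frames = [0.5, 0.5, 0.55, 0.6, 0.6]

y = frames[0]
for x in frames[1:]:
    y = alpha * x + (1 - alpha) * y  # smoothed value lags far behind x
print(round(y, 3))  # ~0.502, close to the value reported above, not 0.6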
I'm able to work around this by setting the smooth_landmarks side packet to false, but that leads to a lot of jitter.
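In the legacy Python solution, I believe the same switch is exposed as a constructor argument (a minimal sketch, assuming the legacy solutions.pose API rather than the Tasks graph):

import mediapipe as mp

# Assumption: smooth_landmarks=False in the legacy API disables the temporal
# landmark smoothing stage, trading the lag described above for jitter.
pose = mp.solutions.pose.Pose(smooth_landmarks=False)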
An initial inspection suggests that reducing the bucket for smoothing the landmarks could solve the problem, but I will need to dig deeper to confirm that.
Okay, I dug a bit deeper and found a halfway decent workaround for this problem. Go to mediapipe/modules/pose_landmark/pose_landmark_filtering.pbtxt, line 115, where you will find:
one_euro_filter {
  # Min cutoff 0.1 results into ~0.01 alpha in landmark EMA filter
  # when landmark is static.
  min_cutoff: 0.05
  # Beta 80.0 in combination with min_cutoff 0.05 results into
  # ~0.94 alpha in landmark EMA filter when landmark is moving fast.
  beta: 80.0
  # Derivative cutoff 1.0 results into ~0.17 alpha in landmark
  # velocity EMA filter.
  derivate_cutoff: 1.0
}
After some trial and error, I found these to be the best values for my use case; tweak them a bit and find the combination you like:
one_euro_filter {
  min_cutoff: 0.01
  beta: 1000000.0
  derivate_cutoff: 5.0
}
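For anyone wondering why these knobs change the lag, here is a self-contained sketch of the textbook One Euro filter (Casiez et al. 2012), not MediaPipe's implementation: min_cutoff sets how aggressively a slow-moving signal is smoothed, and beta raises the cutoff (and hence the EMA alpha) as soon as the signal moves quickly, which is what removes the lag.

import math

class OneEuroFilter:
    """Textbook One Euro filter; illustration only, not MediaPipe's code."""

    def __init__(self, min_cutoff=0.05, beta=80.0, derivate_cutoff=1.0):
        self.min_cutoff = min_cutoff          # smoothing strength for slow motion
        self.beta = beta                      # how fast the cutoff grows with speed
        self.derivate_cutoff = derivate_cutoff
        self.x_prev = None
        self.dx_prev = 0.0

    @staticmethod
    def _alpha(cutoff, dt):
        # Convert a cutoff frequency (Hz) into an EMA coefficient for step dt.
        tau = 1.0 / (2.0 * math.pi * cutoff)
        return 1.0 / (1.0 + tau / dt)

    def __call__(self, x, dt):
        if self.x_prev is None:
            self.x_prev = x
            return x
        # Smooth the derivative, then let the estimated speed raise the cutoff.
        dx = (x - self.x_prev) / dt
        a_d = self._alpha(self.derivate_cutoff, dt)
        dx_hat = a_d * dx + (1.0 - a_d) * self.dx_prev
        cutoff = self.min_cutoff + self.beta * abs(dx_hat)
        a = self._alpha(cutoff, dt)
        x_hat = a * x + (1.0 - a) * self.x_prev
        self.x_prev, self.dx_prev = x_hat, dx_hat
        return x_hat

# With a large beta, a fast-moving landmark gets an alpha close to 1 (little lag);
# with beta near 0, alpha stays tiny and the landmark trails behind, as described above.
f = OneEuroFilter(min_cutoff=0.05, beta=80.0)
for x in [0.5, 0.5, 0.55, 0.6, 0.6]:
    print(round(f(x, dt=1 / 30), 3))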