The live stream Python API easily hangs (face landmarks)
Have I written custom code (as opposed to using a stock example script provided in MediaPipe)?
Yes
OS Platform and Distribution
Ubuntu 22.04
MediaPipe Tasks SDK version
0.10.13
Task name (e.g. Image classification, Gesture recognition etc.)
face landmarker
Programming Language and version (e.g. C++, Python, Java)
python
Describe the actual behavior
hangs on detector.detect_async()
Describe the expected behaviour
It should not hang even when the Python callback is not fast. Otherwise the live stream API is not useful for real Python applications, as it forces any work with the received image onto yet another thread, which distorts the callback design just for the benefit of the back-pressure.
Standalone code/steps you may have used to try to get what you need
The live stream API gets stuck if even a small amount of processing takes place in the registered Python callback:
```python
from pathlib import Path
import cv2
import time
import mediapipe as mp

BaseOptions = mp.tasks.BaseOptions
FaceLandmarker = mp.tasks.vision.FaceLandmarker
FaceLandmarkerOptions = mp.tasks.vision.FaceLandmarkerOptions
FaceLandmarkerResult = mp.tasks.vision.FaceLandmarkerResult
VisionRunningMode = mp.tasks.vision.RunningMode

images_output_path = 'face landmarks images'
Path(images_output_path).mkdir(parents=True, exist_ok=True)

timestamps_ms = set()
written_image_number = 0


def handle_pipeline_prediction_callback(
        inference: FaceLandmarkerResult,
        output_image: mp.Image,
        timestamp_ms: int):
    # simulate a small amount of work in the callback
    time.sleep(0.05)


def main():
    # downloaded from 'https://storage.googleapis.com/mediapipe-models/face_landmarker/face_landmarker/float16/1/face_landmarker.task':
    model_path = 'models/face_landmarker.task'
    options = FaceLandmarkerOptions(
        base_options=BaseOptions(model_asset_path=model_path),
        running_mode=VisionRunningMode.LIVE_STREAM,
        result_callback=handle_pipeline_prediction_callback)
    detector = FaceLandmarker.create_from_options(options)

    stream = cv2.VideoCapture(0)
    # noinspection PyUnresolvedReferences
    stream.set(cv2.CAP_PROP_FOURCC, cv2.VideoWriter_fourcc(*'MJPG'))
    stream.set(cv2.CAP_PROP_FPS, 30)
    stream.set(cv2.CAP_PROP_FRAME_WIDTH, 1280)
    stream.set(cv2.CAP_PROP_FRAME_HEIGHT, 720)
    stream.set(cv2.CAP_PROP_BUFFERSIZE, 1)
    if not stream.isOpened():
        raise Exception('failed opening camera')

    image_number = 0
    last_image_timestamp = None
    try:
        while True:
            stream.grab()
            success, image = stream.retrieve()
            if not success:
                raise Exception('failed retrieving camera image')
            image_timestamp = int(stream.get(cv2.CAP_PROP_POS_MSEC))
            if last_image_timestamp is not None:
                if image_timestamp <= last_image_timestamp:
                    raise ValueError(
                        f'camera image times are not monotonically increasing: '
                        f'{last_image_timestamp}, {image_timestamp}')
            last_image_timestamp = image_timestamp
            print(f'pushing image number {image_number} having image timestamp {image_timestamp} to mediapipe ...')
            detector.detect_async(
                image=mp.Image(image_format=mp.ImageFormat.SRGB, data=image),
                timestamp_ms=image_timestamp)
            print('pushed')
            image_number += 1
    except KeyboardInterrupt:
        detector.close()
        print('\nexiting ...')


if __name__ == '__main__':
    main()
```
Eventually, and quite quickly, it hangs:
```
pushing image number 291 having image timestamp 54928231 to mediapipe ...
pushed
pushing image number 292 having image timestamp 54928263 to mediapipe ...
pushed
pushing image number 293 having image timestamp 54928295 to mediapipe ...
pushed
pushing image number 294 having image timestamp 54928331 to mediapipe ...
pushed
pushing image number 295 having image timestamp 54928363 to mediapipe ...
```
The reason to fix this (and the linked issue) is that back-pressure isn't useful if it breaks down when extended to the Python side. Honestly, though, I'd rather use the non-live API, handling concurrency on my own thread and managing overall back-pressure myself, in my specific application.
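For what it's worth, the workaround implied above (doing the real work on another thread so the registered callback returns immediately) can be sketched with just the standard library. Nothing here is MediaPipe-specific; the callback signature merely mirrors the one in the repro, and `processed_timestamps` is a stand-in for whatever per-frame work you actually do:

```python
import queue
import threading

# Hand-off queue: the MediaPipe result callback only enqueues,
# so it returns to the graph's callback thread almost immediately.
results: queue.Queue = queue.Queue()
processed_timestamps = []  # stand-in for real per-frame work


def result_callback(inference, output_image, timestamp_ms):
    # Runs on MediaPipe's callback thread: do no slow work here.
    results.put((inference, output_image, timestamp_ms))


def worker():
    # All slow processing (drawing, writing images, ...) happens here,
    # off the callback thread.
    while True:
        item = results.get()
        if item is None:  # sentinel: shut down
            break
        inference, output_image, timestamp_ms = item
        processed_timestamps.append(timestamp_ms)


worker_thread = threading.Thread(target=worker, daemon=True)
worker_thread.start()
```

`result_callback` would then be passed as `result_callback=` in `FaceLandmarkerOptions`, exactly like `handle_pipeline_prediction_callback` above. Whether this fully avoids the hang is untested here, but it does keep the callback thread unblocked. Note the unbounded queue trades the broken back-pressure for memory growth if the worker can't keep up.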
Hi @matanox,
Apologies for the delayed response. Could you please let us know if this has been resolved on your end, or if you are still seeking a resolution?
Thank you!!
I think it's safest to just avoid the Python streaming API; at present I see no great benefit to using it from Python. Will reopen if things change.
Hi @matanox, I am also facing this problem. How did you set up your own concurrency so that you didn't have to use the LIVE_STREAM option? When I try to switch to IMAGE mode and feed in the images individually, I see the FPS of my stream cut in half.
I don't really recall seeing my rate cut in half. What rate are you seeing, and what is your hardware spec and camera model?
Since it's Python and we have the GIL, you can do something like acquire the images on a separate thread, which will slash the time you spend waiting for the camera to respond (I/O). Or switch to a language more naturally concurrent than Python. Either way it may imply always running MediaPipe inference over the previous frame, acquired on a thread, so that this minimal concurrency lets camera I/O wait times and inference CPU time happen concurrently.
Much (though not all) of MediaPipe's processing, as far as I recall, releases the GIL, so the performance gain will also depend on how much of your own processing you do on the inference results, or on anything else you do in the loop.
Think this through, it can help.
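A minimal sketch of that pattern, assuming OpenCV for acquisition (the class itself is plain stdlib): a single-slot holder that the capture thread overwrites, so the inference loop always runs on the freshest frame and stale frames are simply dropped, with no sleeps anywhere:

```python
import threading


class LatestFrame:
    # Single-slot frame holder: put() overwrites the slot, get()
    # blocks until a frame newer than the one last seen arrives.
    def __init__(self):
        self._cond = threading.Condition()
        self._frame = None
        self._seq = 0  # monotonically increasing frame counter

    def put(self, frame):
        with self._cond:
            self._frame = frame
            self._seq += 1
            self._cond.notify()

    def get(self, last_seq=0):
        with self._cond:
            self._cond.wait_for(lambda: self._seq > last_seq)
            return self._frame, self._seq


# Capture thread (sketch; 'stream' would be a cv2.VideoCapture):
#     def capture(stream, slot):
#         while True:
#             ok, image = stream.read()
#             if ok:
#                 slot.put(image)
#
# Inference loop (IMAGE mode, no sleep anywhere):
#     seq = 0
#     while True:
#         frame, seq = slot.get(seq)
#         result = landmarker.detect(
#             mp.Image(image_format=mp.ImageFormat.SRGB, data=frame))
```

The sequence number is what makes overwriting safe: the consumer never re-processes the frame it already saw, and anything the producer wrote in between is dropped, which is exactly the back-pressure you want here.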
I've tried several computers/cameras, so I don't think it's hardware-related. But what software setup should I have? Right now my code is split like this:
- One thread is pulling frames off the webcam and putting them into a variable behind a thread lock. I also put a time.sleep in here.
- The main thread is an infinite loop that takes frames from the variable, runs landmarker.detect() on them, and then feeds the result into my processing function.
I get around 15 FPS, while with my previous LIVE_STREAM setup I would get 30 FPS but the cameras would crash. What should I change to increase my FPS?
A few things you should note when driving it forward:
- Using time.sleep() goes against any benefit from threading.
- Can't really guess what "cameras crashing" might mean, but try not to use the cheapest webcams of all.
- Of course performance is also hardware related, though it's not your first concern here most likely.
- You should verify the speed the camera thinks it is working at when your program is launching, using whatever library API you are using for camera acquisition, or something like OpenCV. Just to be on the safe side.
Given the first bullet, it looks like you should deepen your understanding of concurrency in Python to get out of this hole. As a first step, try to redesign with no use of sleep; after that you can use many resources, and something like ChatGPT, to learn more about ways to use concurrency in Python, as these matters are not specific to MediaPipe.
Even the official example code somewhere out there should run faster than 15 FPS, you may want to verify that in parallel!
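For the verification step suggested above, a small helper can measure the real end-to-end frame rate rather than trusting what the camera claims (measure_fps is a hypothetical name introduced here; with OpenCV, the claimed rate is what stream.get(cv2.CAP_PROP_FPS) returns):

```python
import time


def measure_fps(read_frame, n=120):
    # Time n frame reads; read_frame is any zero-arg callable,
    # e.g. lambda: stream.read() for a cv2.VideoCapture 'stream'.
    t0 = time.perf_counter()
    for _ in range(n):
        read_frame()
    return n / (time.perf_counter() - t0)


# Compare the measured rate against the camera's claimed rate:
#     claimed = stream.get(cv2.CAP_PROP_FPS)
#     actual = measure_fps(lambda: stream.read())
```

If the measured rate is already well below 30 with nothing but reads in the loop, the bottleneck is the camera or its driver settings (e.g. the FOURCC/resolution combination), not your threading.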