draw_landmarks fails when using HandLandmarker
Have I written custom code (as opposed to using a stock example script provided in MediaPipe)
Yes
OS Platform and Distribution
macOS 14 (Apple M1)
MediaPipe Tasks SDK version
0.10.11
Task name (e.g. Image classification, Gesture recognition etc.)
HandLandmarker
Programming Language and version (e.g. C++, Python, Java)
Python
Describe the actual behavior
An error is raised because draw_landmarks cannot iterate the HandLandmarkerResult landmarks.
Describe the expected behaviour
An annotated image on screen showing the detected hand landmarks.
Standalone code/steps you may have used to try to get what you need
```python
import cv2
from mediapipe import Image, ImageFormat
from mediapipe.tasks.python.vision import HandLandmarker, HandLandmarkerOptions, RunningMode
from mediapipe.tasks.python import BaseOptions
from mediapipe.python.solutions.drawing_utils import draw_landmarks
from mediapipe.python.solutions.hands import HAND_CONNECTIONS

base_options = BaseOptions(model_asset_path='hand_landmarker.task')
options = HandLandmarkerOptions(
    base_options=base_options,
    num_hands=2,
    min_hand_detection_confidence=0.1,
    min_tracking_confidence=0.1,
    running_mode=RunningMode.IMAGE
)
detector = HandLandmarker.create_from_options(options)

# Setup camera capture
cap = cv2.VideoCapture(0)
if not cap.isOpened():
    print("Failed to open capture device")
    exit(1)

while True:
    success, frame = cap.read()
    if not success:
        break
    frame = cv2.flip(frame, 1)
    rgb_frame = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
    mp_image = Image(image_format=ImageFormat.SRGB, data=rgb_frame)
    results = detector.detect(mp_image)
    if results and results.hand_landmarks:
        for hand_landmarks in results.hand_landmarks:
            draw_landmarks(
                frame,
                hand_landmarks,
                HAND_CONNECTIONS
            )
    cv2.imshow("Frame", frame)
    if cv2.waitKey(1) & 0xFF == 27:  # Exit on ESC
        break

cap.release()
cv2.destroyAllWindows()
```
Other info / Complete Logs
Error:
```
    for idx, landmark in enumerate(landmark_list.landmark):
                                   ^^^^^^^^^^^^^^^^^^^^^^
AttributeError: 'list' object has no attribute 'landmark'
```
If I edit drawing_utils.py and remove `.landmark`, then I get:
```
ing_utils.py", line 161, in draw_landmarks
    if ((landmark.HasField('visibility') and
         ^^^^^^^^^^^^^^^^^
AttributeError: 'Landmark' object has no attribute 'HasField'
```
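For what it's worth, both tracebacks stem from a type mismatch that can be modeled without MediaPipe at all: the legacy `draw_landmarks` expects a protobuf container that exposes its points through a `.landmark` field, while the tasks API returns a plain Python list of lightweight landmark objects. A minimal stdlib-only sketch of that shape difference (the class names here are illustrative stand-ins, not MediaPipe's real types):

```python
from dataclasses import dataclass

# Illustrative stand-in for a tasks-API landmark (a plain dataclass).
@dataclass
class TaskLandmark:
    x: float
    y: float
    z: float

# Illustrative stand-in for the legacy proto container, which wraps
# its points in a repeated `.landmark` field.
class ProtoLandmarkList:
    def __init__(self, landmarks):
        self.landmark = list(landmarks)

points = [TaskLandmark(0.1, 0.2, 0.0), TaskLandmark(0.4, 0.5, 0.0)]

# The tasks API hands back a bare list, so `.landmark` does not exist on it...
assert not hasattr(points, "landmark")

# ...whereas the legacy drawing utility iterates `landmark_list.landmark`:
wrapped = ProtoLandmarkList(points)
for idx, lm in enumerate(wrapped.landmark):
    print(idx, lm.x, lm.y)
```

So patching `drawing_utils.py` only moves the failure to the next proto-specific call (`HasField`); the fix is to convert the result into the proto type before drawing.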
Hi @ryanm101,
It appears that a similar issue has been reported here: https://github.com/google/mediapipe/issues/5266, indicating a possible problem with our package. However, we do not currently have further information about the issue. Please allow us some time to investigate, and we will provide you with an update as soon as possible.
Thank you!!
@kinarr Kinar, do you happen to know what the issue could be?
I'll take a look; I believe it has to do with proto3 deprecating HasField().
@ryanm101 The mediapipe visualization utilities API cannot be used directly with the new tasks API as they were originally designed for the legacy solutions.
Here's the Colab notebook from the official MediaPipe samples repo that demonstrates the proper way to reuse those APIs: https://github.com/googlesamples/mediapipe/blob/main/examples/hand_landmarker/python/hand_landmarker.ipynb
You'd have to do something like this:
```python
import numpy as np
from mediapipe import solutions
from mediapipe.framework.formats import landmark_pb2

def draw_landmarks_on_image(rgb_image, detection_result):
    hand_landmarks_list = detection_result.hand_landmarks
    handedness_list = detection_result.handedness
    annotated_image = np.copy(rgb_image)

    # Loop through the detected hands to visualize.
    for idx in range(len(hand_landmarks_list)):
        hand_landmarks = hand_landmarks_list[idx]

        # Draw the hand landmarks.
        hand_landmarks_proto = landmark_pb2.NormalizedLandmarkList()
        hand_landmarks_proto.landmark.extend([
            landmark_pb2.NormalizedLandmark(x=landmark.x, y=landmark.y, z=landmark.z)
            for landmark in hand_landmarks
        ])
        solutions.drawing_utils.draw_landmarks(
            annotated_image,
            hand_landmarks_proto,
            solutions.hands.HAND_CONNECTIONS,
            solutions.drawing_styles.get_default_hand_landmarks_style(),
            solutions.drawing_styles.get_default_hand_connections_style())

    return annotated_image
```
Hi @ryanm101,
Please review the above comments and let us know the status.
Thank you!!
```python
import cv2
from mediapipe import Image, ImageFormat
from mediapipe.tasks.python.vision import HandLandmarker, HandLandmarkerOptions, RunningMode
from mediapipe.tasks.python import BaseOptions
from mediapipe import solutions
from mediapipe.framework.formats import landmark_pb2
import numpy as np

# Options for the hand landmarker
base_options = BaseOptions(model_asset_path='hand_landmarker.task')  # Ensure this path is correct and points to the .task model file
options = HandLandmarkerOptions(
    base_options=base_options,
    num_hands=2,
    min_hand_detection_confidence=0.1,
    min_tracking_confidence=0.1,
    running_mode=RunningMode.IMAGE
)
detector = HandLandmarker.create_from_options(options)

def draw_landmarks_on_image(rgb_image, detection_result):
    hand_landmarks_list = detection_result.hand_landmarks
    handedness_list = detection_result.handedness
    annotated_image = np.copy(rgb_image)

    # Loop through the detected hands to visualize.
    for idx in range(len(hand_landmarks_list)):
        hand_landmarks = hand_landmarks_list[idx]

        # Draw the hand landmarks.
        hand_landmarks_proto = landmark_pb2.NormalizedLandmarkList()
        hand_landmarks_proto.landmark.extend([
            landmark_pb2.NormalizedLandmark(x=landmark.x, y=landmark.y, z=landmark.z)
            for landmark in hand_landmarks
        ])
        solutions.drawing_utils.draw_landmarks(
            annotated_image,
            hand_landmarks_proto,
            solutions.hands.HAND_CONNECTIONS,
            solutions.drawing_styles.get_default_hand_landmarks_style(),
            solutions.drawing_styles.get_default_hand_connections_style())

    return annotated_image

# Setup camera capture
cap = cv2.VideoCapture(0)
if not cap.isOpened():
    print("Failed to open capture device")
    exit(1)

while True:
    success, frame = cap.read()
    if not success:
        break
    frame = cv2.flip(frame, 1)
    rgb_frame = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)

    # Process the frame using HandLandmarker
    mp_image = Image(image_format=ImageFormat.SRGB, data=rgb_frame)
    results = detector.detect(mp_image)

    # Draw the hand landmarks on the frame
    if results:
        annotated_image = draw_landmarks_on_image(rgb_frame, results)
        cv2.imshow("Frame", annotated_image)
    if cv2.waitKey(1) & 0xFF == 27:  # Exit on ESC
        break

cap.release()
cv2.destroyAllWindows()
```
Results in a crash:
```
WARNING: All log messages before absl::InitializeLog() is called are written to STDERR
I0000 00:00:1714722484.764678 8495379 gl_context.cc:357] GL version: 2.1 (2.1 Metal - 88), renderer: Apple M1
INFO: Created TensorFlow Lite XNNPACK delegate for CPU.
2024-05-03 08:48:06.252 Python[30375:8495379] WARNING: Secure coding is not enabled for restorable state! Enable secure coding by implementing NSApplicationDelegate.applicationSupportsSecureRestorableState: and returning YES.
venv/lib/python3.11/site-packages/google/protobuf/symbol_database.py:55: UserWarning: SymbolDatabase.GetPrototype() is deprecated. Please use message_factory.GetMessageClass() instead. SymbolDatabase.GetPrototype() will be removed soon.
  warnings.warn('SymbolDatabase.GetPrototype() is deprecated. Please '
Traceback (most recent call last):
  File "handtracking.py", line 64, in <module>
    annotated_image = draw_landmarks_on_image(rgb_frame, results)
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "handtracking.py", line 35, in draw_landmarks_on_image
    solutions.drawing_utils.draw_landmarks(
  File "venv/lib/python3.11/site-packages/mediapipe/python/solutions/drawing_utils.py", line 160, in draw_landmarks
    for idx, landmark in enumerate(landmark_list):
                         ^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: 'NormalizedLandmarkList' object is not iterable
```
@ryanm101 I've created a notebook for you to use here: https://colab.research.google.com/gist/kinarr/8004c6cbc5b61f57603a05a531b8f365/colab-hand-landmarker-test-on-video.ipynb
@kinarr Are you not seeing the same result as me?
I mean, if I use the legacy hands solution with its draw utilities, my system tracks the hand and displays the annotations on the video stream from the webcam.
If I try to do the same thing with the code above, I get the crash. Between your code and mine I'm not seeing any difference, except that you are using a video file and outputting a static image, whereas I want to show the video running with the frames annotated.
@ryanm101 It does work in a real-time camera capture stream too. I just created the Colab to show that it does indeed work either way. But you can swap out the video specific code with your code.
```python
# From
cap = cv2.VideoCapture("video.mp4")
# To
cap = cv2.VideoCapture(0)
```
…
```python
# From
while frame_count < NUM_FRAMES:
# To
while True:
```
```python
# From
cv2_imshow(annotated_image)
# To
cv2.imshow("Frame", annotated_image)
```
…
In the end it's the same logic, as you'd be running inference on a static image frame; the only difference is using a different video source (a camera in your case).
I can get on my M1 and test the same, but it should be pretty much identical to what's in the Colab.
Here's the full code after testing it on an M1:
```python
import cv2
import numpy as np
from mediapipe import Image, ImageFormat
from mediapipe.tasks.python.vision import HandLandmarker, HandLandmarkerOptions, RunningMode
from mediapipe.tasks.python import BaseOptions
from mediapipe import solutions
from mediapipe.framework.formats import landmark_pb2

def draw_landmarks_on_image(rgb_image, detection_result):
    hand_landmarks_list = detection_result.hand_landmarks
    annotated_image = np.copy(rgb_image)

    # Loop through the detected hands to visualize.
    for idx in range(len(hand_landmarks_list)):
        hand_landmarks = hand_landmarks_list[idx]

        # Draw the hand landmarks.
        hand_landmarks_proto = landmark_pb2.NormalizedLandmarkList()
        hand_landmarks_proto.landmark.extend([
            landmark_pb2.NormalizedLandmark(x=landmark.x, y=landmark.y, z=landmark.z)
            for landmark in hand_landmarks
        ])
        solutions.drawing_utils.draw_landmarks(
            annotated_image,
            hand_landmarks_proto,
            solutions.hands.HAND_CONNECTIONS,
            solutions.drawing_styles.get_default_hand_landmarks_style(),
            solutions.drawing_styles.get_default_hand_connections_style())

    return annotated_image

# Options for the hand landmarker
base_options = BaseOptions(model_asset_path='hand_landmarker.task')  # Ensure this path is correct and points to the .task model file
options = HandLandmarkerOptions(
    base_options=base_options,
    num_hands=2,
    min_hand_detection_confidence=0.1,
    min_tracking_confidence=0.1,
    running_mode=RunningMode.IMAGE
)
detector = HandLandmarker.create_from_options(options)

# Setup camera capture
cap = cv2.VideoCapture(0)
if not cap.isOpened():
    print("Failed to open capture device")
    exit(1)

# Run inference on the video
print("Running hand landmarker...")
while True:
    success, frame = cap.read()
    if not success:
        break
    frame = cv2.flip(frame, 1)
    rgb_frame = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)

    # Process the frame using HandLandmarker
    mp_image = Image(image_format=ImageFormat.SRGB, data=rgb_frame)
    results = detector.detect(mp_image)

    # Draw the hand landmarks on the frame
    if results:
        annotated_image = draw_landmarks_on_image(mp_image.numpy_view(), results)
        bgr_frame = cv2.cvtColor(annotated_image, cv2.COLOR_RGB2BGR)
        cv2.imshow("Frame", bgr_frame)
    if cv2.waitKey(1) & 0xFF == 27:  # Exit on ESC
        break

cap.release()
cv2.destroyAllWindows()
```
Really odd... I copied and pasted your code and got:
```
Traceback (most recent call last):
  File "handtracking.py", line 70, in <module>
    annotated_image = draw_landmarks_on_image(mp_image.numpy_view(), results)
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "handtracking.py", line 24, in draw_landmarks_on_image
    solutions.drawing_utils.draw_landmarks(
  File "venv/lib/python3.11/site-packages/mediapipe/python/solutions/drawing_utils.py", line 160, in draw_landmarks
    for idx, landmark in enumerate(landmark_list):
                         ^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: 'NormalizedLandmarkList' object is not iterable
```
Installed Packages:
```
absl-py==2.1.0
attrs==23.2.0
cffi==1.16.0
contourpy==1.2.1
cycler==0.12.1
filelock==3.13.4
flatbuffers==24.3.25
fonttools==4.51.0
fsspec==2024.3.1
jax==0.4.26
jaxlib==0.4.26
Jinja2==3.1.3
kiwisolver==1.4.5
MarkupSafe==2.1.5
matplotlib==3.8.4
mediapipe==0.10.11
ml-dtypes==0.4.0
mpmath==1.3.0
networkx==3.3
numpy==1.26.4
opencv-contrib-python==4.9.0.80
opencv-python==4.9.0.80
opt-einsum==3.3.0
packaging==24.0
pillow==10.3.0
protobuf==5.26.1
pycparser==2.22
pyparsing==3.1.2
python-dateutil==2.9.0.post0
scipy==1.13.0
six==1.16.0
sounddevice==0.4.6
sympy==1.12
torch==2.3.0
typing_extensions==4.11.0
```
I get a picture up fine, but as soon as it detects my hand it crashes with the above error.
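As a side note when chasing version mismatches like this, it can help to confirm which versions are actually active in the running interpreter. A small stdlib-only sketch (it prints "not installed" if a package is absent):

```python
import importlib.metadata as md

# Report the installed versions of the two packages implicated here.
for pkg in ("mediapipe", "protobuf"):
    try:
        print(f"{pkg}: {md.version(pkg)}")
    except md.PackageNotFoundError:
        print(f"{pkg}: not installed")
```

Running this inside the same virtualenv as the failing script rules out the case where `pip` and the interpreter are looking at different environments.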
OK, really strange. I rolled back to 0.10.9 with protobuf 3 and your code worked. I then upgraded to 0.10.10 with protobuf 5 and it worked. Lastly, I upgraded to 0.10.11 with protobuf 5 and it still worked.
```
absl-py==2.1.0
attrs==23.2.0
cffi==1.16.0
contourpy==1.2.1
cycler==0.12.1
filelock==3.13.4
flatbuffers==24.3.25
fonttools==4.51.0
fsspec==2024.3.1
jax==0.4.26
jaxlib==0.4.26
Jinja2==3.1.3
kiwisolver==1.4.5
MarkupSafe==2.1.5
matplotlib==3.8.4
mediapipe==0.10.11
ml-dtypes==0.4.0
mpmath==1.3.0
networkx==3.3
numpy==1.26.4
opencv-contrib-python==4.9.0.80
opencv-python==4.9.0.80
opt-einsum==3.3.0
packaging==24.0
pillow==10.3.0
protobuf==5.26.1
pycparser==2.22
pyparsing==3.1.2
python-dateutil==2.9.0.post0
scipy==1.13.0
six==1.16.0
sounddevice==0.4.6
sympy==1.12
torch==2.3.0
typing_extensions==4.11.0
```
Seems to work now.
Thanks for all your help.
You're welcome, and yes, there seems to be a protobuf version issue. It should eventually be resolved by the MediaPipe team.