
draw_landmarks fails when using HandLandmarker

Open ryanm101 opened this issue 1 year ago • 11 comments

Have I written custom code (as opposed to using a stock example script provided in MediaPipe)

Yes

OS Platform and Distribution

macOS 14 (Apple M1)

MediaPipe Tasks SDK version

0.10.11

Task name (e.g. Image classification, Gesture recognition etc.)

HandLandmarker

Programming Language and version (e.g. C++, Python, Java)

Python

Describe the actual behavior

An error is raised because draw_landmarks cannot iterate the HandLandmarkerResult.

Describe the expected behaviour

Pretty image on screen showing the points detected.

Standalone code/steps you may have used to try to get what you need

import cv2
from mediapipe import Image, ImageFormat
from mediapipe.tasks.python.vision import HandLandmarker, HandLandmarkerOptions, RunningMode
from mediapipe.tasks.python import BaseOptions
from mediapipe.python.solutions.drawing_utils import draw_landmarks
from mediapipe.python.solutions.hands import HAND_CONNECTIONS

base_options = BaseOptions(model_asset_path='hand_landmarker.task')
options = HandLandmarkerOptions(
    base_options=base_options,
    num_hands=2,
    min_hand_detection_confidence=0.1,
    min_tracking_confidence=0.1,
    running_mode=RunningMode.IMAGE
)
detector = HandLandmarker.create_from_options(options)

# Setup camera capture
cap = cv2.VideoCapture(0)
if not cap.isOpened():
    print("Failed to open capture device")
    exit(1)

while True:
    success, frame = cap.read()
    if not success:
        break

    frame = cv2.flip(frame, 1)
    rgb_frame = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)

    mp_image = Image(image_format=ImageFormat.SRGB, data=rgb_frame)
    results = detector.detect(mp_image)

    if results and results.hand_landmarks:
        for hand_landmarks in results.hand_landmarks:
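            # NOTE: this is the failing call -- drawing_utils.draw_landmarks expects
            # the legacy NormalizedLandmarkList proto, while the tasks API returns a
            # plain Python list of landmark dataclasses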
            draw_landmarks(
                frame,
                hand_landmarks,
                HAND_CONNECTIONS
            )

    cv2.imshow("Frame", frame)
    if cv2.waitKey(1) & 0xFF == 27:  # Exit on ESC
        break

cap.release()
cv2.destroyAllWindows()

Other info / Complete Logs

Error

    for idx, landmark in enumerate(landmark_list.landmark):
                                   ^^^^^^^^^^^^^^^^^^^^^^
AttributeError: 'list' object has no attribute 'landmark'

If I edit drawing_utils.py and remove .landmark, then I get:

ing_utils.py", line 161, in draw_landmarks
    if ((landmark.HasField('visibility') and
         ^^^^^^^^^^^^^^^^^
AttributeError: 'Landmark' object has no attribute 'HasField'

ryanm101 avatar Apr 29 '24 13:04 ryanm101

Hi @ryanm101,

It appears that a similar issue has been reported here https://github.com/google/mediapipe/issues/5266, indicating a possible problem with our package. However, at the moment, we do not have any further information about the issue. Please grant us some time to investigate further, and we will provide you with an update as soon as possible.

Thank you!!

kuaashish avatar Apr 30 '24 08:04 kuaashish

@kinarr Kinar, do you happen to know what the issue could be?

schmidt-sebastian avatar May 02 '24 15:05 schmidt-sebastian

@kinarr Kinar, do you happen to know what the issue could be?

I'll take a look and I believe it has to do with proto3 deprecating HasField().

kinarr avatar May 02 '24 18:05 kinarr

@ryanm101 The mediapipe visualization utilities API cannot be used directly with the new tasks API as they were originally designed for the legacy solutions.

Here's the Colab notebook from the official MediaPipe samples repo that demonstrates the proper way to reuse those APIs: https://github.com/googlesamples/mediapipe/blob/main/examples/hand_landmarker/python/hand_landmarker.ipynb
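
To make the mismatch concrete, here is a minimal sketch (the container import path is an assumption based on the 0.10.x layout): the tasks API hands back plain Python dataclasses, while the legacy drawing utilities expect protobuf messages, which is why proto-only methods such as HasField() fail.

from mediapipe.framework.formats import landmark_pb2
from mediapipe.tasks.python.components.containers.landmark import NormalizedLandmark

# Tasks-API landmark: a plain dataclass with x/y/z attributes and no proto methods.
task_lm = NormalizedLandmark(x=0.5, y=0.5, z=0.0)
print(hasattr(task_lm, 'HasField'))  # False -> the AttributeError reported above

# Legacy proto landmark: what solutions.drawing_utils.draw_landmarks expects.
proto_lm = landmark_pb2.NormalizedLandmark(x=0.5, y=0.5, z=0.0)
print(proto_lm.HasField('visibility'))  # a proto2 message, so HasField() works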

kinarr avatar May 02 '24 18:05 kinarr

You'd have to do something like this:

import numpy as np

from mediapipe import solutions
from mediapipe.framework.formats import landmark_pb2

def draw_landmarks_on_image(rgb_image, detection_result):
  hand_landmarks_list = detection_result.hand_landmarks
  handedness_list = detection_result.handedness
  annotated_image = np.copy(rgb_image)

  # Loop through the detected hands to visualize.
  for idx in range(len(hand_landmarks_list)):
    hand_landmarks = hand_landmarks_list[idx]

    # Draw the hand landmarks.
    hand_landmarks_proto = landmark_pb2.NormalizedLandmarkList()
    hand_landmarks_proto.landmark.extend([
      landmark_pb2.NormalizedLandmark(x=landmark.x, y=landmark.y, z=landmark.z) for landmark in hand_landmarks
    ])
    solutions.drawing_utils.draw_landmarks(
      annotated_image,
      hand_landmarks_proto,
      solutions.hands.HAND_CONNECTIONS,
      solutions.drawing_styles.get_default_hand_landmarks_style(),
      solutions.drawing_styles.get_default_hand_connections_style())

  return annotated_image
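
For reference, a hypothetical single-image usage of the helper above (it reuses the detector built earlier in this thread, and 'hand.jpg' is a placeholder path):

import cv2
from mediapipe import Image, ImageFormat

bgr = cv2.imread('hand.jpg')
rgb = cv2.cvtColor(bgr, cv2.COLOR_BGR2RGB)
result = detector.detect(Image(image_format=ImageFormat.SRGB, data=rgb))
annotated = draw_landmarks_on_image(rgb, result)  # returns an RGB numpy array
cv2.imwrite('annotated.jpg', cv2.cvtColor(annotated, cv2.COLOR_RGB2BGR))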

kinarr avatar May 02 '24 18:05 kinarr

Hi @ryanm101,

Please review the comments above and provide an update on the status of this issue.

Thank you!!

kuaashish avatar May 03 '24 05:05 kuaashish

import cv2
from mediapipe import Image, ImageFormat
from mediapipe.tasks.python.vision import HandLandmarker, HandLandmarkerOptions, RunningMode
from mediapipe.tasks.python import BaseOptions
from mediapipe import solutions
from mediapipe.framework.formats import landmark_pb2
import numpy as np


# Options for the hand landmarker
base_options = BaseOptions(model_asset_path='hand_landmarker.task')  # Ensure this path is correct and points to the hand_landmarker.task model bundle
options = HandLandmarkerOptions(
    base_options=base_options,
    num_hands=2,
    min_hand_detection_confidence=0.1,
    min_tracking_confidence=0.1,
    running_mode=RunningMode.IMAGE
)
detector = HandLandmarker.create_from_options(options)

def draw_landmarks_on_image(rgb_image, detection_result):
  hand_landmarks_list = detection_result.hand_landmarks
  handedness_list = detection_result.handedness
  annotated_image = np.copy(rgb_image)

  # Loop through the detected hands to visualize.
  for idx in range(len(hand_landmarks_list)):
    hand_landmarks = hand_landmarks_list[idx]

    # Draw the hand landmarks.
    hand_landmarks_proto = landmark_pb2.NormalizedLandmarkList()
    hand_landmarks_proto.landmark.extend([
      landmark_pb2.NormalizedLandmark(x=landmark.x, y=landmark.y, z=landmark.z) for landmark in hand_landmarks
    ])
    solutions.drawing_utils.draw_landmarks(
      annotated_image,
      hand_landmarks_proto,
      solutions.hands.HAND_CONNECTIONS,
      solutions.drawing_styles.get_default_hand_landmarks_style(),
      solutions.drawing_styles.get_default_hand_connections_style())

  return annotated_image

# Setup camera capture
cap = cv2.VideoCapture(0)
if not cap.isOpened():
    print("Failed to open capture device")
    exit(1)

while True:
    success, frame = cap.read()
    if not success:
        break

    frame = cv2.flip(frame, 1)
    rgb_frame = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)

    # Process the frame using HandLandmarker
    mp_image = Image(image_format=ImageFormat.SRGB, data=rgb_frame)
    results = detector.detect(mp_image)

    # Draw the hand landmarks on the frame (note: annotated_image is RGB here,
    # so imshow would show swapped colors even without the crash below)
    if results:
        annotated_image = draw_landmarks_on_image(rgb_frame, results)
    cv2.imshow("Frame", annotated_image)
    if cv2.waitKey(1) & 0xFF == 27:  # Exit on ESC
        break

cap.release()
cv2.destroyAllWindows()

Results in a crash:

WARNING: All log messages before absl::InitializeLog() is called are written to STDERR
I0000 00:00:1714722484.764678 8495379 gl_context.cc:357] GL version: 2.1 (2.1 Metal - 88), renderer: Apple M1
INFO: Created TensorFlow Lite XNNPACK delegate for CPU.
2024-05-03 08:48:06.252 Python[30375:8495379] WARNING: Secure coding is not enabled for restorable state! Enable secure coding by implementing NSApplicationDelegate.applicationSupportsSecureRestorableState: and returning YES.
venv/lib/python3.11/site-packages/google/protobuf/symbol_database.py:55: UserWarning: SymbolDatabase.GetPrototype() is deprecated. Please use message_factory.GetMessageClass() instead. SymbolDatabase.GetPrototype() will be removed soon.
  warnings.warn('SymbolDatabase.GetPrototype() is deprecated. Please '
Traceback (most recent call last):
  File "handtracking.py", line 64, in <module>
    annotated_image = draw_landmarks_on_image(rgb_frame, results)
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "handtracking.py", line 35, in draw_landmarks_on_image
    solutions.drawing_utils.draw_landmarks(
  File "venv/lib/python3.11/site-packages/mediapipe/python/solutions/drawing_utils.py", line 160, in draw_landmarks
    for idx, landmark in enumerate(landmark_list):
                         ^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: 'NormalizedLandmarkList' object is not iterable

ryanm101 avatar May 03 '24 07:05 ryanm101

@ryanm101 I've created a notebook for you to use here: https://colab.research.google.com/gist/kinarr/8004c6cbc5b61f57603a05a531b8f365/colab-hand-landmarker-test-on-video.ipynb

kinarr avatar May 03 '24 11:05 kinarr

@kinarr are you not seeing the same result as me?

I mean, if I use the legacy Hands solution with draw_landmarks, my system tracks the hand and displays the annotation on the video stream from the webcam.

If I try to do the same thing with the code above, I get the crash. Between your code and mine I'm not seeing any difference, except that you are using a video file and outputting a static image, whereas I want to show the video running with the frames annotated.

ryanm101 avatar May 03 '24 12:05 ryanm101

@ryanm101 It does work in a real-time camera capture stream too. I just created the Colab to show that it does indeed work either way. But you can swap out the video specific code with your code.

# From 
cap = cv2.VideoCapture("video.mp4")
# To
cap = cv2.VideoCapture(0)
…

# From
while frame_count < NUM_FRAMES:
# To
while True:

# From
cv2_imshow(annotated_image)
# To
cv2.imshow("Frame", annotated_image)
…

In the end it's the same logic, as you'd be running inference on a static image frame; the only difference would be using a different video source (a camera in your case).

I can get on my M1 and test the same, but it should be pretty much identical to what's in the Colab.

kinarr avatar May 03 '24 12:05 kinarr

Here's the full code after testing it on an M1:

import cv2
import numpy as np

from mediapipe import Image, ImageFormat
from mediapipe.tasks.python.vision import HandLandmarker, HandLandmarkerOptions, RunningMode
from mediapipe.tasks.python import BaseOptions
from mediapipe import solutions
from mediapipe.framework.formats import landmark_pb2


def draw_landmarks_on_image(rgb_image, detection_result):
  hand_landmarks_list = detection_result.hand_landmarks
  annotated_image = np.copy(rgb_image)

  # Loop through the detected hands to visualize.
  for idx in range(len(hand_landmarks_list)):
    hand_landmarks = hand_landmarks_list[idx]

    # Draw the hand landmarks.
    hand_landmarks_proto = landmark_pb2.NormalizedLandmarkList()
    hand_landmarks_proto.landmark.extend([
      landmark_pb2.NormalizedLandmark(x=landmark.x, y=landmark.y, z=landmark.z) for landmark in hand_landmarks
    ])
    solutions.drawing_utils.draw_landmarks(
      annotated_image,
      hand_landmarks_proto,
      solutions.hands.HAND_CONNECTIONS,
      solutions.drawing_styles.get_default_hand_landmarks_style(),
      solutions.drawing_styles.get_default_hand_connections_style())

  return annotated_image


# Options for the hand landmarker
base_options = BaseOptions(model_asset_path='hand_landmarker.task')  # Ensure this path is correct and points to the hand_landmarker.task model bundle
options = HandLandmarkerOptions(
    base_options=base_options,
    num_hands=2,
    min_hand_detection_confidence=0.1,
    min_tracking_confidence=0.1,
    running_mode=RunningMode.IMAGE
)
detector = HandLandmarker.create_from_options(options)

# Setup camera capture
cap = cv2.VideoCapture(0)

if not cap.isOpened():
    print("Failed to open capture device")
    exit(1)

# Run inference on the video
print("Running hand landmarker...")

while True:
    success, frame = cap.read()

    if not success:
        break

    frame = cv2.flip(frame, 1)
    rgb_frame = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)

    # Process the frame using HandLandmarker
    mp_image = Image(image_format=ImageFormat.SRGB, data=rgb_frame)
    results = detector.detect(mp_image)

    # Draw the hand landmarks on the frame
    if results:
        annotated_image = draw_landmarks_on_image(mp_image.numpy_view(), results)
        bgr_frame = cv2.cvtColor(annotated_image, cv2.COLOR_RGB2BGR)
        cv2.imshow("Frame", bgr_frame)

    if cv2.waitKey(1) & 0xFF == 27:  # Exit on ESC
        break

cap.release()
cv2.destroyAllWindows()
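
As an aside, for a continuous camera feed the tasks API also provides RunningMode.VIDEO; a sketch of that variant (assuming the same base_options and capture setup as above; detect_for_video wants a monotonically increasing timestamp in milliseconds):

import time

video_options = HandLandmarkerOptions(
    base_options=base_options,
    num_hands=2,
    running_mode=RunningMode.VIDEO
)

with HandLandmarker.create_from_options(video_options) as landmarker:
    start = time.monotonic()
    while True:
        success, frame = cap.read()
        if not success:
            break
        rgb = cv2.cvtColor(cv2.flip(frame, 1), cv2.COLOR_BGR2RGB)
        mp_image = Image(image_format=ImageFormat.SRGB, data=rgb)
        # Each frame needs a strictly increasing timestamp in VIDEO mode.
        timestamp_ms = int((time.monotonic() - start) * 1000)
        results = landmarker.detect_for_video(mp_image, timestamp_ms)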

kinarr avatar May 03 '24 13:05 kinarr

Really odd.. I copied and pasted your code and got:

Traceback (most recent call last):
  File "handtracking.py", line 70, in <module>
    annotated_image = draw_landmarks_on_image(mp_image.numpy_view(), results)
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "handtracking.py", line 24, in draw_landmarks_on_image
    solutions.drawing_utils.draw_landmarks(
  File "venv/lib/python3.11/site-packages/mediapipe/python/solutions/drawing_utils.py", line 160, in draw_landmarks
    for idx, landmark in enumerate(landmark_list):
                         ^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: 'NormalizedLandmarkList' object is not iterable

Installed Packages:

absl-py==2.1.0
attrs==23.2.0
cffi==1.16.0
contourpy==1.2.1
cycler==0.12.1
filelock==3.13.4
flatbuffers==24.3.25
fonttools==4.51.0
fsspec==2024.3.1
jax==0.4.26
jaxlib==0.4.26
Jinja2==3.1.3
kiwisolver==1.4.5
MarkupSafe==2.1.5
matplotlib==3.8.4
mediapipe==0.10.11
ml-dtypes==0.4.0
mpmath==1.3.0
networkx==3.3
numpy==1.26.4
opencv-contrib-python==4.9.0.80
opencv-python==4.9.0.80
opt-einsum==3.3.0
packaging==24.0
pillow==10.3.0
protobuf==5.26.1
pycparser==2.22
pyparsing==3.1.2
python-dateutil==2.9.0.post0
scipy==1.13.0
six==1.16.0
sounddevice==0.4.6
sympy==1.12
torch==2.3.0
typing_extensions==4.11.0

I get a picture up fine, but as soon as it detects my hand it crashes with the above error.

ryanm101 avatar May 03 '24 23:05 ryanm101

Ok, really strange..

I rolled back to 0.10.9 with protobuf 3 and your code worked. I then upgraded to 0.10.10 with protobuf 5 and it worked. Lastly, I upgraded to 0.10.11 with protobuf 5 and it still worked.

absl-py==2.1.0
attrs==23.2.0
cffi==1.16.0
contourpy==1.2.1
cycler==0.12.1
filelock==3.13.4
flatbuffers==24.3.25
fonttools==4.51.0
fsspec==2024.3.1
jax==0.4.26
jaxlib==0.4.26
Jinja2==3.1.3
kiwisolver==1.4.5
MarkupSafe==2.1.5
matplotlib==3.8.4
mediapipe==0.10.11
ml-dtypes==0.4.0
mpmath==1.3.0
networkx==3.3
numpy==1.26.4
opencv-contrib-python==4.9.0.80
opencv-python==4.9.0.80
opt-einsum==3.3.0
packaging==24.0
pillow==10.3.0
protobuf==5.26.1
pycparser==2.22
pyparsing==3.1.2
python-dateutil==2.9.0.post0
scipy==1.13.0
six==1.16.0
sounddevice==0.4.6
sympy==1.12
torch==2.3.0
typing_extensions==4.11.0

seems to work now..

Thanks for all your help.

ryanm101 avatar May 03 '24 23:05 ryanm101

You're welcome, and yes, there seems to be a protobuf version issue. It should eventually be resolved by the MediaPipe team.
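
For anyone landing here later, a quick sanity-check sketch to confirm which versions the running interpreter actually loads (both packages expose __version__; the printed values are only examples):

import google.protobuf
import mediapipe as mp

print('mediapipe:', mp.__version__)               # e.g. 0.10.11
print('protobuf :', google.protobuf.__version__)  # e.g. 5.26.1
print('loaded from:', mp.__file__)                # catches a stale install in another venv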

kinarr avatar May 04 '24 00:05 kinarr

Are you satisfied with the resolution of your issue?

google-ml-butler[bot] avatar May 04 '24 11:05 google-ml-butler[bot]