
Incorrect inference results when using pose_landmark_full.tflite directly

[Open] zhenhao-huang opened this issue 10 months ago · 7 comments

OS: Ubuntu 20, MediaPipe 0.10.11
Task: pose landmark detection
Problem: the inference results of the tflite model and the Python API are inconsistent
Expected: the inference results of the tflite model and the Python API should be consistent

import cv2
import mediapipe as mp
import tensorflow as tf
import numpy as np

mp_pose = mp.solutions.pose

# Landmark model extracted from pose_landmarker_full.task
interpreter = tf.lite.Interpreter("pose_landmark_full.tflite")
interpreter.allocate_tensors()

# Read the image and prepare a 256x256 RGB copy for the model
img = cv2.imread('img.jpg')
rgb_img = cv2.resize(img, (256, 256))
rgb_img = cv2.cvtColor(rgb_img, cv2.COLOR_BGR2RGB)
height, width = img.shape[:2]

input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()
print('input_details: ', input_details)
print('output_details: ', output_details)

# Add a batch dimension and normalize to [0, 1]
rgb_img = np.expand_dims(rgb_img, 0)
input_data = (rgb_img / 255).astype(np.float32)

interpreter.set_tensor(input_details[0]['index'], input_data)
interpreter.invoke()
# First output tensor: 39 landmarks x 5 values (x, y, z, visibility, presence),
# with x and y in pixels of the 256x256 model input
output = interpreter.get_tensor(output_details[0]['index'])
output = np.reshape(output, (1, 1, 39, 5))

kp = [i.tolist() for i in output[0][0]]

# Draw keypoints, scaling from model space back to the original image size
for i in kp:
    x, y = int(i[0] / 256 * width), int(i[1] / 256 * height)
    cv2.circle(img, (x, y), 0, (255, 255, 255), 10)

# Draw the skeleton connections
kp = np.array(kp)
for _c in mp_pose.POSE_CONNECTIONS:
    x1, y1 = int(kp[_c[0], 0] / 256 * width), int(kp[_c[0], 1] / 256 * height)
    x2, y2 = int(kp[_c[1], 0] / 256 * width), int(kp[_c[1], 1] / 256 * height)
    cv2.line(img, (x1, y1), (x2, y2), (0, 0, 255), 3)

cv2.namedWindow("MediaPipe Pose", 0)
cv2.resizeWindow('MediaPipe Pose', 1200, 600)
cv2.imshow('MediaPipe Pose', img)
cv2.waitKey(0)

[Images attached: input image (img3), my tflite result (test), and the Python API result (test2) for comparison]

zhenhao-huang · Apr 20 '24
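
For context on why the numbers diverge: the full pipeline first runs a pose detector, crops a rotated region of interest around the person, and letterboxes it (aspect ratio preserved) before the landmark model runs, whereas the code above stretches the whole frame to 256x256. Below is a minimal letterbox sketch, assuming the model expects a 256x256 float RGB input in [0, 1] as in the code above; the helper names letterbox_256 and unletterbox_xy are illustrative, not part of MediaPipe.

import cv2
import numpy as np

def letterbox_256(img_bgr):
    # Resize keeping the aspect ratio, then pad to 256x256 with black borders.
    h, w = img_bgr.shape[:2]
    scale = 256.0 / max(h, w)
    nh, nw = int(round(h * scale)), int(round(w * scale))
    resized = cv2.resize(img_bgr, (nw, nh))
    canvas = np.zeros((256, 256, 3), dtype=np.uint8)
    top, left = (256 - nh) // 2, (256 - nw) // 2
    canvas[top:top + nh, left:left + nw] = resized
    return canvas, scale, top, left

def unletterbox_xy(x256, y256, scale, top, left):
    # Map a landmark from the 256x256 model space back to original-image pixels.
    return (x256 - left) / scale, (y256 - top) / scale

Even with letterboxing, the raw landmark model run on a whole frame will only roughly match the Python API, because the Task pipeline also performs the detector stage, the ROI crop, and landmark filtering.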

Hi @zhenhao-huang,

Currently, you are using the legacy Pose solution. It has been upgraded and is now part of the Pose Landmarker Task API, which offers enhanced capabilities. Please upgrade to the new API. Find the documentation here and a Python implementation guide here.

For legacy solutions, the libraries, documentation, and source code will remain available on GitHub and through services like Maven and NPM. However, support for these legacy solutions has been discontinued. Refer to our documentation for more information. If you face issues with the new Pose Landmarker Task API implementation, report them here for assistance.

Thank you!!

kuaashish · Apr 22 '24

The tflite model comes from the latest Python API. I can't reproduce the Python API's result using only the tflite model.

zhenhao-huang · Apr 22 '24

Hi @zhenhao-huang,

We currently do not have a tflite model available for the upgraded Pose Landmarker. However, we offer multiple Task APIs for the Pose Landmarker, which you can access at the following link: Pose Landmarker Task APIs.

You are utilizing the legacy Pose solution, which is no longer supported, so we will not be able to provide further assistance for it. Instead, we recommend transitioning to our new Pose Landmarker Task API, as previously suggested. If you encounter any challenges during this transition, please report them to us and we will gladly assist you.

Thank you!!

kuaashish · Apr 22 '24

Yeah, I used the linked task file. The tflite model comes from unzipping pose_landmarker_full.task.

zhenhao-huang · Apr 22 '24

Hi @zhenhao-huang,

This will be one of our features in the near future. At present, we strongly advise against unzipping the task file and using the tflite model directly. This approach has not been tested and may lead to inaccurate inferences, as it lacks the capability to handle various scenarios. Instead, we recommend using the new Pose Landmarker Task API, for which a code example is provided below.

import mediapipe as mp

BaseOptions = mp.tasks.BaseOptions
PoseLandmarker = mp.tasks.vision.PoseLandmarker
PoseLandmarkerOptions = mp.tasks.vision.PoseLandmarkerOptions
VisionRunningMode = mp.tasks.vision.RunningMode

model_path = 'pose_landmarker_full.task'  # path to the downloaded .task file

options = PoseLandmarkerOptions(
    base_options=BaseOptions(model_asset_path=model_path),
    running_mode=VisionRunningMode.IMAGE)

with PoseLandmarker.create_from_options(options) as landmarker:
    # Load the input image and run the full pipeline (detector + landmark model)
    mp_image = mp.Image.create_from_file('img.jpg')
    result = landmarker.detect(mp_image)
    print(result.pose_landmarks)

The same example is also available in our documentation here.

Thank you!!

kuaashish · Apr 22 '24
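
To get a visualization comparable to the earlier script from the Task API result, the pattern used in the official Pose Landmarker Python example converts the normalized landmarks to a NormalizedLandmarkList proto and reuses the legacy drawing utilities. A rough sketch follows, assuming result comes from landmarker.detect(mp_image) above and img is the BGR frame loaded with cv2.imread in the first snippet (both names are assumptions for illustration).

import cv2
from mediapipe import solutions
from mediapipe.framework.formats import landmark_pb2

annotated = img.copy()
for pose_landmarks in result.pose_landmarks:  # one landmark list per detected person
    proto = landmark_pb2.NormalizedLandmarkList()
    proto.landmark.extend(
        [landmark_pb2.NormalizedLandmark(x=lm.x, y=lm.y, z=lm.z)
         for lm in pose_landmarks])
    solutions.drawing_utils.draw_landmarks(
        annotated, proto, solutions.pose.POSE_CONNECTIONS,
        solutions.drawing_styles.get_default_pose_landmarks_style())
cv2.imshow('MediaPipe Pose', annotated)
cv2.waitKey(0)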

Actually, I want to convert the tflite model to RKNN and deploy it on an NPU.

zhenhao-huang · Apr 23 '24
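
For the RKNN route, the conversion itself is typically done with Rockchip's rknn-toolkit2. A rough sketch under that assumption follows; the target_platform value and the choice to skip quantization are placeholders, and the mean/std values assume the model expects RGB input scaled to [0, 1], matching the code at the top of the thread.

from rknn.api import RKNN  # Rockchip rknn-toolkit2, assumed installed

rknn = RKNN()
# The landmark model expects RGB in [0, 1]; mean/std push that normalization onto the NPU.
rknn.config(mean_values=[[0, 0, 0]],
            std_values=[[255, 255, 255]],
            target_platform='rk3588')  # placeholder; set to your chip
rknn.load_tflite(model='pose_landmark_full.tflite')
rknn.build(do_quantization=False)      # keep float; quantization needs a calibration dataset
rknn.export_rknn('pose_landmark_full.rknn')
rknn.release()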

Is the Task API available from C++? The documentation seems to imply that it is not. However, if the Task API is not available from C++ and the models are no longer being supported, does that mean C++ support is being removed?

gkoreman · Apr 23 '24


Hello, have you solved this problem? Were you able to convert the model and deploy it while keeping accuracy?

nfjdfb · May 11 '24

The C++ API is unfriendly and deployment is hard. The results of the new Python API on video are not as good as those of the old API; the landmarks are shaky.

zhenhao-huang · May 25 '24
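
On the jitter point: in the Task API, per-frame smoothing is, as far as I understand, only applied in the VIDEO and LIVE_STREAM running modes, which require monotonically increasing timestamps; calling IMAGE mode frame by frame bypasses it. A minimal VIDEO-mode sketch follows, with 'input.mp4' and 'pose_landmarker_full.task' as placeholder paths.

import cv2
import mediapipe as mp

BaseOptions = mp.tasks.BaseOptions
PoseLandmarker = mp.tasks.vision.PoseLandmarker
PoseLandmarkerOptions = mp.tasks.vision.PoseLandmarkerOptions
VisionRunningMode = mp.tasks.vision.RunningMode

options = PoseLandmarkerOptions(
    base_options=BaseOptions(model_asset_path='pose_landmarker_full.task'),
    running_mode=VisionRunningMode.VIDEO)

cap = cv2.VideoCapture('input.mp4')  # placeholder path
fps = cap.get(cv2.CAP_PROP_FPS) or 30
frame_index = 0
with PoseLandmarker.create_from_options(options) as landmarker:
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        mp_image = mp.Image(image_format=mp.ImageFormat.SRGB,
                            data=cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
        # Timestamps must increase monotonically in VIDEO mode.
        result = landmarker.detect_for_video(mp_image, int(frame_index * 1000 / fps))
        frame_index += 1
cap.release()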

Yeah, a TFLite model would be great! In my case, for example, it's inconvenient and inefficient to transform PoseLandmarkResult into a tensor/NumPy array just to pass it to another model. Besides, @kuaashish, are you planning on releasing a TF layers model for pose (blazepose_3d) as well?

PLtier · Jun 03 '24
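
On PLtier's point: until there is a first-class tensor output, the conversion has to be done by hand. A small sketch follows, assuming result is a PoseLandmarkerResult with the usual 33 pose landmarks per person; landmarks_to_array is an illustrative helper, not a MediaPipe API.

import numpy as np

def landmarks_to_array(result):
    # Returns an array of shape (num_people, 33, 3) with normalized x, y, z.
    people = [[[lm.x, lm.y, lm.z] for lm in pose_landmarks]
              for pose_landmarks in result.pose_landmarks]
    return np.asarray(people, dtype=np.float32)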