mediapipe
Pose landmark results contain more NaN values in Python on Linux than on Windows
Have I written custom code (as opposed to using a stock example script provided in MediaPipe)
Yes
OS Platform and Distribution
Ubuntu 20.04 (Linux) and Windows 11; both use Anaconda with python==3.9.14
Mobile device if the issue happens on mobile device
No response
Browser and version if the issue happens on browser
No response
Programming Language and version
Python
MediaPipe version
Windows: 0.10.10, Linux: 0.10.10
Bazel version
No response
Solution
Pose Landmark
Android Studio, NDK, SDK versions (if issue is related to building in Android environment)
No response
Xcode & Tulsi version (if issue is related to building for iOS)
No response
Describe the actual behavior
Inference results differ between Windows and Linux.
Describe the expected behaviour
The task should produce the same inference results on both platforms, or at least a similar NaN count.
Standalone code/steps you may have used to try to get what you need
import cv2
import mediapipe as mp
import numpy as np
from mediapipe.tasks import python
from mediapipe.tasks.python import vision
import imageio
def extract_frames_fps(video_path):
    """Read all frames of a video as RGB arrays and return them with the FPS."""
    frames = []
    video_capture = cv2.VideoCapture(video_path)
    fps = video_capture.get(cv2.CAP_PROP_FPS)
    while video_capture.isOpened():
        ret, frame = video_capture.read()
        if not ret:
            break
        # OpenCV decodes to BGR; convert to RGB for MediaPipe.
        frames.append(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
    video_capture.release()
    return np.array(frames), int(fps)
BaseOptions = mp.tasks.BaseOptions
PoseLandmarker = mp.tasks.vision.PoseLandmarker
PoseLandmarkerOptions = mp.tasks.vision.PoseLandmarkerOptions
VisionRunningMode = mp.tasks.vision.RunningMode
model_path = 'model_path/pose_landmarker_heavy.task'
# Create a pose landmarker instance with the video mode:
options = PoseLandmarkerOptions(
    base_options=BaseOptions(model_asset_path=model_path),
    running_mode=VisionRunningMode.VIDEO,
    min_pose_presence_confidence=0.3,
    min_tracking_confidence=0.7,
    min_pose_detection_confidence=0.3
)
video_path = '/path/to/my/video.mp4'
video, fps = extract_frames_fps(video_path)
pose_landmarker_results = []
keypoints_frames = []
detect_result_list = []
annotated_frames = []
# Create the detector
with vision.PoseLandmarker.create_from_options(options) as detector:
    for i in range(video.shape[0]):
        frame = video[i, :, :, :]
        mp_frame = mp.Image(image_format=mp.ImageFormat.SRGB, data=frame)
        # timestamp_ms is expected in milliseconds; i*fps is kept as originally run.
        detector_result = detector.detect_for_video(mp_frame, timestamp_ms=i*fps)
        # draw_landmarks_on_image is not defined in this snippet.
        annotated_frames.append(draw_landmarks_on_image(mp_frame.numpy_view(), detector_result))
        result_list = detector_result.pose_world_landmarks
        detect_result_list.append(result_list)
        keypoints_arr = []
        if len(result_list) == 0:
            # No pose detected: fill this frame with NaN.
            keypoints_arr = np.full((1, 33, 5), fill_value=np.nan, dtype=np.float32)
        for idx in range(len(result_list)):
            pose_landmark = result_list[idx]
            kps = []
            for pl in pose_landmark:
                plkps = [pl.x, pl.y, pl.z, pl.visibility, pl.presence]
                kps.append(np.array(plkps))
            keypoints_arr.append(np.array(kps))
        keypoints_frames.append(np.array(keypoints_arr))
keypoints_frames = np.array(keypoints_frames)
# x coordinate of landmark 13 for the first 50 frames
print(keypoints_frames[:50, 0, 13, 0])
# Total number of NaN values
print(np.sum(np.isnan(keypoints_frames)))
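One detail in the snippet above that may be worth ruling out: `detect_for_video` takes `timestamp_ms` in milliseconds, while `i*fps` grows with the frame index but is not a time value. A minimal sketch of the conversion, assuming a constant frame rate (the helper name `frame_timestamp_ms` is hypothetical and not part of MediaPipe):

```python
def frame_timestamp_ms(frame_index: int, fps: float) -> int:
    """Return the presentation time of a frame in whole milliseconds.

    Assumes a constant frame rate; `fps` is the value reported by
    cv2.CAP_PROP_FPS, kept as a float rather than truncated to int.
    """
    if fps <= 0:
        raise ValueError("fps must be positive")
    return int(round(frame_index * 1000.0 / fps))


# Example: at 30 fps, consecutive frames are about 33 ms apart.
timestamps = [frame_timestamp_ms(i, 30.0) for i in range(4)]
print(timestamps)  # [0, 33, 67, 100]
```

Whether this affects the NaN count is not established here; it only removes one variable when comparing the two platforms.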
Other info / Complete Logs
On Linux (Python)
=======================================================================
[ 0.01166975 -0.01325617 -0.03641914 -0.05470761 -0.07166258 -0.08079658
-0.08318242 -0.08279768 -0.08881 -0.08541096 -0.08233733 -0.05183787
-0.04390267 -0.02972412 -0.01699912 -0.00787161 0.00075246 -0.00227913
-0.00521829 -0.00644223 -0.01679307 -0.03138978 -0.02844721 -0.05424543
-0.07106167 -0.06548407 -0.06702822 -0.06672549 nan -0.19420004
-0.22057012 -0.2284535 -0.19767897 -0.20616581 -0.21181265 -0.21647298
-0.21128778 -0.03720265 -0.07819481 nan 0.00410479 -0.01395571
-0.02828768 -0.03346587 -0.04220413 -0.04611417 -0.0616963 -0.08908865
-0.00920953 -0.02047957]
1815
I0000 00:00:1716812751.904595 306817 task_runner.cc:85] GPU suport is not available: INTERNAL: ; RET_CHECK failure (mediapipe/gpu/gl_context_egl.cc:84) egl_initializedUnable to initialize EGL
=======================================================================
On Windows 11 (Python)
=======================================================================
[ 0.01165697 -0.02807238 -0.04704126 -0.05284797 -0.08323435 -0.08977968
-0.10047425 -0.08969924 -0.0984091 -0.06916987 -0.03707258 -0.02857973
-0.03240457 -0.01915146 -0.01363734 -0.00499233 0.0069503 -0.0032917
-0.00622438 -0.0143904 -0.03090671 -0.04554779 -0.02894793 -0.07363385
-0.06342393 -0.06826069 -0.06399996 -0.08967623 0.10805667 -0.08458084
-0.07507367 -0.00968433 -0.05319788 -0.07317979 -0.07963884 -0.07092766
-0.0653493 -0.06612252 -0.05678153 -0.05748312 -0.04911917 -0.04954029
-0.03681698 -0.04459389 -0.06993935 -0.08679859 -0.09368424 -0.09877644
nan 0.26285872]
660
=======================================================================
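If the counted array is the one built in the standalone code, each frame without a detected pose contributes 33 × 5 = 165 NaN values, so the totals 1815 and 660 would correspond to 11 and 4 missed frames. To see which frames differ rather than only the totals, the array from each machine could be saved with `np.save` and compared side by side. A minimal sketch, assuming both arrays have the shape (frames, 1, 33, 5); the function name `nan_frame_report` and the synthetic arrays are hypothetical, not data from this issue:

```python
import numpy as np


def nan_frame_report(a: np.ndarray, b: np.ndarray) -> dict:
    """Compare which frames are entirely NaN in two keypoint arrays.

    Both arrays are assumed to have shape (frames, poses, landmarks, values).
    """
    if a.shape != b.shape:
        raise ValueError(f"shape mismatch: {a.shape} vs {b.shape}")
    # A frame counts as "missed" when every value in it is NaN.
    missed_a = np.isnan(a).all(axis=(1, 2, 3))
    missed_b = np.isnan(b).all(axis=(1, 2, 3))
    return {
        "missed_a": int(missed_a.sum()),
        "missed_b": int(missed_b.sum()),
        "only_a": np.flatnonzero(missed_a & ~missed_b).tolist(),
        "only_b": np.flatnonzero(missed_b & ~missed_a).tolist(),
    }


# Synthetic stand-ins for the two platforms' outputs (not real results).
linux = np.zeros((5, 1, 33, 5), dtype=np.float32)
windows = np.zeros((5, 1, 33, 5), dtype=np.float32)
linux[[1, 3]] = np.nan
windows[[3]] = np.nan
report = nan_frame_report(linux, windows)
print(report)
```

The `only_a` and `only_b` lists give the frame indices where just one platform failed to detect a pose, which are the frames worth inspecting visually.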
Sorry, I forgot to include the video loading code. I have updated the standalone code with the function extract_frames_fps(video_path).
Hi @waterself,
Could you please confirm whether you are running both operating systems on the same machine or on separate machines?
Thank you!!
Hello dear @kuaashish,
To be specific about my environment, I will call the two computers A and B:
A: Laptop, Windows 11, AMD Ryzen 5 5600H
B: Desktop, Ubuntu 20.04 (Linux), Intel i9-10900X; no WSL, it runs Linux only
So the results above were produced on two separate machines.
Thanks for your kindness.
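Because the two runs come from different machines, one thing that could be checked before comparing model output is whether OpenCV decodes the video to identical frames on both. A minimal sketch, assuming the frames array returned by `extract_frames_fps`; `frame_digests` is a hypothetical helper and the array below is synthetic:

```python
import hashlib

import numpy as np


def frame_digests(frames: np.ndarray) -> list:
    """Return one SHA-256 hex digest per frame of a (frames, H, W, 3) uint8 array."""
    digests = []
    for frame in frames:
        # ascontiguousarray gives a well-defined byte layout before hashing.
        data = np.ascontiguousarray(frame, dtype=np.uint8).tobytes()
        digests.append(hashlib.sha256(data).hexdigest())
    return digests


# Synthetic frames standing in for decoded video (not data from this issue).
frames = np.zeros((3, 4, 4, 3), dtype=np.uint8)
frames[2, 0, 0, 0] = 1  # make the last frame differ by one pixel value
digests = frame_digests(frames)
print(digests[0] == digests[1], digests[0] == digests[2])  # True False
```

Running this on each machine and comparing the two lists would show whether the inputs to the landmarker already differ. If they do, part of the difference in NaN count may come from video decoding rather than from MediaPipe itself.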
Hi @mbrenon,
Could you please look into this issue?
Thank you!!