MediaPipe pose landmark results on Linux Python contain more NaN values than on Windows Python

Have I written custom code (as opposed to using a stock example script provided in MediaPipe)

Yes

OS Platform and Distribution

Ubuntu 20.04 (Linux) and Windows 11, both using Anaconda with python==3.9.14

Mobile device if the issue happens on mobile device

No response

Browser and version if the issue happens on browser

No response

Programming Language and version

Python

MediaPipe version

Windows: 0.10.10, Linux: 0.10.10

Bazel version

No response

Solution

Pose Landmark

Android Studio, NDK, SDK versions (if issue is related to building in Android environment)

No response

Xcode & Tulsi version (if issue is related to building for iOS)

No response

Describe the actual behavior

The same script produces different inference results on Windows and Linux; the Linux run returns more frames with no detected pose (NaN).

Describe the expected behaviour

The task should produce the same inference results on both platforms, or at least a similar NaN count.

Standalone code/steps you may have used to try to get what you need


import cv2
import mediapipe as mp
import numpy as np
from mediapipe.tasks import python
from mediapipe.tasks.python import vision
import imageio

def extract_frames_fps(video_path):
    frames = []
    video_capture = cv2.VideoCapture(video_path)
    fps = video_capture.get(cv2.CAP_PROP_FPS)

    while video_capture.isOpened():
        ret, frame = video_capture.read()
        if not ret:
            break
        
        frames.append(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))

    video_capture.release()

    return np.array(frames), int(fps) 

BaseOptions = mp.tasks.BaseOptions
PoseLandmarker = mp.tasks.vision.PoseLandmarker
PoseLandmarkerOptions = mp.tasks.vision.PoseLandmarkerOptions
VisionRunningMode = mp.tasks.vision.RunningMode

model_path = 'model_path/pose_landmarker_heavy.task'

# Create a pose landmarker instance with the video mode:
options = PoseLandmarkerOptions(
    base_options=BaseOptions(model_asset_path=model_path),
    running_mode=VisionRunningMode.VIDEO,
    min_pose_presence_confidence=0.3,
    min_tracking_confidence=0.7,
    min_pose_detection_confidence=0.3
    )

video_path = '/path/to/my/video.mp4'

video, fps = extract_frames_fps(video_path)

pose_landmarker_results = []
keypoints_frames = []
detect_result_list = []
annotated_frames = []

#create detector
with vision.PoseLandmarker.create_from_options(options) as detector:
    for i in range(video.shape[0]):
        frame = video[i,:,:,:]

        mp_frame = mp.Image(image_format=mp.ImageFormat.SRGB,
                  data=frame)
        
        # detect_for_video expects the frame timestamp in milliseconds
        detector_result = detector.detect_for_video(mp_frame, timestamp_ms=int(i * 1000 / fps))

        annotated_frames.append(draw_landmarks_on_image(mp_frame.numpy_view(), detector_result))
        
        result_list = detector_result.pose_world_landmarks
        detect_result_list.append(result_list)
        keypoints_arr = []
        if len(result_list) == 0:
            keypoints_arr = np.full((1,33,5), fill_value=np.nan, dtype=np.float32)
        for idx in range(len(result_list)):
            pose_landmark = result_list[idx]
            kps = []
            for pl in pose_landmark:
                plkps = [pl.x, pl.y, pl.z, pl.visibility, pl.presence]
                kps.append(np.array(plkps))
                
                
            keypoints_arr.append(np.array(kps))
        keypoints_frames.append(np.array(keypoints_arr))
keypoints_frames = np.array(keypoints_frames)

# x coordinate of landmark 13 for the first 50 frames, then the total NaN count
print(keypoints_frames[:50, 0, 13, 0])
print(np.sum(np.isnan(keypoints_frames)))
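
In video mode, detect_for_video takes each frame's timestamp in milliseconds, and the timestamps must increase from one frame to the next. A minimal sketch of deriving that value from a frame index and the frame rate is shown here; `frame_timestamp_ms` is a hypothetical helper introduced for illustration, not part of MediaPipe, and a constant frame rate is assumed.

```python
def frame_timestamp_ms(frame_index, fps):
    """Convert a frame index to a timestamp in milliseconds.

    Hypothetical helper (not a MediaPipe function). Assumes a constant
    frame rate, so frame i is shown at i / fps seconds.
    """
    if fps <= 0:
        raise ValueError("fps must be positive")
    return int(round(frame_index * 1000.0 / fps))


# Timestamps for the first four frames of a 30 fps video
timestamps = [frame_timestamp_ms(i, 30) for i in range(4)]
print(timestamps)  # [0, 33, 67, 100]
```

The result can be passed as the timestamp_ms argument of detect_for_video. Checking for a non-positive fps matters because cv2.CAP_PROP_FPS can report 0 for some files.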

Other info / Complete Logs

On Linux Python
=======================================================================

[ 0.01166975 -0.01325617 -0.03641914 -0.05470761 -0.07166258 -0.08079658
 -0.08318242 -0.08279768 -0.08881    -0.08541096 -0.08233733 -0.05183787
 -0.04390267 -0.02972412 -0.01699912 -0.00787161  0.00075246 -0.00227913
 -0.00521829 -0.00644223 -0.01679307 -0.03138978 -0.02844721 -0.05424543
 -0.07106167 -0.06548407 -0.06702822 -0.06672549         nan -0.19420004
 -0.22057012 -0.2284535  -0.19767897 -0.20616581 -0.21181265 -0.21647298
 -0.21128778 -0.03720265 -0.07819481         nan  0.00410479 -0.01395571
 -0.02828768 -0.03346587 -0.04220413 -0.04611417 -0.0616963  -0.08908865
 -0.00920953 -0.02047957]
1815
I0000 00:00:1716812751.904595  306817 task_runner.cc:85] GPU suport is not available: INTERNAL: ; RET_CHECK failure (mediapipe/gpu/gl_context_egl.cc:84) egl_initializedUnable to initialize EGL
=======================================================================

On Windows 11 Python
=======================================================================
[ 0.01165697 -0.02807238 -0.04704126 -0.05284797 -0.08323435 -0.08977968
 -0.10047425 -0.08969924 -0.0984091  -0.06916987 -0.03707258 -0.02857973
 -0.03240457 -0.01915146 -0.01363734 -0.00499233  0.0069503  -0.0032917
 -0.00622438 -0.0143904  -0.03090671 -0.04554779 -0.02894793 -0.07363385
 -0.06342393 -0.06826069 -0.06399996 -0.08967623  0.10805667 -0.08458084
 -0.07507367 -0.00968433 -0.05319788 -0.07317979 -0.07963884 -0.07092766
 -0.0653493  -0.06612252 -0.05678153 -0.05748312 -0.04911917 -0.04954029
 -0.03681698 -0.04459389 -0.06993935 -0.08679859 -0.09368424 -0.09877644
         nan  0.26285872]
660
=======================================================================
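
The totals above are easier to compare as missed frames. The standalone code fills every frame without a detection with a (1, 33, 5) block of NaN, which is 165 values, so, assuming no individual landmark field is NaN inside a detected pose, 1815 corresponds to 11 missed frames on Linux and 660 to 4 on Windows. A rough sketch of that count follows; the helper names are hypothetical and the demo array is synthetic, not the data from this issue.

```python
import numpy as np

NUM_LANDMARKS = 33  # landmarks per pose, as in the standalone code
NUM_FIELDS = 5      # x, y, z, visibility, presence


def missed_frame_indices(keypoints):
    """Return indices of frames in which every value is NaN."""
    keypoints = np.asarray(keypoints)
    # Flatten everything except the frame axis, then test each frame
    flat = np.isnan(keypoints).reshape(keypoints.shape[0], -1)
    return np.flatnonzero(flat.all(axis=1))


def count_missed_frames(keypoints):
    """Return how many frames contain no detection at all."""
    return int(missed_frame_indices(keypoints).size)


# Synthetic stand-in for keypoints_frames: 10 frames, 2 without a detection
demo = np.zeros((10, 1, NUM_LANDMARKS, NUM_FIELDS), dtype=np.float32)
demo[[2, 7]] = np.nan

print(count_missed_frames(demo))    # 2
print(int(np.isnan(demo).sum()))    # 330, i.e. 2 * 33 * 5
print(missed_frame_indices(demo))   # [2 7]
```

Saving keypoints_frames with np.save on each machine and comparing the returned indices would show whether the extra Linux misses fall on the same frames as the Windows ones.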

May 27 '24 12:05 snwnkim

Sorry, I forgot to include the video initialization code. I have updated the standalone code with the function extract_frames_fps(video_path).

May 28 '24 01:05 snwnkim

Hi @waterself,

Could you please confirm whether you are running both operating systems on the same machine or on separate machines?

Thank you!!

May 30 '24 08:05 kuaashish

Hello dear @kuaashish,

To be specific about my environment, I'll call the two computers A and B:

A: Laptop, Windows 11, AMD Ryzen 5 5600H
B: Desktop, Ubuntu 20.04 (Linux), Intel i9-10900X (no WSL; it runs only Linux)

So the results above were produced on two separate machines.

Thanks for your kindness.

May 30 '24 11:05 snwnkim

Hi @mbrenon,

Could you please look into this issue?

Thank you!!

May 30 '24 12:05 kuaashish
