Mediapipe Android Pose Landmarker doesn't return visibility

Open Astro-Abhi opened this issue 1 year ago • 3 comments

Have I written custom code (as opposed to using a stock example script provided in MediaPipe)

No

OS Platform and Distribution

Android 13

MediaPipe Tasks SDK version

Latest

Task name (e.g. Image classification, Gesture recognition etc.)

Pose Estimation

Programming Language and version (e.g. C++, Python, Java)

Kotlin

Describe the actual behavior

The documentation says the MediaPipe Pose Landmarker should return visibility and presence for each landmark, but the actual output only contains x, y and z.

Describe the expected behaviour

The visibility (a value between 0.0 and 1.0) should be returned along with x, y and z.

Standalone code/steps you may have used to try to get what you need

fun detectVideoFile(
        videoUri: Uri,
        inferenceIntervalMs: Long
    ): ResultBundle? {
        if (runningMode != RunningMode.VIDEO) {
            throw IllegalArgumentException(
                "Attempting to call detectVideoFile" +
                        " while not using RunningMode.VIDEO"
            )
        }

        // Inference time is the difference between the system time at the start and finish of the
        // process
        val startTime = SystemClock.uptimeMillis()

        var didErrorOccurred = false

        // Load frames from the video and run the pose landmarker.
        val retriever = MediaMetadataRetriever()
        retriever.setDataSource(context, videoUri)
        val videoLengthMs =
            retriever.extractMetadata(MediaMetadataRetriever.METADATA_KEY_DURATION)
                ?.toLong()

        // Note: We need to read width/height from frame instead of getting the width/height
        // of the video directly because MediaRetriever returns frames that are smaller than the
        // actual dimension of the video file.
        val firstFrame = retriever.getFrameAtTime(0)
        val width = firstFrame?.width
        val height = firstFrame?.height

        // If the video is invalid, returns a null detection result
        if ((videoLengthMs == null) || (width == null) || (height == null)) return null

        // Next, we'll get one frame every frameInterval ms, then run detection on these frames.
        val resultList = mutableListOf<PoseLandmarkerResult>()
        val numberOfFrameToRead = videoLengthMs.div(inferenceIntervalMs)

        for (i in 0..numberOfFrameToRead) {
            val timestampMs = i * inferenceIntervalMs // ms

            retriever
                .getFrameAtTime(
                    timestampMs * 1000, // convert from ms to micro-s
                    MediaMetadataRetriever.OPTION_CLOSEST
                )
                ?.let { frame ->
                    // Convert the video frame to ARGB_8888 which is required by the MediaPipe
                    val argb8888Frame =
                        if (frame.config == Bitmap.Config.ARGB_8888) frame
                        else frame.copy(Bitmap.Config.ARGB_8888, false)

                    // Convert the input Bitmap object to an MPImage object to run inference
                    val mpImage = BitmapImageBuilder(argb8888Frame).build()

                    // Run pose landmarker using MediaPipe Pose Landmarker API
                    poseLandmarker?.detectForVideo(mpImage, timestampMs)
                        ?.let { detectionResult ->
                            resultList.add(detectionResult)
                            Log.d("Result", "detectVideoFile: $detectionResult")
                        } ?: run {
                        didErrorOccurred = true
                        poseLandmarkerHelperListener?.onError(
                            "ResultBundle could not be returned" +
                                    " in detectVideoFile"
                        )
                    }
                }
                ?: run {
                    didErrorOccurred = true
                    poseLandmarkerHelperListener?.onError(
                        "Frame at specified time could not be" +
                                " retrieved when detecting in video."
                    )
                }
        }

        retriever.release()

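        // The loop above processes numberOfFrameToRead + 1 frames (0..n inclusive);
        // dividing by that count also avoids a division by zero for very short videos.
        val inferenceTimePerFrameMs =
            (SystemClock.uptimeMillis() - startTime).div(numberOfFrameToRead + 1)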

        return if (didErrorOccurred) {
            null
        } else {
            ResultBundle(resultList, inferenceTimePerFrameMs, height, width)
        }
    }

Other info / Complete Logs

PoseLandmarkerResult{timestampMs=4500, landmarks=[[<Normalized Landmark (x=0.7510272 y=0.20446618 z=0.40102187)>, <Normalized Landmark (x=0.75105053 y=0.18874092 z=0.38066855)>, <Normalized Landmark (x=0.7466296 y=0.18675937 z=0.38037464)>, <Normalized Landmark (x=0.74211013 y=0.18440872 z=0.38038114)>, <Normalized Landmark (x=0.759866 y=0.19065823 z=0.32648954)>, <Normalized Landmark (x=0.7619234 y=0.19038819 z=0.32652494)>, <Normalized Landmark (x=0.7635577 y=0.18983333 z=0.32619974)>, <Normalized Landmark (x=0.72407436 y=0.18027698 z=0.28573143)>, <Normalized Landmark (x=0.7521517 y=0.18874905 z=0.05386295)>, <Normalized Landmark (x=0.7346416 y=0.20954838 z=0.38131762)>, <Normalized Landmark (x=0.7433315 y=0.21185601 z=0.3133025)>, <Normalized Landmark (x=0.5645031 y=0.21988203 z=0.28452587)>, <Normalized Landmark (x=0.70417345 y=0.2326758 z=-0.15534207)>, <Normalized Landmark (x=0.39411977 y=0.24898353 z=0.2167109)>, <Normalized Landmark (x=0.5763115 y=0.2650254 z=-0.42504558)>, <Normalized Landmark (x=0.34445938 y=0.32853904 z=0.11054065)>, <Normalized Landmark (x=0.50650704 y=0.35534456 z=-0.6509123)>, <Normalized Landmark (x=0.33214808 y=0.34294757 z=0.09639784)>, <Normalized Landmark (x=0.48827752 y=0.37619108 z=-0.70628023)>, <Normalized Landmark (x=0.33551404 y=0.3498937 z=0.06902404)>, <Normalized Landmark (x=0.48986396 y=0.37434134 z=-0.6831321)>, <Normalized Landmark (x=0.34972334 y=0.34181398 z=0.09232962)>, <Normalized Landmark (x=0.49287707 y=0.36660773 z=-0.6412095)>, <Normalized Landmark (x=0.37140197 y=0.37960747 z=0.15066384)>, <Normalized Landmark (x=0.44864574 y=0.41226047 z=-0.15140514)>, <Normalized Landmark (x=0.08444536 y=0.366516 z=0.20823258)>, <Normalized Landmark (x=0.41422993 y=0.5776739 z=0.027013212)>, <Normalized Landmark (x=-0.023925573 y=0.3070272 z=0.29400828)>, <Normalized Landmark (x=0.30513224 y=0.7215388 z=-0.005944845)>, <Normalized Landmark (x=-0.03269866 y=0.30536428 z=0.30426502)>, <Normalized Landmark (x=0.2649097 y=0.7413746 z=-0.007665344)>, <Normalized Landmark (x=-0.058810532 y=0.26525003 z=0.28150305)>, <Normalized Landmark (x=0.36282322 y=0.7745156 z=-0.08562668)>]], worldLandmarks=[[<Landmark (x=0.46985126 y=-0.47196674 z=0.1478796)>, <Landmark (x=0.46258533 y=-0.51280403 z=0.14124393)>, <Landmark (x=0.46293238 y=-0.5128404 z=0.14193368)>, <Landmark (x=0.46281034 y=-0.51342845 z=0.14192939)>, <Landmark (x=0.47727853 y=-0.49941885 z=0.1216836)>, <Landmark (x=0.47812808 y=-0.4999135 z=0.12091541)>, <Landmark (x=0.47843117 y=-0.50093347 z=0.12147069)>, <Landmark (x=0.38121313 y=-0.5300129 z=0.10955334)>, <Landmark (x=0.44070554 y=-0.4793768 z=0.017568588)>, <Landmark (x=0.43289277 y=-0.46129113 z=0.14149714)>, <Landmark (x=0.44835123 y=-0.44567612 z=0.11649203)>, <Landmark (x=0.20222321 y=-0.423948 z=0.11564782)>, <Landmark (x=0.36089224 y=-0.36341637 z=-0.0664947)>, <Landmark (x=-0.056398958 y=-0.37365085 z=0.07438302)>, <Landmark (x=0.20612866 y=-0.26976034 z=-0.16823179)>, <Landmark (x=-0.08823899 y=-0.18717325 z=0.054798365)>, <Landmark (x=0.101558864 y=-0.1160048 z=-0.25389218)>, <Landmark (x=-0.099629074 y=-0.13966224 z=0.05096054)>, <Landmark (x=0.07877825 y=-0.07769191 z=-0.25890446)>, <Landmark (x=-0.09806381 y=-0.11433469 z=0.049022436)>, <Landmark (x=0.07216971 y=-0.062350847 z=-0.24894929)>, <Landmark (x=-0.08902025 y=-0.16402495 z=0.050266743)>, <Landmark (x=0.09604712 y=-0.09728256 z=-0.24684787)>, <Landmark (x=-0.050692163 y=-0.035051674 z=0.06190157)>, <Landmark (x=0.050913267 y=0.034398697 z=-0.060181618)>, <Landmark (x=-0.43571985 y=-0.031075954 z=0.0860064)>, <Landmark (x=0.0050947517 y=0.4016812 z=0.008675814)>, <Landmark (x=-0.5811336 y=-0.20900738 z=0.13575506)>, <Landmark (x=-0.14636354 y=0.77901226 z=0.012548447)>, <Landmark (x=-0.5930678 y=-0.22128525 z=0.13559675)>, <Landmark (x=-0.16841054 y=0.8227561 z=0.017086029)>, <Landmark (x=-0.63269955 y=-0.26378265 z=0.13518286)>, <Landmark (x=-0.12929541 y=0.88428783 z=-0.008642197)>]]

Astro-Abhi avatar Jun 13 '23 05:06 Astro-Abhi

@Astro-Abhi,

This issue is already known to us (see #4409). We are working on a fix and will let you know as soon as an update is available from our end. Thank you.

kuaashish avatar Jun 15 '23 06:06 kuaashish

I have the same issue. Is there any other way to get the visibility value?

kimgaheeme avatar Jul 26 '23 04:07 kimgaheeme

I'm facing the same problem: I'm getting neither the visibility nor the presence values.

ver0z avatar Aug 03 '23 21:08 ver0z

I had the same issue, but it seems to be solved after updating the library to 0.10.9.

This example is based on the Google example:

override fun draw(canvas: Canvas) {
        super.draw(canvas)
        results?.let { poseLandmarkerResult ->
            for(landmark in poseLandmarkerResult.landmarks()) {
                for(normalizedLandmark in landmark) {
                    canvas.drawPoint(
                        normalizedLandmark.x() * imageWidth * scaleFactor,
                        normalizedLandmark.y() * imageHeight * scaleFactor,
                        pointPaint
                    )
                    // visibility() returns an Optional; fall back to 0 when it is absent
                    val visibility: Float = normalizedLandmark.visibility().orElse(0.0F)
                    Log.i("Position X", normalizedLandmark.x().toString())
                    Log.i("Visibility", visibility.toString())
                }

                PoseLandmarker.POSE_LANDMARKS.forEach {
                    canvas.drawLine(
                        poseLandmarkerResult.landmarks()[0][it!!.start()].x() * imageWidth * scaleFactor,
                        poseLandmarkerResult.landmarks()[0][it.start()].y() * imageHeight * scaleFactor,
                        poseLandmarkerResult.landmarks()[0][it.end()].x() * imageWidth * scaleFactor,
                        poseLandmarkerResult.landmarks()[0][it.end()].y() * imageHeight * scaleFactor,
                        linePaint)
                }
            }
        }
    }

If you debug the landmark you can see this, so it works 😘

<Normalized Landmark (x=0.5446126 y=0.37933117 z=-0.82241935 visibility= Optional[0.99939287] presence=Optional[0.9798227])>

mocusez avatar Feb 22 '24 13:02 mocusez

Hi @Astro-Abhi,

It appears that the issue has been resolved, according to the findings by @mocusez. MediaPipe now provides visibility for both Android and JavaScript platforms. Please ensure you are using version 0.10.9 or the latest version, and let us know if you are still encountering similar output.
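On the JavaScript side, a minimal sketch of how the visibility score might be consumed once it is present. It assumes the result has the shape `{ landmarks: [[{ x, y, z, visibility }]] }`; the helper name `visibleLandmarks`, the 0.5 threshold and the sample data are hypothetical and not part of MediaPipe. A missing `visibility` field, as reported for older versions in this thread, is treated as 0.

```javascript
// Hypothetical helper, not part of MediaPipe: returns the landmarks of the
// first detected pose whose visibility reaches the threshold, tagged with
// their index. Assumes result.landmarks is an array of poses, each an array
// of { x, y, z, visibility } objects; a missing visibility counts as 0.
function visibleLandmarks(result, threshold = 0.5) {
  const firstPose = (result && result.landmarks && result.landmarks[0]) || [];
  return firstPose
    .map((landmark, index) => ({ index, ...landmark }))
    .filter((landmark) => (landmark.visibility ?? 0) >= threshold);
}

// Hand-written sample data (not real model output).
const sample = {
  landmarks: [[
    { x: 0.54, y: 0.38, z: -0.82, visibility: 0.99 },
    { x: 0.75, y: 0.2, z: 0.4 }, // no visibility field, as in older versions
    { x: 0.3, y: 0.7, z: 0.0, visibility: 0.2 },
  ]],
};

// Only the landmark at index 0 passes the default threshold.
console.log(visibleLandmarks(sample).map((landmark) => landmark.index));
```

Treating a missing score as 0 means an outdated SDK silently filters everything out instead of throwing, so an empty result here is also a hint to check the library version.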

Thank you!!

kuaashish avatar Apr 12 '24 09:04 kuaashish

Hi @kuaashish, what version is used in this code? The visibility is not returned here either.

MaryamBoneh avatar Apr 14 '24 11:04 MaryamBoneh

Hi @MaryamBoneh,

When reviewing the code in the JS file of the CodePen example, I noticed that it currently uses version 0.10.0. We believe the demo should always use the latest version. Thank you for bringing this to our attention.

However, you can remove the explicit version from the URL in the JS file. Change this:

const createPoseLandmarker = async () => {
  const vision = await FilesetResolver.forVisionTasks(
    "https://cdn.jsdelivr.net/npm/@mediapipe/tasks-vision@0.10.0/wasm"
  );

To this

const createPoseLandmarker = async () => {
  const vision = await FilesetResolver.forVisionTasks(
    "https://cdn.jsdelivr.net/npm/@mediapipe/tasks-vision/wasm"
  );

It should automatically fetch the newest stable version, which should return the visibility and presence as expected.
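The difference between the two URLs can be sketched with a small helper. `wasmBaseUrl` is a hypothetical name, not a MediaPipe API; it only assembles the jsDelivr path used in the snippets here, with or without an explicit version. The value 0.10.9 is the version reported in this thread as returning visibility.

```javascript
// Hypothetical helper, not part of MediaPipe: builds the jsDelivr base URL for
// the Tasks Vision wasm files. Passing a version pins that release; omitting
// it leaves the version out of the URL, so the CDN decides which one to serve.
function wasmBaseUrl(version) {
  const pkg = "@mediapipe/tasks-vision";
  const suffix = version ? `@${version}` : "";
  return `https://cdn.jsdelivr.net/npm/${pkg}${suffix}/wasm`;
}

console.log(wasmBaseUrl("0.10.9")); // pinned release
console.log(wasmBaseUrl());         // no explicit version
```

The returned string would be passed to `FilesetResolver.forVisionTasks(...)` as in the snippets here. Pinning a version known to return visibility may be preferable when reproducibility matters, while the unversioned URL follows new releases automatically.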

Thank you!!

kuaashish avatar Apr 15 '24 06:04 kuaashish

This issue has been marked stale because it has no recent activity since 7 days. It will be closed if no further activity occurs. Thank you.

github-actions[bot] avatar Apr 23 '24 01:04 github-actions[bot]

@kuaashish Yes, the visibility is returned in the latest version.

thank you

Astro-Abhi avatar Apr 23 '24 05:04 Astro-Abhi

Hi @Astro-Abhi,

Thank you for confirming. May we proceed to mark this issue as resolved and close its status?

kuaashish avatar Apr 23 '24 06:04 kuaashish

Yes, thank you

Astro-Abhi avatar Apr 23 '24 06:04 Astro-Abhi
