mediapipe
[Android] Selfie Segmentation slow inference, GPU not working
Have I written custom code (as opposed to using a stock example script provided in MediaPipe)
No
OS Platform and Distribution
Android 15
MediaPipe Tasks SDK version
com.google.mediapipe:tasks-vision:0.10.14
Task name (e.g. Image classification, Gesture recognition etc.)
Image Classification/Selfie Segmentation
Programming Language and version (e.g. C++, Python, Java)
Java/Kotlin
Describe the actual behavior
With the GPU delegate enabled, the app either crashes or inference fails
Describe the expected behaviour
GPU delegate should work, offering lower latency than the CPU delegate
Standalone code/steps you may have used to try to get what you need
NOTE: This is a duplicate of the issue I raised on the mediapipe-samples GitHub repo, posted here due to the lack of response there. https://github.com/google-ai-edge/mediapipe-samples/issues/535
The GPU delegate doesn't seem to work in the Selfie Segmentation demo for Android, and I get very poor performance on CPU. I need low latency for use in AR applications. https://github.com/google-ai-edge/mediapipe-samples/tree/main/examples/image_segmentation/android
Using the Selfie Segmenter model (selfie_segmenter.tflite) in live-stream mode with the CPU delegate, I get inference times of 90+ ms on average. Downsizing the input image by 60% brings this down to 60+ ms, but segmentation quality becomes unacceptably poor. The DeepLab v3 model (deeplab_v3.tflite) performs worse, at 200+ ms.
I have tried:
- Both category and confidence mask modes
- Front and back cameras
- Downsizing the input bitmap/MPImage
- Both the DeepLab v3 and Selfie Segmenter models
- com.google.mediapipe:tasks-vision versions 0.10.14 and 0.20230731

I am using a Google Pixel 9 device (Android 15), running on the CPU delegate.
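For reference, this is roughly how I downsize the camera frame before segmentation (a minimal sketch, not the exact sample code; the function name and the 0.4 scale factor are my own):

```kotlin
import android.graphics.Bitmap
import com.google.mediapipe.framework.image.BitmapImageBuilder
import com.google.mediapipe.framework.image.MPImage

// Shrink the camera Bitmap before wrapping it in an MPImage to reduce
// CPU inference time, at the cost of segmentation mask quality.
fun downscaleToMpImage(frame: Bitmap, scale: Float = 0.4f): MPImage {
    val scaled = Bitmap.createScaledBitmap(
        frame,
        (frame.width * scale).toInt(),
        (frame.height * scale).toInt(),
        /* filter = */ true
    )
    return BitmapImageBuilder(scaled).build()
}
```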
The problem appears to be the GPU delegate not working in the Android app, forcing the use of the much slower CPU delegate. When I enable the GPU delegate, inference fails completely, becomes extremely slow, or, on certain devices, the app appears to crash.
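For completeness, this is essentially how the segmenter is configured (a minimal sketch of the sample's setup, assuming a `context` is available; the listener bodies are elided). The only change from the working CPU path is `Delegate.GPU`:

```kotlin
import com.google.mediapipe.tasks.core.BaseOptions
import com.google.mediapipe.tasks.core.Delegate
import com.google.mediapipe.tasks.vision.core.RunningMode
import com.google.mediapipe.tasks.vision.imagesegmenter.ImageSegmenter

val baseOptions = BaseOptions.builder()
    .setModelAssetPath("selfie_segmenter.tflite")
    .setDelegate(Delegate.GPU) // Delegate.CPU works but is too slow
    .build()

val options = ImageSegmenter.ImageSegmenterOptions.builder()
    .setBaseOptions(baseOptions)
    .setRunningMode(RunningMode.LIVE_STREAM)
    .setOutputCategoryMask(true)
    .setResultListener { result, inputImage ->
        // Handle segmentation masks per frame.
    }
    .setErrorListener { e ->
        // Delegate failures surface here on some devices.
    }
    .build()

val segmenter = ImageSegmenter.createFromOptions(context, options)
```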
The web version works perfectly on the same device with the GPU delegate, with <3 ms inference. With CPU, inference shoots up to 120+ ms (as expected). https://github.com/google-ai-edge/mediapipe-samples/tree/main/examples/image_segmentation/js
Other info / Complete Logs