Feature Request: GPU support on Android?
On Android, I only see the CPU option for Gemma3, and the model runs very slowly. When will GPU acceleration be available?
Hoping for Qualcomm GPU/NPU support.
Just download a *.task file from Hugging Face and import it manually. It will give you an option to enable GPU acceleration (assuming your GPU supports it). gemma-3n-E2B-it-int4.task | gemma-3n-E4B-it-int4.task
Supposedly you can switch even the existing models to run on GPU by using the settings button in the upper-right corner, but I've never gotten a model to work on the GPU on Android. It just loads forever.
GPU definitely works for the Qwen 2.5-based models on a Pixel.
Phi-4, however, had issues with GPU:
Following operations are not supported by GPU delegate:
CAST: Tensor type(INT64) is not supported. ai_edge_torch.generative.utilities.converter.ExportableModule/ai_edge_torch.generative.examples.phi.phi4.Phi4Mini_module;130
But it works on CPU anyway. See my comment in issue #7.
Phi4/Qwen2.5 GPU wouldn't work on the Android Emulator sdk=36 images (at least for x86_64) due to missing libraries.
Also, for any devs who might be out there, there are 16 KB page-size alignment issues with the tasks_text library:
APK app-debug.apk is not compatible with 16 KB devices. Some libraries have LOAD segments not aligned at 16 KB boundaries:
lib/arm64-v8a/libmediapipe_tasks_text_jni.so
lib/arm64-v8a/libmediapipe_tasks_vision_image_generator_jni.so
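For anyone who wants to check a library locally, here is a minimal sketch of the kind of test behind that warning: it reads the ELF program headers of a `.so` and reports whether every LOAD segment declares an alignment of at least 16 KB. The function names are made up for illustration, the sketch assumes 64-bit little-endian ELF files (as used for arm64-v8a and x86_64), and it only looks at segment alignment, not at how the library is aligned inside the APK zip.

```python
import struct

PT_LOAD = 1            # program header type for loadable segments
PAGE_16K = 16 * 1024   # page size used by 16 KB Android devices


def load_segment_alignments(data: bytes) -> list[int]:
    """Return p_align for every PT_LOAD segment of a 64-bit little-endian ELF image."""
    if len(data) < 64 or data[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    # e_ident[4] is the class (2 = 64-bit), e_ident[5] the byte order (1 = little-endian).
    if data[4] != 2 or data[5] != 1:
        raise ValueError("this sketch only handles 64-bit little-endian ELF")

    # ELF64 header fields that follow the 16-byte e_ident array.
    (_type, _machine, _version, _entry, phoff, _shoff, _flags, _ehsize,
     phentsize, phnum, _shentsize, _shnum, _shstrndx) = struct.unpack_from(
        "<HHIQQQIHHHHHH", data, 16)

    alignments = []
    for i in range(phnum):
        # ELF64 program header: type, flags, offset, vaddr, paddr, filesz, memsz, align.
        (p_type, _pflags, _offset, _vaddr, _paddr,
         _filesz, _memsz, p_align) = struct.unpack_from(
            "<IIQQQQQQ", data, phoff + i * phentsize)
        if p_type == PT_LOAD:
            alignments.append(p_align)
    return alignments


def is_16k_compatible(data: bytes) -> bool:
    """True if every LOAD segment is aligned to a multiple of 16 KB."""
    return all(a >= PAGE_16K and a % PAGE_16K == 0
               for a in load_segment_alignments(data))
```

To use it, unzip the APK and pass the bytes of each library to `is_16k_compatible`, for example the contents of `lib/arm64-v8a/libmediapipe_tasks_text_jni.so`. A result of `False` would match the warning quoted above.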
I wonder whether the GPU option only works on the Pixel series or on high-end phones. Or does the app not check GPU memory? Gemma3-1b works fine in CPU mode but crashes when switched to GPU. Device: Redmi Note 13.
Dunno, I haven't tried any of the Gemma models, just Qwen2.5 and Phi-4. And in case anyone wants it, I put most of my changes in a PR (#14).
"Gemma-3-n-E2B" works on GPU on a Galaxy Tab S9 (Snapdragon 8 Gen 2 with the Adreno 740 GPU).
Interestingly, I'm seeing a strange issue on the S25 Ultra: sometimes it loads forever and sometimes it crashes the app.
@ltphung, regarding your report that Gemma3-1b works fine in CPU mode but crashes when switched to GPU on the Redmi Note 13:
The downloaded task file for the E2B model linked by @sevenreasons kind of works in GPU mode (though it stopped working after a reboot) on an Itel P55 5G, which has the identical SoC (MediaTek Dimensity 6080) and only half of your phone's RAM. The E4B model always crashes in GPU mode, but you might have better luck with your larger RAM, since I suspect the crashes are mainly due to the measly 6 GB of RAM on the P55.
It's not as speedy as the Pixel, but the performance is still quite acceptable.
Even on a OnePlus 13 (Snapdragon 8 Elite), the model outputs seemingly random gibberish on GPU. It's also worth noting that llama.cpp with the -ngl parameter shows similar issues with the Vulkan backend.
The endless loading on GPU seems to be fixed in the latest version. Gemma 3 1B and Gemma 3n E2B both work on GPU on a Pixel 9 Pro running Android 16 (BP31.250502.008). Loading the model takes longer on GPU for some reason, though.
Same issue on the S25+.
Same issue on GPU on the Xiaomi 15 (Snapdragon 8 Elite) too.
Thank you @ltphung for reporting this and everyone contributing! Sorry for the delay; this bug is logged & passed to the team. Please let us know if the issue persists in the latest version.
Hi all,
As we continue our investigation into this issue, we are requesting a full Bug Report. This will be a great help to our engineers in diagnosing the root cause.
We've put together a step-by-step guide on how to capture and share one here: https://github.com/google-ai-edge/gallery/blob/main/Bug_Reporting_Guide.md
Thank you for your ongoing help and patience.