
Error when loading TFLite model on Android with GPU Delegate

Open mitsunami opened this issue 2 years ago • 15 comments

1. System information

  • OS Platform and Distribution: TFLite conversion on Windows 10 and run models on Android 13
  • TensorFlow installation: pip package
  • TensorFlow library: 2.14.0

2. Code

I understand the importance of providing reproducible code for better troubleshooting. Given the size and complexity of the model, I'm unable to provide a simplified version immediately; however, I am actively working on preparing one to help diagnose the issue more effectively.

In the meantime, if there are any insights, workarounds, how to debug, or known issues that you can infer from the error message I've shared below, it would be immensely helpful.

3. Failure after conversion

The model produces correct results on my PC, but an error occurs when trying to load it on an Android device. Any insights or solutions would be greatly appreciated. I'd like to know if there's something I'm missing or if this is a known issue.

5. Any other info / logs

Include any logs or source code that would be helpful to diagnose the problem. If including tracebacks, please include the full traceback. Large logs and files should be attached.

I'm currently trying to run a TFLite model on an Android mobile device using the GPU Delegate. While the converted TFLite model works perfectly on my PC, I encounter an error when trying to load it on the mobile device. Here's the error message I received:

Internal error: Failed to apply delegate: Failed to build program executable - Build program failure<source>:69:103: error: expected expression
{half4 second_value = read_imageh(src_tensor_1_image2d, smp_zero, (int2)(((0) * shared_int4_0.w + (())), ((0) * shared_int4_1.x + ((Z)))));
                                                                                                 ^
error: Compiler frontend failed (error code 63)
Falling back to OpenGL

TfLiteGpuDelegate Init: Batch size mismatch, exp
        at org.tensorflow.lite.NativeInterpreterWrapper.createInterpreter(Native Method)
        at org.tensorflow.lite.NativeInterpreterWrapper.init(NativeInterpreterWrapper.java:110)
        at org.tensorflow.lite.NativeInterpreterWrapper.<init>(NativeInterpreterWrapper.java:73)
        at org.tensorflow.lite.NativeInterpreterWrapperExperimental.<init>(NativeInterpreterWrapperExperimental.java:36)
        at org.tensorflow.lite.Interpreter.<init>(Interpreter.java:232)
        at com.XXX.ModelHelper.loadModelFromStorage(ModelHelper.kt:171)

While there's no issue loading the model with the TFLite CPU backend, I encounter the following error during execution. What's strange is that even though I'm feeding data of the same size on both the PC and the mobile device, this error only appears on mobile.

Shutting down VM
FATAL EXCEPTION: main
Process: com.XXX, PID: 8445
java.lang.IllegalStateException: Internal error: Unexpected failure when preparing tensor allocations: tensorflow/lite/kernels/reshape.cc:92 num_input_elements != num_output_elements (81920 != 1310720)
Node number 995 (RESHAPE) failed to prepare.
	at org.tensorflow.lite.NativeInterpreterWrapper.allocateTensors(Native Method)
	at org.tensorflow.lite.NativeInterpreterWrapper.allocateTensorsIfNeeded(NativeInterpreterWrapper.java:308)
	at org.tensorflow.lite.NativeInterpreterWrapper.run(NativeInterpreterWrapper.java:248)
	at org.tensorflow.lite.InterpreterImpl.runForMultipleInputsOutputs(InterpreterImpl.java:101)
	at org.tensorflow.lite.Interpreter.runForMultipleInputsOutputs(Interpreter.java:95)
	at com.XXX.ModelHelper.runModel(ModelHelper.kt:316)
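For context, the failing check in `reshape.cc` simply compares input and output element counts. The following is a simplified Python paraphrase of that check (not the actual TFLite source), plus the observation that the two counts in the log differ by exactly a factor of 16:

```python
from math import prod

def check_reshape(input_shape, output_shape):
    """Simplified paraphrase of the element-count check in
    tensorflow/lite/kernels/reshape.cc -- not the actual source."""
    n_in, n_out = prod(input_shape), prod(output_shape)
    if n_in != n_out:
        raise ValueError(
            f"num_input_elements != num_output_elements ({n_in} != {n_out})")

# The two counts from the log differ by exactly a factor of 16:
print(1310720 // 81920)  # -> 16
```

The exact factor-of-16 mismatch suggests a single dimension of the tensor feeding the RESHAPE node resolves differently on device than on PC.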

I have confirmed that the same error occurs on at least two different Android devices (Pixel 7 Pro and Galaxy S20 Exynos).

Thank you for your understanding and assistance. I truly appreciate any guidance you can provide.

mitsunami avatar Oct 16 '23 11:10 mitsunami

Hi @mitsunami

It is hard to say from the error log alone, but my guess is that there is a problem with the input data being passed on Android. Please check the input tensor shapes and resize them before inference accordingly. Also, check whether the data chunk being passed actually contains 81920 elements.

Another possibility: if the loaded model does not have a defined batch size, the converter will fix the batch size at 1, and if you then evaluate it with a different batch size, you are likely to end up with exactly the problem you are facing.
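To illustrate the batch-size point with a stdlib sketch (the shape below is hypothetical; the real model's shapes are not in the log): a model converted with the batch dimension fixed at 1 would expect 81920 elements, while feeding it batch-16 data would produce exactly the 1310720 seen in the error.

```python
def element_count(shape, batch=1):
    """Resolve a dynamic batch dimension (None) to a concrete value
    and count the total number of elements."""
    n = 1
    for d in shape:
        n *= batch if d is None else d
    return n

# Hypothetical input signature with a dynamic batch dimension:
sig = [None, 64, 64, 20]
print(element_count(sig, batch=1))   # -> 81920
print(element_count(sig, batch=16))  # -> 1310720
```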

Thanks.

pjpratik avatar Oct 18 '23 06:10 pjpratik

This issue is stale because it has been open for 7 days with no activity. It will be closed if no further activity occurs. Thank you.

github-actions[bot] avatar Oct 26 '23 01:10 github-actions[bot]

Hi @pjpratik

Thank you for your reply, and sorry for my late response. I've prepared reproducible code for the error. Could you please check it out? https://colab.research.google.com/gist/mitsunami/fa5cacac520bde7d446441155bbb7479/tensorflow-lite-debugger-colab.ipynb

When you run the code in the above Colab notebook, a TFLite model named model_fixed_batch.tflite will be generated. As mentioned, the model can be executed correctly on Colab. However, when trying to load this model on Android with the GPU Delegate option, an error occurs.

To reproduce this, please follow the steps below:

  1. Unzip the Android project ZIP file in the link: https://drive.google.com/file/d/1CvB8NemWQzaY0AaWcQEh1Y-3BeHN2uiS/view?usp=sharing
  2. Copy the TFLite file you generated earlier to the path android_repro\app\src\main\assets.
  3. Build the project using Android Studio, install it on the device, and run it. (This app is a slightly modified version of the StyleTransfer app from tensorflow/examples, designed to reproduce the error.)
  4. After running the app, when you change the Delegate to GPU and click the Run button, you should be able to see the error message in Android Studio's Logcat.

If you have any questions about the reproduction steps, please let me know.

Regarding the data size you pointed out, I have confirmed that the batch size is fixed at 1 during TFLite conversion. Also, since the error occurs during model loading, the input data size should not be an issue here.

Thanks for your support.

mitsunami avatar Oct 26 '23 15:10 mitsunami

Hi @pjpratik

I've sent a reproducible code for the error that I'm facing. If you could check it and provide an update, that would be great.

Thank you!

mitsunami avatar Oct 31 '23 10:10 mitsunami

Hi @mitsunami, I'm looking into this, but in the meantime you might want to check whether you do any broadcasting in your model; the GPU delegate generally does not handle this case very well currently, e.g. https://github.com/tensorflow/tensorflow/issues/60043 .
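To make the broadcasting suggestion concrete: elementwise ops (ADD, MUL, etc.) whose two inputs have different shapes trigger implicit broadcasting. A stdlib sketch of the standard right-aligned broadcasting rule, which you can use to spot such op pairs when inspecting your model's tensor shapes (the shapes below are illustrative, not from the model in question):

```python
from itertools import zip_longest

def broadcast_shape(a, b):
    """Compute the broadcast result shape of two shapes, aligning
    dimensions from the right (standard NumPy-style rule)."""
    out = []
    for x, y in zip_longest(reversed(a), reversed(b), fillvalue=1):
        if x != y and 1 not in (x, y):
            raise ValueError(f"shapes {a} and {b} do not broadcast")
        out.append(max(x, y))
    return list(reversed(out))

# An elementwise op whose second input has fewer dims broadcasts:
print(broadcast_shape([1, 64, 64, 20], [20]))  # -> [1, 64, 64, 20]
```

If such a mismatched-shape elementwise op exists, one workaround is to materialize the smaller tensor to the full shape in the original model (e.g. with an explicit tile/repeat) before conversion, so the delegate never sees an implicit broadcast.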

pkgoogle avatar Oct 31 '23 19:10 pkgoogle

Hi @mitsunami, I was able to run your project on CPU. I tried to change the project to run on GPU by changing MainViewModel:31 from 0 to 1 (the constant for the GPU delegate), but that didn't seem to reproduce your issue. Can you explain to me how you

change the Delegate to GPU

So that I know we are doing the same thing. Thanks.

pkgoogle avatar Oct 31 '23 22:10 pkgoogle

Hi @pkgoogle,

Thanks for looking into this. Please do the following to change the Delegate to GPU: when the application is launched, the camera is activated. After taking a picture, the attached screen will appear; change the Delegate field circled in red to GPU, then press the RUN button below. (You don't have to change the MainViewModel code.)

Please let me know if you have any questions. Thanks.

[Screenshot: app screen with the Delegate field circled in red]

mitsunami avatar Nov 01 '23 14:11 mitsunami

Please note that the app is just a modified version of the existing StyleTransfer app for the purpose of loading the model in question. So if you press RUN, the app itself will work fine. However, when you press RUN, the model in question is loaded, and you should see the error that occurs when loading the model on Logcat in Android Studio. That is the issue I would like you to see.

Thank you.

mitsunami avatar Nov 01 '23 14:11 mitsunami

Got it. I did that and did not run into the same issue; I got some errors, but they seem unrelated:

[Screenshot: Logcat output showing unrelated errors]

TFLiteXNNPackDelegate seems to be used, though; maybe that's related to my using an emulator. Can you let me know what API level you are using on your Pixel 7 Pro?

pkgoogle avatar Nov 02 '23 20:11 pkgoogle

I haven't tried it with an emulator, but it might be related. I'm away from my PC right now, but I'll check with an emulator on my end later to see if I get the same result as you.

The API level I'm using should be 33.

Thanks.

mitsunami avatar Nov 02 '23 22:11 mitsunami

Hi @pkgoogle, I tried with an emulator, but it doesn't reproduce the error. Please use a physical device. I confirmed that the issue occurs on at least the two phones I've tried (Pixel 7 Pro and Galaxy S21 Exynos). Thanks.

mitsunami avatar Nov 03 '23 11:11 mitsunami

Hi @arfaian, can you please take a look? Thanks.

pkgoogle avatar Nov 03 '23 22:11 pkgoogle

Hi @arfaian, do you have any updates on this? If you could help on this, that would be great. Thanks.

mitsunami avatar Nov 13 '23 10:11 mitsunami

I also have the same issue here.

arifaizin avatar Nov 22 '23 06:11 arifaizin

I also experienced the same issue here.

yumtaufikhidayat avatar Feb 03 '24 13:02 yumtaufikhidayat

Same here

kaka-lin avatar Aug 12 '24 10:08 kaka-lin