tf.linalg.normalize generates wrong output in tflite version running on mobile GPU

Open MinaBabahaji-ML opened this issue 11 months ago • 2 comments

1. System information

  • Android S10 snapdragon 855
  • TensorFlow built from source
  • TensorFlow 2.15.0

2. Issue Summary:

When running a TensorFlow Lite model containing a tf.linalg.normalize layer on an Android 12 device with TensorFlow 2.15 built from source, the OpenCL delegate produces wrong outputs. This does not happen for all inputs, only for some. The model is converted with fp16 quantization. The issue cannot be reproduced with the XNNPACK delegate. The full model is shared in the GitHub repository, but I could reproduce the issue with the following dummy model:

import tensorflow as tf

def build_dummy_model(input_shape, name="dummy_model"):
    inputs = tf.keras.Input(shape=input_shape, name="input_layer")
    x_norm = tf.linalg.normalize(inputs, axis=2, name=f"{name}_norm_pred_bones")[0]
    model = tf.keras.Model(inputs=inputs, outputs=x_norm, name=name)
    return model

input_shape = (26, 3) 
model = build_dummy_model(input_shape)
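
A possible explanation, not confirmed anywhere in this thread: the failing rows in the OpenCL log come back as signed zeros, which is what an L2 normalization produces if the sum of squares is evaluated in half precision and overflows to infinity. The sketch below simulates that in pure Python by rounding every intermediate to fp16. The helper names (`to_fp16`, `normalize_fp16`, `normalize_ref`) and both input vectors are hypothetical and chosen only for illustration. Whether the GPU delegate actually computes this op in fp16 is an assumption.

```python
import math
import struct


def to_fp16(x):
    """Round a Python float to IEEE half precision; overflow becomes +/-inf."""
    try:
        return struct.unpack("<e", struct.pack("<e", x))[0]
    except OverflowError:
        # struct refuses finite values beyond the fp16 range (max 65504).
        return math.copysign(math.inf, x)


def normalize_fp16(vec):
    """L2-normalize vec, rounding every intermediate to half precision."""
    vec = [to_fp16(v) for v in vec]
    sum_sq = 0.0
    for v in vec:
        sum_sq = to_fp16(sum_sq + to_fp16(v * v))
    norm = to_fp16(math.sqrt(sum_sq))
    return [to_fp16(v / norm) for v in vec]


def normalize_ref(vec):
    """Reference L2 normalization in double precision."""
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec]


small = [3.0, -4.0, 1.0]        # hypothetical input: squares fit in fp16
large = [300.0, -400.0, 100.0]  # hypothetical input: 300**2 exceeds 65504

print("small fp16:", normalize_fp16(small))
print("small ref :", normalize_ref(small))
print("large fp16:", normalize_fp16(large))
print("large ref :", normalize_ref(large))
```

With these inputs the small vector normalizes to nearly the same values in both versions, while the large vector collapses to zeros that keep the sign of each input component. If this is the cause, only inputs with large enough magnitude would fail, which would match the report that some inputs are affected and others are not.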

3. Code

I did not encounter any issues converting the model, but when running the converted model on the phone with the OpenCL delegate, the output differs substantially from both the XNNPACK delegate output and the Keras model output. The code, model, and steps to reproduce the problem can be found in the following GitHub repository: GitHub Repository Link

4. Logs:

INFO: Created TensorFlow Lite delegate for GPU.
INFO: Initialized TensorFlow Lite runtime.
VERBOSE: Replacing 4 out of 4 node(s) with delegate (TfLiteGpuDelegateV2) node, yielding 1 partitions for the whole graph.
INFO: Initialized OpenCL-based API.
INFO: Created 1 GPU delegate kernels.
INFO: Created TensorFlow Lite XNNPACK delegate for CPU.
VERBOSE: Replacing 2 out of 4 node(s) with delegate (TfLiteXNNPackDelegate) node, yielding 3 partitions for the whole graph.
OpenCL output:
-0, 0, 0, -0, 0, 0, 0, 0, -0, -0.0643921, -0.828613, -0.555664, 0.235107, -0.74707, -0.62207, -0.233398, -0.95752, -0.165894, 0.585449, -0.73291, -0.346191, 0.0108871, -0.659668, -0.751465, 0.243408, -0.935059, -0.25708, -0, 0, -0, -0, 0, -0, -0.0350647, 0.999023, -0.00702286, -0, -0, -0, 0, 0, 0, -0.361328, 0.753418, 0.548828, 0.329346, 0.64209, -0.691895, 0.298828, 0.947754, -0.106201, -0.62207, -0.208618, -0.754395, 0.107788, 0.822266, -0.559082, 0.242432, 0.935059, 0.258545, 0.186646, 0.345703, -0.919922, -0.0539856, 0.639648, 0.76709, -0.193604, 0.981445, -0.0037384, -0.0666504, 0.289795, 0.955078, 0.345215, 0.792969, 0.502441, 0.0843506, -0.280762, 0.956055, 
xnnpack output:
-0.104339, 0.939571, 0.326067, -0.953553, 0.295249, 0.0597043, 0.886078, 0.461906, -0.0388445, -0.0644245, -0.828944, -0.555608, 0.235118, -0.747084, -0.62176, -0.23349, -0.958077, -0.166043, 0.585523, -0.733017, -0.346193, 0.0108894, -0.659484, -0.751639, 0.243398, -0.935215, -0.257159, -0.113086, 0.989401, -0.0910858, -0.119016, 0.962996, -0.241812, -0.0350888, 0.999359, -0.00702636, -0.999308, -0.0174536, -0.0328572, 0.999252, 0.0147311, 0.0357636, -0.361413, 0.753619, 0.549035, 0.329349, 0.642243, -0.692137, 0.298923, 0.948343, -0.106257, -0.622155, -0.208634, -0.754582, 0.107761, 0.822209, -0.558891, 0.242461, 0.935058, 0.258611, 0.186559, 0.345574, -0.91966, -0.0539725, 0.639653, 0.766767, -0.193588, 0.981076, -0.00373641, -0.0666187, 0.289595, 0.954828, 0.345324, 0.7928, 0.502215, 0.0843756, -0.28086, 0.956033, 
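
To see which rows disagree, the two log lines can be split into (x, y, z) triples and compared row by row. The following is a minimal sketch using only the first four rows copied from the logs above. The helper names and the tolerance of 1e-2 are illustrative assumptions, not part of the reproduction repository.

```python
# First four (x, y, z) rows copied from the OpenCL and XNNPACK logs above.
opencl_log = "-0, 0, 0, -0, 0, 0, 0, 0, -0, -0.0643921, -0.828613, -0.555664, "
xnnpack_log = (
    "-0.104339, 0.939571, 0.326067, -0.953553, 0.295249, 0.0597043, "
    "0.886078, 0.461906, -0.0388445, -0.0644245, -0.828944, -0.555608, "
)


def parse_rows(log_line, width=3):
    """Split a comma-separated log line into rows of `width` floats."""
    values = [float(tok) for tok in log_line.split(",") if tok.strip()]
    return [values[i:i + width] for i in range(0, len(values), width)]


def mismatched_rows(a_rows, b_rows, tol=1e-2):
    """Return (row index, max abs difference) for rows differing by more than tol."""
    bad = []
    for idx, (a, b) in enumerate(zip(a_rows, b_rows)):
        diff = max(abs(x - y) for x, y in zip(a, b))
        if diff > tol:
            bad.append((idx, diff))
    return bad


opencl_rows = parse_rows(opencl_log)
xnnpack_rows = parse_rows(xnnpack_log)
bad_rows = mismatched_rows(opencl_rows, xnnpack_rows)

for idx, diff in bad_rows:
    print(f"row {idx}: max abs diff {diff:.4f}, OpenCL row = {opencl_rows[idx]}")
```

On this excerpt, rows 0 to 2 are flagged because the OpenCL output is all zeros there, while row 3 agrees with XNNPACK to within about 3e-4. Pasting the full log lines into the two strings applies the same check to all 26 rows.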

MinaBabahaji-ML, Apr 02 '24 15:04

Hi @MinaBabahaji-ML, I followed your directions exactly with a fresh git clone, on this step:

adb shell "cd /data/local/tmp && LD_LIBRARY_PATH=. ./model_test --model=model_files/sample.tflite --output_shape=1,26,3"

I get this error:

terminating with uncaught exception of type std::runtime_error: You must provide input and output shapes.
Aborted 

I am using an emulator, but I do not believe this issue is related to that. You are more familiar with your code, so if you know how to fix this, please let me know.

pkgoogle, Apr 03 '24 17:04

Thank you, pkgoogle. Could you please confirm that after cloning the repository you checked out the right branch (git checkout normalize_layer)?

It seems that you are on the main branch.

MinaBabahaji-ML, Apr 03 '24 19:04

Hi @MinaBabahaji-ML, thank you. I have OpenCL installed on my Mac and my AVD has graphics set to Automatic, and I run into this issue:

adb shell "cd /data/local/tmp && LD_LIBRARY_PATH=. ./model_test --model=model_files/sample.tflite --output_shape=1,26,3"
INFO: Created TensorFlow Lite delegate for GPU.
INFO: Initialized TensorFlow Lite runtime.
VERBOSE: Replacing 4 out of 4 node(s) with delegate (TfLiteGpuDelegateV2) node, yielding 1 partitions for the whole graph.
ERROR: Can not open OpenCL library on this device - undefined symbol: clGetCommandBufferInfoKHR
ERROR: Falling back to OpenGL
ERROR: TfLiteGpuDelegate Init: [GL_INVALID_VALUE]: A numeric argument is out of range.
INFO: Created 0 GPU delegate kernels.
ERROR: TfLiteGpuDelegate Prepare: delegate is not initialized
ERROR: Node number 4 (TfLiteGpuDelegateV2) failed to prepare.
ERROR: Restored original execution plan after delegate application failure.
Segmentation fault 

This is likely an emulator problem. Hi @miaout17, can you please take a look? Thanks.

pkgoogle, Apr 04 '24 20:04