
Is it possible to use midas_small_v2_1 on coral edge tpu?

Open joaosbastos opened this issue 2 years ago • 10 comments

Hello,

Is it possible to use the midas_small_v2_1 model on a Coral Edge TPU? Is there a way to convert it?

Best regards.

joaosbastos avatar Jul 14 '21 14:07 joaosbastos

We don't have a Coral Edge TPU at our disposal to try, but after a quick look at the compatibility overview (https://coral.ai/docs/edgetpu/models-intro/#compatibility-overview) it seems that this should be possible in principle.

You can find a TF-Lite model of MiDaS here: https://github.com/intel-isl/MiDaS/releases/download/v2_1/model_opt.tflite. Perhaps you could try deploying this model and report back so that others could benefit from your insights?
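If you give it a try, a minimal sketch for loading and running that TF-Lite model with the Python interpreter (assuming the file is saved locally as model_opt.tflite) could look like this:

import numpy as np
import tensorflow as tf

# Load the TF-Lite model and inspect its I/O signature before deploying.
interpreter = tf.lite.Interpreter(model_path='model_opt.tflite')
interpreter.allocate_tensors()
input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()
print(input_details[0]['shape'], input_details[0]['dtype'])

# Run one inference on a dummy input matching the reported shape.
dummy = np.random.rand(*input_details[0]['shape']).astype(np.float32)
interpreter.set_tensor(input_details[0]['index'], dummy)
interpreter.invoke()
depth = interpreter.get_tensor(output_details[0]['index'])
print(depth.shape)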

ranftlr avatar Jul 15 '21 19:07 ranftlr

I tried compiling with

edgetpu_compiler model_opt.tflite

and got

Edge TPU Compiler version 15.0.340273435
Invalid model: model_opt.tflite
Model not quantized

Coral suggests full-integer quantization (see https://www.tensorflow.org/lite/performance/post_training_quantization?hl=nl), so I used the following Python script with the SavedModel (.pb):

import tensorflow as tf

converter = tf.lite.TFLiteConverter.from_saved_model('.')
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_dataset  # must be defined, see below
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.int8  # or tf.uint8
converter.inference_output_type = tf.int8  # or tf.uint8
tflite_quant_model = converter.convert()
with open('model.tflite', 'wb') as f:
  f.write(tflite_quant_model)

I got the following error:

ValueError: NodeDef mentions attr 'output_shapes' not in Op<name=StatelessIf; signature=cond:Tcond, input: -> output:; attr=Tcond:type; attr=Tin:list(type),min=0; attr=Tout:list(type),min=0; attr=then_branch:func; attr=else_branch:func>; NodeDef: {{node cond}}. (Check whether your GraphDef-interpreting binary is up to date with your GraphDef-generating binary.)

I'm still a beginner with TensorFlow.

joaosbastos avatar Jul 16 '21 13:07 joaosbastos

Googling suggests that this error is likely due to using a different TensorFlow version than the one that was used to build the model. Our model was tested with 2.3.0 (cf. https://github.com/intel-isl/MiDaS/tree/master/tf).
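You can check the version you are converting with in a couple of lines:

import tensorflow as tf
# The converter should match the version used to export the SavedModel
# (2.3.0 here; install it with e.g. pip install tensorflow==2.3.0).
print(tf.__version__)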

ranftlr avatar Jul 20 '21 08:07 ranftlr

I tried with 2.5.0 and that error is gone. Now the converter asks for a representative dataset, as suggested at https://www.tensorflow.org/lite/performance/post_training_quantization#full_integer_quantization_of_weights_and_activations:

Full integer quantization
You can get further latency improvements, reductions in peak memory usage, and compatibility with integer only hardware devices or accelerators by making sure all model math is integer quantized.

For full integer quantization, you need to calibrate or estimate the range, i.e., (min, max) of all floating-point tensors in the model. Unlike constant tensors such as weights and biases, variable tensors such as model input, activations (outputs of intermediate layers) and model output cannot be calibrated unless we run a few inference cycles. As a result, the converter requires a representative dataset to calibrate them. This dataset can be a small subset (around 100-500 samples) of the training or validation data. Refer to the representative_dataset() function below.


def representative_dataset():
  for data in tf.data.Dataset.from_tensor_slices((images)).batch(1).take(100):
    yield [tf.dtypes.cast(data, tf.float32)]
For testing purposes, you can use a dummy dataset as follows:


def representative_dataset():
    for _ in range(100):
      data = np.random.rand(1, 244, 244, 3)
      yield [data.astype(np.float32)]

Where can I get such a dataset?

joaosbastos avatar Jul 27 '21 14:07 joaosbastos

@joaosbastos You can use any 100-500 RGB images as a representative dataset to calibrate the range of values for quantization. You could try the RGB images from the ReDWeb dataset: https://drive.google.com/file/d/12IjUC6eAiLBX67jW57YQMNRVqUGvTZkX/view Or, better, use your own real images on which this model will be used.

AlexeyAB avatar Jul 28 '21 19:07 AlexeyAB

With your help, I managed to convert a model using the following Python script:

import tensorflow as tf
import cv2
import os
import numpy as np

def representative_data():
    a = []
    directory_rgb = r'ReDWeb_V1/Imgs'
    directory_depth = r'ReDWeb_V1/RDs'
    for filename in os.listdir(directory_rgb):
        if filename.endswith(".jpg"):
            print(os.path.join(directory_rgb, filename))
            print(os.path.join(directory_depth, filename).split('.')[0] + '.png')
            # Preprocess the same way the model expects: [0, 1] range,
            # 256x256 bicubic resize, channels-first (CHW) layout.
            img = cv2.imread(os.path.join(directory_rgb, filename))
            img = img / 255.0
            img = img.astype('float32')
            img = tf.image.resize(img, [256, 256], method='bicubic', preserve_aspect_ratio=False)
            img = tf.transpose(img, [2, 0, 1])
            a.append(img)
        else:
            continue
    a = np.array(a)
    print(a.shape)
    # Yield one calibration sample at a time, as the converter expects.
    ds = tf.data.Dataset.from_tensor_slices(a).batch(1)
    for i in ds.take(a.shape[0]):
        print(i)
        yield [i]

converter = tf.lite.TFLiteConverter.from_saved_model('.')
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_data
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.int8  # or tf.uint8
converter.inference_output_type = tf.int8  # or tf.uint8
tflite_quant_model = converter.convert()

with open('model.tflite', 'wb') as f:
    f.write(tflite_quant_model)
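As a sanity check, a short sketch like the following (assuming the converted file is model.tflite) can confirm the I/O tensors are really int8:

import tensorflow as tf
# Verify the quantized model's input/output tensor types before compiling.
interpreter = tf.lite.Interpreter(model_path='model.tflite')
interpreter.allocate_tensors()
print(interpreter.get_input_details()[0]['dtype'])   # expect numpy.int8
print(interpreter.get_output_details()[0]['dtype'])  # expect numpy.int8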

Then I ran edgetpu_compiler on the generated model and got this output:

edgetpu_compiler model.tflite
Edge TPU Compiler version 16.0.384591198
Started a compilation timeout timer of 180 seconds.

Model compiled successfully in 9208 ms.

Input model: model.tflite
Input size: 25.33MiB
Output model: model_edgetpu.tflite
Output size: 25.38MiB
On-chip memory used for caching model parameters: 80.00KiB
On-chip memory remaining for caching model parameters: 6.04MiB
Off-chip memory used for streaming uncached model parameters: 384.00KiB
Number of Edge TPU subgraphs: 1
Total number of operations: 17629
Operation log: model_edgetpu.log

Model successfully compiled but not all operations are supported by the Edge TPU. A percentage of the model will instead run on the CPU, which is slower. If possible, consider updating your model to use only operations supported by the Edge TPU. For details, visit g.co/coral/model-reqs.
Number of operations that will run on Edge TPU: 3
Number of operations that will run on CPU: 17626
See the operation log file for individual operation details.
Compilation child process completed within timeout period.
Compilation succeeded! 

And the compiler log:

Edge TPU Compiler version 16.0.384591198
Input: model.tflite
Output: model_edgetpu.tflite

Operator                       Count      Status

RESIZE_BILINEAR                5          More than one subgraph is not supported
SUB                            1          Mapped to Edge TPU
TRANSPOSE                      54         More than one subgraph is not supported
TRANSPOSE                      126        Operation is otherwise supported, but not mapped due to some unspecified limitation
MINIMUM                        48         More than one subgraph is not supported
CONCATENATION                  24         More than one subgraph is not supported
CONV_2D                        17001      More than one subgraph is not supported
MUL                            1          Mapped to Edge TPU
MUL                            72         More than one subgraph is not supported
SPLIT                          97         More than one subgraph is not supported
ADD                            99         More than one subgraph is not supported
RESHAPE                        1          More than one subgraph is not supported
RELU                           55         More than one subgraph is not supported
PAD                            1          Mapped to Edge TPU
PAD                            44         More than one subgraph is not supported

With so many operations running on the CPU, I don't think it is worth running this model on the Edge TPU. Is there a way to improve this?

joaosbastos avatar Aug 05 '21 14:08 joaosbastos

Thanks for sharing your results.

I'm a bit surprised that both CONV_2D and RELU seem not to be mapped to the TPU, as these are certainly supported in principle. Paging @AlexeyAB and @thias15, who might have some experience here.

ranftlr avatar Aug 05 '21 15:08 ranftlr

@joaosbastos

What command did you use to compile?

Try compiling with the -a flag, like this: edgetpu_compiler -sa model.tflite


There are 3 approaches/versions of EfficientNet:

  1. EfficientNet GPU/TPU: Depth-Wise-Conv2d, SE, Swish https://github.com/tensorflow/tpu/blob/master/models/official/efficientnet/efficientnet_builder.py#L173

  2. EfficientNet-Lite Mobile/TPU-edge: Depth-Wise-Conv2d, no-SE, ReLU6 https://github.com/tensorflow/tpu/blob/master/models/official/efficientnet/lite/efficientnet_lite_builder.py#L47

  3. EfficientNet-TPU-edge: Conv2d, no-SE, ReLU https://github.com/tensorflow/tpu/blob/master/models/official/efficientnet/edgetpu/efficientnet_edgetpu_builder.py#L53

We used the 2nd approach, and it should fit the Edge TPU based on this information: https://github.com/tensorflow/tpu/tree/master/models/official/efficientnet/lite

these EfficientNet-lite models run well on all mobile CPU/GPU/EdgeTPU
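For illustration, the activation difference between these variants is roughly the following (a sketch, not code from any of these repositories):

import tensorflow as tf

x = tf.constant([-3.0, 0.0, 3.0])
swish = x * tf.sigmoid(x)  # approach 1: Swish, harder to quantize to int8
relu6 = tf.nn.relu6(x)     # approach 2: ReLU6, quantization-friendly (EfficientNet-Lite)
relu = tf.nn.relu(x)       # approach 3: plain ReLU (EfficientNet-EdgeTPU)
print(swish.numpy(), relu6.numpy(), relu.numpy())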

AlexeyAB avatar Aug 05 '21 16:08 AlexeyAB

Reviewing my process:

  1. I downloaded the model from https://tfhub.dev/intel/midas/v2_1_small/1
  2. Converted using the script above
  3. Compiled using edgetpu_compiler -sa model.tflite

I got this from the compiler:

Edge TPU Compiler version 16.0.384591198
Started a compilation timeout timer of 180 seconds.
loc(fused["transpose_46", "transpose_1/perm"]): error: non-broadcastable operands
loc(fused["transpose_46", "transpose_1/perm"]): error: non-broadcastable operands
Compilation child process completed within timeout period.
Compilation failed! 

Given your comments, is the model from TensorFlow Hub valid for this?

joaosbastos avatar Aug 06 '21 09:08 joaosbastos

From the Coral documentation (https://coral.ai/docs/edgetpu/models-intro/#model-requirements):

Model requirements
If you want to build a TensorFlow model that takes full advantage of the Edge TPU for accelerated inferencing, the model must meet these basic requirements:

Tensor parameters are quantized (8-bit fixed-point numbers; int8 or uint8).
Tensor sizes are constant at compile-time (no dynamic sizes).
Model parameters (such as bias tensors) are constant at compile-time.
Tensors are either 1-, 2-, or 3-dimensional. If a tensor has more than 3 dimensions, then only the 3 innermost dimensions may have a size greater than 1.
The model uses only the operations supported by the Edge TPU (see table 1 below).

The limit of 3 dimensions per tensor may be the reason CONV_2D is not being mapped.
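A rough sketch to check this hypothesis (assuming model.tflite is the quantized model from above) is to list tensors that violate the 3-dimension rule quoted above:

import tensorflow as tf
# Flag tensors with more than 3 dimensions where a dimension outside the
# 3 innermost has a size greater than 1 (unsupported on the Edge TPU).
interpreter = tf.lite.Interpreter(model_path='model.tflite')
interpreter.allocate_tensors()
for t in interpreter.get_tensor_details():
    shape = t['shape']
    if len(shape) > 3 and any(d > 1 for d in shape[:-3]):
        print(t['name'], shape)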

joaosbastos avatar Aug 13 '21 15:08 joaosbastos