ImageEmbedderOption quantize behavior
Have I written custom code (as opposed to using a stock example script provided in MediaPipe)
Yes
OS Platform and Distribution
Raspberry Pi 5
MediaPipe Tasks SDK version
No response
Task name (e.g. Image classification, Gesture recognition etc.)
ImageEmbedder
Programming Language and version (e.g. C++, Python, Java)
Python
Describe the actual behavior
Inference with our custom .tflite model is slower with quantize set to False than with it set to True. In addition, when quantize=False the model output is float32, even though the model's real output tensor is specified as uint8.
Describe the expected behaviour
We expect that when quantize is False the computation should be faster, since no additional quantization operation should be performed, and we also expect a uint8 output, since our .tflite model returns a uint8 embedding when invoked directly with the TFLite Interpreter.
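For context on what a uint8 output tensor encodes: TFLite uint8 tensors use the standard affine (uniform) quantization scheme, real_value = scale * (quantized_value - zero_point). A minimal sketch of that relation, using illustrative scale/zero-point values not taken from any actual model:

```python
import numpy as np

# Affine quantization as used by TFLite uint8 tensors:
#   real_value = scale * (quantized_value - zero_point)
# The scale and zero_point below are illustrative placeholders.
scale, zero_point = 0.00390625, 128  # scale = 1/256

def quantize(x):
    """Map float values to uint8 codes under the affine scheme."""
    q = np.round(x / scale) + zero_point
    return np.clip(q, 0, 255).astype(np.uint8)

def dequantize(q):
    """Recover approximate float values from uint8 codes."""
    return scale * (q.astype(np.float32) - zero_point)

x = np.array([-0.5, 0.0, 0.25], dtype=np.float32)
q = quantize(x)
print(q)              # -> [  0 128 192]
print(dequantize(q))  # -> [-0.5   0.    0.25]
```

With these round numbers the round trip is exact; in general dequantization recovers the float values only up to the quantization step size.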
Standalone code/steps you may have used to try to get what you need
import time
import cv2
import numpy as np
import mediapipe as mp
from mediapipe.tasks import python
from mediapipe.tasks.python import vision

# Initialize the image embedder model
# (`model` is the path to the custom .tflite embedder)
base_options = python.BaseOptions(model_asset_path=model)
options = vision.ImageEmbedderOptions(
    base_options=base_options,
    running_mode=vision.RunningMode.IMAGE,
    l2_normalize=False,
    quantize=False,
)
embedder = vision.ImageEmbedder.create_from_options(options)

# Load and preprocess a single test image
image = cv2.imread('carpet.png')
image = cv2.resize(image, (256, 256), interpolation=cv2.INTER_LINEAR)
image = cv2.flip(image, 1)
rgb_image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
mp_image = mp.Image(image_format=mp.ImageFormat.SRGB, data=rgb_image)

# Run the embedder repeatedly and time each call
for i in range(100):
    start = time.time()
    result = embedder.embed(mp_image)
    print(time.time() - start)
    embedding = result.embeddings[0].embedding
    # The output here is float32, while our model produces uint8 results
    print(np.unique(embedding))
    print(len(np.unique(embedding)))

embedder.close()
Other info / Complete Logs
No response
Practically, my concern is to understand the implications of setting quantize to True or False in ImageEmbedderOptions when I provide a custom .tflite model that already produces a uint8 output tensor. In this case I would like to perform the dequantization manually, since it appears that MediaPipe does not read the .tflite quantization parameters.
In our current pipeline, your model output has to match the output format of the models that our tasks are designed to handle. It seems pretty likely that we are just passing through quantized data from your model as floats. You might be able to read the data back as uint, but that is not officially supported.
This issue has been marked stale because it has had no recent activity for 7 days. It will be closed if no further activity occurs. Thank you.
This issue was closed due to lack of activity after being marked stale for the past 7 days.