Inconsistent TFLite Model Results Between detect.py and Custom Inference Code
Search before asking
- [X] I have searched the YOLOv5 issues and discussions and found no similar questions.
Question
Description:

Hi,

I've converted a YOLOv5s model to a TFLite model using the export.py script provided in YOLOv5. The exported file is named best-fp16.tflite. It works well with the official detect.py script: the predictions are accurate, with good bounding box alignment and confidence scores.
However, when I perform inference using my custom Python code, the results are noticeably worse:
- Bounding boxes are misaligned.
- Confidence scores are significantly lower.
Here is the detect.py output: [screenshot]

Here is the custom code output: [screenshot]
I know the labels in the output are incorrect (e.g., "person") because I forgot to update coco.yaml, but the main issue lies in the quality of the detections.

The input shape is [1, 640, 640, 3] and the output shape is [1, 25200, 7] (per prediction: 4 box coordinates, 1 objectness score, and 2 class probabilities).

Here is my custom code for detection:
```python
import cv2
import numpy as np
import tensorflow as tf

# Load TFLite model
interpreter = tf.lite.Interpreter(model_path="best-fp16.tflite")
interpreter.allocate_tensors()

# Get input and output details
input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()

# Function to preprocess the image
def preprocess_image(image_path, input_size=(640, 640)):
    # Load the image
    image = cv2.imread(image_path)
    if image is None:
        raise ValueError("Image not found or could not be loaded")
    # Convert image to RGB
    image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
    # Resize image to the model input size (640x640)
    image_resized = cv2.resize(image, input_size)
    # Normalize pixel values to [0, 1]
    image_resized = image_resized / 255.0
    # Add batch dimension (shape becomes [1, 640, 640, 3])
    input_image = np.expand_dims(image_resized, axis=0).astype(np.float32)
    return input_image, image  # Return processed image and original image

# Function to decode raw YOLOv5 output
def decode_output(output_data, input_size, confidence_threshold=0.2, iou_threshold=0.4):
    boxes = output_data[..., :4]        # Extract bounding box data
    confidence = output_data[..., 4]    # Extract objectness confidence scores
    class_probs = output_data[..., 5:]  # Extract class probabilities
    # Compute final confidence scores
    scores = confidence[..., np.newaxis] * class_probs
    # Filter predictions based on confidence threshold
    valid_detections = np.where(scores.max(axis=-1) > confidence_threshold)
    boxes = boxes[valid_detections]
    scores = scores[valid_detections]
    class_ids = np.argmax(scores, axis=-1)
    # Scale normalized boxes to input size (assumes input_size is square)
    input_h, input_w = input_size
    boxes[:, 0] *= input_w  # Scale x_center
    boxes[:, 1] *= input_h  # Scale y_center
    boxes[:, 2] *= input_w  # Scale width
    boxes[:, 3] *= input_h  # Scale height
    # Convert boxes from (x_center, y_center, width, height) to (xmin, ymin, xmax, ymax)
    boxes[:, 0] = boxes[:, 0] - boxes[:, 2] / 2  # xmin
    boxes[:, 1] = boxes[:, 1] - boxes[:, 3] / 2  # ymin
    boxes[:, 2] = boxes[:, 0] + boxes[:, 2]      # xmax (xmin + width)
    boxes[:, 3] = boxes[:, 1] + boxes[:, 3]      # ymax (ymin + height)
    # Perform Non-Maximum Suppression (NMS)
    nms_indices = tf.image.non_max_suppression(
        boxes,
        scores.max(axis=-1),
        max_output_size=50,  # Max number of detections
        iou_threshold=iou_threshold,
        score_threshold=confidence_threshold
    ).numpy()
    # Return final filtered boxes, scores, and class IDs
    return boxes[nms_indices], scores[nms_indices].max(axis=-1), class_ids[nms_indices]

# Function to draw bounding boxes on an image
def draw_boxes(image, boxes, scores, class_ids, input_size, rescale=False):
    image_draw = image.copy()
    if rescale:
        # Rescale boxes from the model input size back to the original image size
        original_h, original_w = image.shape[:2]
        boxes[:, [0, 2]] *= (original_w / input_size[0])  # Scale X coordinates
        boxes[:, [1, 3]] *= (original_h / input_size[1])  # Scale Y coordinates
    # Draw bounding boxes
    for box, score, class_id in zip(boxes, scores, class_ids):
        if score > 0.2:  # Debug: lower threshold for easier testing
            xmin, ymin, xmax, ymax = box.astype(int)
            label = f"Class {int(class_id)}: {score:.2f}"
            # Draw bounding box
            cv2.rectangle(image_draw, (xmin, ymin), (xmax, ymax), (0, 255, 0), 2)
            # Draw label
            cv2.putText(image_draw, label, (xmin, ymin - 10), cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0, 255, 0), 2)
    return image_draw

# Function to run inference and draw bounding boxes
def predict_and_draw_boxes(image_path, interpreter, input_size=(640, 640)):
    # Preprocess the image
    input_image, original_image = preprocess_image(image_path, input_size)
    print("Input image shape:", input_image.shape)        # Debugging info
    print("Original image shape:", original_image.shape)  # Debugging info
    # Set input tensor
    interpreter.set_tensor(input_details[0]['index'], input_image)
    # Run inference
    interpreter.invoke()
    # Get output tensor
    output_data = interpreter.get_tensor(output_details[0]['index'])
    print("Output data shape:", output_data.shape)  # Debugging info
    # Decode output
    boxes, scores, class_ids = decode_output(output_data[0], input_size, confidence_threshold=0.2)
    # Draw bounding boxes on the original image
    original_boxes_image = draw_boxes(original_image, boxes, scores, class_ids, input_size, rescale=True)
    # Convert RGB back to BGR before saving (preprocess_image converted the
    # original to RGB, and cv2.imwrite expects BGR)
    original_boxes_image = cv2.cvtColor(original_boxes_image, cv2.COLOR_RGB2BGR)
    # Save the image with the bounding boxes
    original_output_path = "detection_original.jpg"
    cv2.imwrite(original_output_path, original_boxes_image)
    print(f"Image with detections saved as {original_output_path}")

# Run prediction and draw bounding boxes
predict_and_draw_boxes("test2.jpg", interpreter)
```
How can I get the same results as detect.py? Do I need to add any preprocessing for the image or postprocessing for the output?

I want to match detect.py's results so that I can port the model to Flutter and run detections on phones.
Additional
No response
👋 Hello @yAlqubati, thank you for your interest in YOLOv5 🚀! It looks like you're experiencing differences between detect.py results and your custom inference script. This is a great question and an Ultralytics engineer will assist you soon!

For a 🐛 Bug Report, could you please confirm whether detect.py and your custom code are using the same preprocessing and postprocessing logic? Differences here often lead to inconsistencies in results.
If not already done, providing a minimum reproducible example (MRE) with a simplified version of your test image, model, and code is immensely helpful for debugging.
Requirements
Ensure that you are using Python>=3.8.0 with all necessary libraries installed and that the environment is correctly set up with matching TensorFlow Lite configurations.
Additional Debugging Tips
- Verify that both the `detect.py` script and your custom code are normalizing images in the same way (e.g., pixel value scaling to [0, 1], image resizing dimensions, etc.).
- Check that the YOLOv5 model postprocessing steps, such as Non-Maximum Suppression (NMS), confidence thresholds, and class probability decoding, are consistent with YOLOv5's implementation.
- Confirm that your input tensor shape (`[1, 640, 640, 3]`) matches what YOLOv5 expects, and ensure there are no deviations in input formats or scaling (a quick inspection sketch follows this list).
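As a quick way to act on the last tip, here is a minimal inspection sketch (assuming the model file is named `best-fp16.tflite` as in the report above):

```python
import tensorflow as tf

# Minimal sketch: inspect what the exported TFLite model actually expects.
# Assumes the file name "best-fp16.tflite" from the report above.
interpreter = tf.lite.Interpreter(model_path="best-fp16.tflite")
interpreter.allocate_tensors()

inp = interpreter.get_input_details()[0]
out = interpreter.get_output_details()[0]

print("input shape :", inp['shape'])         # expect [1, 640, 640, 3]
print("input dtype :", inp['dtype'])         # float32 for an FP16/FP32 export
print("input quant :", inp['quantization'])  # (0.0, 0) means no quantization
print("output shape:", out['shape'])         # expect [1, 25200, 7]
print("output quant:", out['quantization'])
```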
Once these items are cross-verified, aligning the results should be easier. An engineer will follow up shortly with further recommendations 🚀.
@yAlqubati the difference in results between detect.py and your custom code is likely due to inconsistencies in image preprocessing and/or output postprocessing. YOLOv5's detect.py handles these details precisely, so you will need to align your custom code accordingly.
Suggestions:

- Image Preprocessing:
  - Ensure that the input image is normalized to `[0, 1]` and matches the TFLite model's expected input format (see the letterbox sketch after this list).
  - Verify that the input tensor is scaled appropriately using quantization parameters if the TFLite model is quantized (e.g., FP16 or INT8). You can check this by inspecting `input_details[0]['quantization']`.
- Postprocessing:
  - YOLOv5 outputs are in normalized (0-1) `x_center, y_center, width, height` format. Ensure you correctly scale and convert these to `(xmin, ymin, xmax, ymax)` coordinates relative to the original image dimensions (see the coordinate rescaling sketch below).
  - Use non-maximum suppression (NMS) parameters consistent with `detect.py`. The confidence threshold and IoU threshold in your code (`confidence_threshold=0.2`, `iou_threshold=0.4`) may differ from the defaults in `detect.py`.
- Debugging:
  - Compare the input tensor passed to the TFLite model in your custom code with the one in `detect.py` to ensure they are identical.
  - Inspect the raw model outputs (before postprocessing) in both cases to confirm they match.
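One concrete preprocessing difference worth ruling out first: `detect.py` letterboxes the image (aspect-ratio-preserving resize plus gray padding) instead of stretching it with a plain `cv2.resize`, which is a common cause of misaligned boxes and depressed confidences. Below is a minimal letterbox sketch, simplified from YOLOv5's own `letterbox` utility (the padding color 114 is the YOLOv5 default; treat this as an approximation rather than the exact implementation):

```python
import cv2

def letterbox(image, new_size=640, color=(114, 114, 114)):
    """Resize to new_size x new_size keeping aspect ratio, padding the rest."""
    h, w = image.shape[:2]
    r = min(new_size / h, new_size / w)                  # scale ratio
    new_w, new_h = int(round(w * r)), int(round(h * r))  # unpadded target size
    resized = cv2.resize(image, (new_w, new_h), interpolation=cv2.INTER_LINEAR)
    dw, dh = (new_size - new_w) / 2, (new_size - new_h) / 2  # padding per side
    top, bottom = int(round(dh - 0.1)), int(round(dh + 0.1))
    left, right = int(round(dw - 0.1)), int(round(dw + 0.1))
    padded = cv2.copyMakeBorder(resized, top, bottom, left, right,
                                cv2.BORDER_CONSTANT, value=color)
    # ratio and padding are needed later to map boxes back to the original image
    return padded, r, (dw, dh)
```

In the `preprocess_image` function above, this would replace the plain `cv2.resize` call, and the returned ratio and padding would be carried into postprocessing.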
For reference, you can review the YOLOv5 TFLite inference example provided in the YOLOv5 TFLite Export Guide. It includes preprocessing and postprocessing steps that align with detect.py.
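On the postprocessing side, once letterboxing is used the simple width/height ratio rescale in `draw_boxes` above no longer applies: the padding must be subtracted and the scale ratio divided out, mirroring what YOLOv5's `scale_boxes` helper does. A sketch under that assumption, consuming the `ratio` and `(dw, dh)` returned by the letterbox function above:

```python
import numpy as np

def undo_letterbox(boxes, ratio, pad, original_shape):
    """Map (xmin, ymin, xmax, ymax) boxes from letterboxed model space back to
    original image pixels. ratio and pad come from the letterbox sketch above."""
    dw, dh = pad
    boxes = boxes.copy().astype(np.float32)
    boxes[:, [0, 2]] = (boxes[:, [0, 2]] - dw) / ratio  # undo x padding + scale
    boxes[:, [1, 3]] = (boxes[:, [1, 3]] - dh) / ratio  # undo y padding + scale
    h, w = original_shape[:2]
    boxes[:, [0, 2]] = boxes[:, [0, 2]].clip(0, w)      # clip to image bounds
    boxes[:, [1, 3]] = boxes[:, [1, 3]].clip(0, h)
    return boxes
```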
Let us know if you encounter further issues!
Thanks for the suggestions! I've verified that the image preprocessing follows your steps, and the model is FP32, not quantized; would this be an issue?

For postprocessing, I've ensured correct scaling and NMS parameters (confidence_threshold=0.2, iou_threshold=0.4). The input tensor shape ([1, 640, 640, 3]) is the same in both the custom code and detect.py. However, I couldn't print the output shape in detect.py. Could you provide guidance on how to resolve this, or any other steps I should check?

Let me know if there's anything else I can try!
@yAlqubati you're on the right track, and the FP32 model should not cause issues. To inspect the output shape in detect.py, you can modify the script to print the output tensor shape after inference by adding a line like print(output.shape) (where output is the raw model output). Additionally, ensure your custom code applies YOLOv5's exact preprocessing and postprocessing steps, including the sigmoid activation on output values (if needed) and proper handling of anchor grids. If the issue persists, comparing raw outputs (before postprocessing) between detect.py and your code may help identify discrepancies. Let us know if further assistance is needed!
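For orientation, the raw forward pass inside `detect.py`'s loop looks roughly like the excerpt below in recent YOLOv5 versions (variable names and placement vary by version, so treat this as a sketch rather than exact code); the added `print` exposes the raw output shape before NMS:

```python
# Approximate excerpt from detect.py's per-image loop (varies by version)
pred = model(im, augment=augment, visualize=visualize)

# Added debugging line: for this model the raw output should be (1, 25200, 7)
print("raw output shape:", pred.shape if hasattr(pred, "shape") else [p.shape for p in pred])

pred = non_max_suppression(pred, conf_thres, iou_thres, classes, agnostic_nms, max_det=max_det)
```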
👋 Hello there! We wanted to give you a friendly reminder that this issue has not had any recent activity and may be closed soon, but don't worry - you can always reopen it if needed. If you still have any questions or concerns, please feel free to let us know how we can help.
For additional resources and information, please see the links below:
- Docs: https://docs.ultralytics.com
- HUB: https://hub.ultralytics.com
- Community: https://community.ultralytics.com
Feel free to inform us of any other issues you discover or feature requests that come to mind in the future. Pull Requests (PRs) are also always welcomed!
Thank you for your contributions to YOLO 🚀 and Vision AI ⭐