yolov5
TFLite+NMS and TFLite+agnosticNMS models do not output class information
Search before asking
- [X] I have searched the YOLOv5 issues and discussions and found no similar questions.
Question
I exported two TFLite models, one with NMS and another with agnostic NMS, by running
python export.py --weights /path/to/weights/best.pt --include tflite --img-size 640 640 --nms
and
python export.py --weights /path/to/weights/best.pt --include tflite --img-size 640 640 --agnostic-nms
respectively.
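For readers unfamiliar with the two flags, here is a rough pure-Python sketch of the general idea behind class-aware versus class-agnostic NMS. It is not the code that export.py embeds in the TFLite graph. The function names, the detection tuple layout and the 0.45 threshold are all made up for illustration.

```python
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def greedy_nms(dets, iou_thres=0.45, agnostic=False):
    """Greedy NMS over (x1, y1, x2, y2, score, class_id) tuples.

    agnostic=False: a box is only suppressed by a kept box of the same class.
    agnostic=True:  a box is suppressed by any overlapping kept box.
    """
    kept = []
    for det in sorted(dets, key=lambda d: d[4], reverse=True):
        suppressed = any(
            (agnostic or k[5] == det[5]) and iou(k[:4], det[:4]) > iou_thres
            for k in kept
        )
        if not suppressed:
            kept.append(det)
    return kept


# Illustrative detections: the first two overlap heavily but differ in class.
dets = [
    (0.0, 0.0, 10.0, 10.0, 0.9, 0),
    (1.0, 1.0, 11.0, 11.0, 0.8, 1),
    (50.0, 50.0, 60.0, 60.0, 0.7, 0),
]
per_class = greedy_nms(dets)
class_agnostic = greedy_nms(dets, agnostic=True)
```

With this toy input the class-aware variant keeps all three boxes, while the agnostic variant drops the second one because it overlaps a higher-scoring box of a different class.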
The output shapes and example outputs I get when I input a single image are:
torch.Size([1, 100, 4]) # NMS
tensor([[[204.64380, 83.38014, 595.74396, 557.17212],
[449.35837, 100.27388, 639.56506, 557.58087],
[ 0.00000, 0.00000, 0.00000, 0.00000],
[ 0.00000, 0.00000, 0.00000, 0.00000],
[ 0.00000, 0.00000, 0.00000, 0.00000],
[ 0.00000, 0.00000, 0.00000, 0.00000],
...
]]], device='cuda:0')
torch.Size([1, 100]) # agnostic NMS
tensor([[6400., 6400., 7040., 7040., 11., 11., 11., 11., 11., 11., 11., 11., 11., ...]], device='cuda:0')
However, according to the tf.image.combined_non_max_suppression documentation (https://www.tensorflow.org/api_docs/python/tf/image/combined_non_max_suppression), this line of code:
https://github.com/ultralytics/yolov5/blob/c215878f11d81808dfe4721795f0c105200e6601/models/tf.py#L448
should return:
| Output | Description |
|---|---|
| 'nmsed_boxes' | A [batch_size, max_detections, 4] float32 tensor containing the non-max suppressed boxes. |
| 'nmsed_scores' | A [batch_size, max_detections] float32 tensor containing the scores for the boxes. |
| 'nmsed_classes' | A [batch_size, max_detections] float32 tensor containing the class for boxes. |
| 'valid_detections' | A [batch_size] int32 tensor indicating the number of valid detections per batch item. Only the top valid_detections[i] entries in nms_boxes[i], nms_scores[i] and nms_class[i] are valid. The rest of the entries are zero paddings. |
but it seems that the only values that are returned are bboxes. So,
- Where do I get the class information from? Why is it not part of either the TFLite+NMS or the TFLite+agnosticNMS output? What am I missing here?
- Why is the output different for TFLite+NMS and TFLite+agnosticNMS, for the exact same image?
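For reference, the four tensors described in the TensorFlow documentation could be combined roughly as sketched below, once all of them are available. The box coordinates are copied from the example output above. The scores, class ids and the `trim_detections` helper are hypothetical and only illustrate how `valid_detections` is meant to cut off the zero padding.

```python
def trim_detections(boxes, scores, classes, valid_detections):
    """Drop zero padding from batched NMS outputs.

    boxes:            [batch][max_detections][4]
    scores, classes:  [batch][max_detections]
    valid_detections: [batch]
    Returns, per image, a list of (x1, y1, x2, y2, score, class_id) tuples.
    """
    results = []
    for img_boxes, img_scores, img_classes, n in zip(
        boxes, scores, classes, valid_detections
    ):
        rows = [
            (*img_boxes[i], float(img_scores[i]), int(img_classes[i]))
            for i in range(int(n))  # only the first n entries are valid
        ]
        results.append(rows)
    return results


# Boxes taken from the example output above, padded with zero rows.
boxes = [[
    [204.64380, 83.38014, 595.74396, 557.17212],
    [449.35837, 100.27388, 639.56506, 557.58087],
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
]]
# Assumed values for illustration only; the real model outputs were not shown.
scores = [[0.9, 0.8, 0.0, 0.0]]
classes = [[0.0, 0.0, 0.0, 0.0]]
valid_detections = [2]

detections = trim_detections(boxes, scores, classes, valid_detections)
```

The same function also works on NumPy arrays returned by an interpreter, since it only relies on indexing and iteration.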
@mikel-brostrom 👋 Hello! Thanks for asking about handling inference results. YOLOv5 🚀 PyTorch Hub models allow for simple model loading and inference in a pure Python environment without using detect.py, including for TF and TFLite inference by passing a local TF model, i.e.
model = torch.hub.load('ultralytics/yolov5', 'custom', 'yolov5s_fp16.tflite')
Simple Inference Example
This example loads a pretrained YOLOv5s model from PyTorch Hub as model and passes an image for inference. 'yolov5s' is the YOLOv5 'small' model. For details on all available models please see the README. Custom models can also be loaded, including custom trained PyTorch models and their exported variants, i.e. ONNX, TensorRT, TensorFlow, OpenVINO YOLOv5 models.
import torch
# Model
model = torch.hub.load('ultralytics/yolov5', 'yolov5s') # yolov5n - yolov5x6 official model
# 'custom', 'path/to/best.pt') # custom model
# Images
im = 'https://ultralytics.com/images/zidane.jpg' # or file, Path, URL, PIL, OpenCV, numpy, list
# Inference
results = model(im)
# Results
results.print() # or .show(), .save(), .crop(), .pandas(), etc.
results.xyxy[0] # im predictions (tensor)
results.pandas().xyxy[0] # im predictions (pandas)
# xmin ymin xmax ymax confidence class name
# 0 749.50 43.50 1148.0 704.5 0.874023 0 person
# 2 114.75 195.75 1095.0 708.0 0.624512 0 person
# 3 986.00 304.00 1028.0 420.0 0.286865 27 tie
results.pandas().xyxy[0].value_counts('name') # class counts (pandas)
# person 2
# tie 1
See YOLOv5 PyTorch Hub Tutorial for details.
Good luck 🍀 and let us know if you have any other questions!
Sorry, @glenn-jocher, but this has nothing to do with my question, which is about TFLite models exported with --nms or --agnostic-nms.
Hi @mikel-brostrom! When you use the yolov5/detect.py script, only the first output of the TFLite model is used in models/common.py, and that output is the box locations you mentioned above. The model output contains more than the boxes, though. You can check all of the outputs like this:
output_details = interpreter.get_output_details()
print(output_details)
[{'name': 'StatefulPartitionedCall:0', 'index': 877, 'shape': array([ 1, 100, 4], dtype=int32), 'shape_signature': array([ 1, 100, 4], dtype=int32), 'dtype': <class 'numpy.float32'>, 'quantization': (0.0, 0), 'quantization_parameters': {'scales': array([], dtype=float32), 'zero_points': array([], dtype=int32), 'quantized_dimension': 0}, 'sparsity_parameters': {}}, {'name': 'StatefulPartitionedCall:5', 'index': 831, 'shape': array([ 1, 1600, 3, 20], dtype=int32), 'shape_signature': array([ 1, 1600, 3, 20], dtype=int32), 'dtype': <class 'numpy.float32'>, 'quantization': (0.0, 0), 'quantization_parameters': {'scales': array([], dtype=float32), 'zero_points': array([], dtype=int32), 'quantized_dimension': 0}, 'sparsity_parameters': {}}, {'name': 'StatefulPartitionedCall:3', 'index': 880, 'shape': array([1], dtype=int32), 'shape_signature': array([1], dtype=int32), 'dtype': <class 'numpy.int32'>, 'quantization': (0.0, 0), 'quantization_parameters': {'scales': array([], dtype=float32), 'zero_points': array([], dtype=int32), 'quantized_dimension': 0}, 'sparsity_parameters': {}}, {'name': 'StatefulPartitionedCall:6', 'index': 816, 'shape': array([ 1, 400, 3, 20], dtype=int32), 'shape_signature': array([ 1, 400, 3, 20], dtype=int32), 'dtype': <class 'numpy.float32'>, 'quantization': (0.0, 0), '
...
I don't know why the outputs come in this order, but you can get the classes and scores by using the TFLite interpreter directly, like this.
Note: I checked this with the --nms option only. It may not work with --agnostic-nms.
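import tensorflow as tf
Interpreter = tf.lite.Interpreter  # or: from tflite_runtime.interpreter import Interpreter

interpreter = Interpreter(model_path=model_path)
interpreter.allocate_tensors()
input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()
# Feed the preprocessed image into the model's first (and only) input tensor
interpreter.set_tensor(input_details[0]['index'], input_data)
interpreter.invoke()
# Output order as observed with --nms: scores last, classes second to last
scores = interpreter.get_tensor(output_details[-1]['index'])[0]
classes = interpreter.get_tensor(output_details[-2]['index'])[0]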
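Since the output order is not obviously stable, one possible alternative to fixed positions such as [-1] and [-2] is to group the entries of output_details by shape and dtype. The sketch below is only that, a sketch. The helper names are made up, the first four example entries mirror the shapes printed above, and the last two [1, 100] entries are assumed from the TensorFlow documentation because the printed list was truncated.

```python
def dtype_name(dtype):
    """Return e.g. 'float32' for numpy.float32 or for the plain string 'float32'."""
    return getattr(dtype, "__name__", str(dtype))


def group_nms_outputs(output_details):
    """Group positions in output_details by what their shape and dtype suggest."""
    groups = {"boxes": [], "valid_detections": [], "per_detection": [], "other": []}
    for pos, detail in enumerate(output_details):
        shape = tuple(int(d) for d in detail["shape"])
        if len(shape) == 3 and shape[-1] == 4:
            groups["boxes"].append(pos)             # [batch, max_det, 4]
        elif len(shape) == 1 and dtype_name(detail["dtype"]).startswith("int"):
            groups["valid_detections"].append(pos)  # [batch]
        elif len(shape) == 2:
            groups["per_detection"].append(pos)     # scores or classes
        else:
            groups["other"].append(pos)             # e.g. raw detection heads
    return groups


def looks_like_class_ids(values):
    """Heuristic: class ids are whole numbers, scores usually are not.

    Ambiguous when there are no detections, because both vectors are all zeros.
    """
    return all(float(v).is_integer() for v in values)


example_details = [
    {"shape": [1, 100, 4], "dtype": "float32"},
    {"shape": [1, 1600, 3, 20], "dtype": "float32"},
    {"shape": [1], "dtype": "int32"},
    {"shape": [1, 400, 3, 20], "dtype": "float32"},
    {"shape": [1, 100], "dtype": "float32"},  # assumed entry, not in the printout
    {"shape": [1, 100], "dtype": "float32"},  # assumed entry, not in the printout
]
groups = group_nms_outputs(example_details)
```

Scores and classes share the same shape and dtype, so the two `per_detection` candidates still have to be told apart by looking at their values, for example with `looks_like_class_ids` on the tensors returned by `interpreter.get_tensor`.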
@YujiOshima There's an issue discussing the output recording of converted TFLite models if you're interested.