
ONNXRuntime-Cpp and ONNXRuntime Python give different results

Open · devendraswamy opened this issue 1 year ago · 6 comments

Search before asking

  • [X] I have searched the YOLOv5 issues and discussions and found no similar questions.

Question

I am facing a problem with the YOLOv5 model. When I test my Python ONNX code, all the bounding box (bbox) values are correct. However, when I perform the same process with my C++ code, I get incorrect bbox values.

The image is processed in the Python code with: image_data = np.expand_dims(image_data, axis=0) # Add batch dimension

and that image is fed to the Python .pyd file (the C++ inference file compiled to a .pyd), where inference runs via:

auto output_tensors = session.Run(Ort::RunOptions{ nullptr }, input_names, &input_tensor, 1, output_names, 1);

Additional

The compiled C++ code is:

#include <onnxruntime_cxx_api.h>
#include <pybind11/pybind11.h>
#include <pybind11/numpy.h>
// The four header names below were stripped when the comment was rendered;
// given the code that follows, they are most likely:
#include <string>
#include <vector>
#include <iostream>
#include <array>

using namespace std;
namespace py = pybind11;

class OnnxModel {
public:
OnnxModel(const std::string& model_path)
    : env(ORT_LOGGING_LEVEL_WARNING, "OnnxModel"),
      session(env, std::wstring(model_path.begin(), model_path.end()).c_str(), Ort::SessionOptions()) {
    Ort::AllocatorWithDefaultOptions allocator;

    // Get input and output names as Ort::AllocatedStringPtr
    Ort::AllocatedStringPtr input_name_alloc = session.GetInputNameAllocated(0, allocator);
    Ort::AllocatedStringPtr output_name_alloc = session.GetOutputNameAllocated(0, allocator);

    // Convert the Ort::AllocatedStringPtr to std::string using the get() method
    input_name = std::string(input_name_alloc.get());
    output_name = std::string(output_name_alloc.get());

    // Optional: Print the input and output names for debugging
    std::cout << "Input name: " << input_name << std::endl;
    std::cout << "Output name: " << output_name << std::endl;
}

// Accept a 4D numpy array: (batch_size, channels, height, width)
py::array_t<float> run(py::array_t<float> input_array) {
    // Request a buffer from the numpy array
    py::buffer_info buf = input_array.request();

    // Check that the input is indeed a 4-dimensional array
    if (buf.ndim != 4) {
        throw std::runtime_error("Input should be a 4-dimensional array (batch_size, channels, height, width)");
    }

    // Convert numpy array data to std::vector<float>
    std::vector<float> input_data(static_cast<float*>(buf.ptr), 
                                  static_cast<float*>(buf.ptr) + buf.size);

    // Run the inference
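    // Caution: the shape is hardcoded below; if the numpy input can ever differ
    // from (1, 3, 640, 640), derive the dimensions from buf.shape instead.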
    return run_inf(input_data, {1, 3, 640, 640});  // Adjust shape based on your model's input
}

py::array_t<float> run_inf(const std::vector<float>& input_data, const std::array<int64_t, 4>& input_shape) {
    // Create input tensor
    Ort::MemoryInfo memory_info = Ort::MemoryInfo::CreateCpu(OrtArenaAllocator, OrtMemTypeDefault);
    Ort::Value input_tensor = Ort::Value::CreateTensor<float>(
        memory_info,
        const_cast<float*>(input_data.data()),
        input_data.size(),
        input_shape.data(),
        input_shape.size()
    );

    // Prepare input and output names
    const char* input_names[] = { input_name.c_str() };
    const char* output_names[] = { output_name.c_str() };

    // Run the model
    auto output_tensors = session.Run(Ort::RunOptions{ nullptr }, input_names, &input_tensor, 1, output_names, 1);

    // Get the output data
    float* output_data = output_tensors[0].GetTensorMutableData<float>();
    size_t output_count = output_tensors[0].GetTensorTypeAndShapeInfo().GetElementCount();

    // Create a numpy array from the output data
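    // Note: this is a flat 1-D array; the Python caller must reshape it (e.g. to
    // (1, 25200, 85) for a standard yolov5s export at 640x640) before decoding
    // boxes, or its indexing will differ from the Python ONNXRuntime output.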
    return py::array_t<float>(output_count, output_data);
}

private:
Ort::Env env;
Ort::Session session;
std::string input_name;
std::string output_name;
};

PYBIND11_MODULE(onnx_loader, m) {
py::class_<OnnxModel>(m, "OnnxModel")
    .def(py::init<const std::string&>())
    .def("run", &OnnxModel::run);
}

Image preprocessing in the Python code:

Function to preprocess the image

def preprocess_image(image_path, input_size=(640, 640)):
    # Load the image using OpenCV in color mode
    image = cv2.imread(image_path, cv2.IMREAD_COLOR)
    if image is None:
        raise ValueError(f"Could not open or find the image: {image_path}")
    # Convert from BGR to RGB format
    image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
    # Resize the image to match the input size expected by the model
    image = cv2.resize(image, input_size)
    # Normalize the image to the [0, 1] range
    image = image.astype(np.float32) / 255.0
    # Rearrange to CHW format and add a batch dimension -> (1, C, H, W)
    image_data = np.transpose(image, (2, 0, 1))
    image_data = np.expand_dims(image_data, axis=0)
    print(f"Image preprocessed: type = {type(image_data)}, shape = {image_data.shape}")
    return image_data, image  # Return the preprocessed tensor and the resized image
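Putting the two pieces together, the end-to-end call path described above looks roughly like this (a sketch only; "model.onnx" and "test.jpg" are placeholder paths):

import numpy as np
import onnx_loader  # the compiled .pyd built from the C++ code above

model = onnx_loader.OnnxModel("model.onnx")
image_data, _ = preprocess_image("test.jpg")
output = np.asarray(model.run(image_data))
print(output.shape)  # 1-D flattened output from the C++ wrapper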

devendraswamy · Oct 01 '24

👋 Hello @devendraswamy, thank you for reaching out with your issue regarding YOLOv5 🚀! This is an automated response to guide you further, and an Ultralytics engineer will be with you soon.

Please make sure you are following our Tutorials to ensure your setup is correct. They provide a helpful starting point for concepts including Custom Data Training and Hyperparameter Evolution.

If this is a 🐛 Bug Report, please provide a minimum reproducible example so we can better assist you.

In your case, ensure that both Python and C++ environments use the same preprocessing steps and ONNX model settings. Discrepancies could lead to different outputs.
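As an illustration, here is a minimal parity check. This is a sketch only: it assumes the compiled module is importable as onnx_loader (matching the PYBIND11_MODULE name in the C++ code above) and that yolov5s.onnx is in the working directory.

import numpy as np
import onnxruntime as ort
import onnx_loader  # the pybind11 .pyd built from the C++ code above

# Use one identical tensor for both backends (stand-in for a preprocessed image)
x = np.random.rand(1, 3, 640, 640).astype(np.float32)

sess = ort.InferenceSession("yolov5s.onnx")
ref = sess.run(None, {sess.get_inputs()[0].name: x})[0]
cpp = np.asarray(onnx_loader.OnnxModel("yolov5s.onnx").run(x))

# The C++ wrapper returns a flat array, so compare flattened outputs
print("max abs diff:", float(np.abs(ref.ravel() - cpp.ravel()).max()))
print("allclose:", np.allclose(ref.ravel(), cpp.ravel(), atol=1e-4))

If the raw outputs already match here, the discrepancy lies in pre- or post-processing rather than in the C++ inference itself.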

Requirements

Ensure you have Python>=3.8.0 with all requirements.txt dependencies installed, and that you are using PyTorch>=1.8. To set up:

git clone https://github.com/ultralytics/yolov5  # clone
cd yolov5
pip install -r requirements.txt  # install

Environments

YOLOv5 can be run in various environments. Consider using resources with dependencies preinstalled, including CUDA/CUDNN, Python, and PyTorch.

Status

YOLOv5 CI

A green badge means all YOLOv5 GitHub Actions Continuous Integration (CI) tests are passing. These tests verify correct operation on macOS, Windows, and Ubuntu.

Introducing YOLOv8 🚀

Check out YOLOv8, our latest model designed for superior performance in object detection, segmentation, and classification. Get started with:

pip install ultralytics

Feel free to provide more details as needed. We'll get back to you soon! 😊

UltralyticsAssistant · Oct 01 '24

@devendraswamy it seems like the discrepancy might be due to differences in preprocessing or input tensor handling between your Python and C++ implementations. Ensure that both environments use the same image preprocessing steps and input tensor shapes. Additionally, verify that the ONNX model and runtime versions are consistent across both implementations. If the issue persists, consider checking the ONNX model's input and output names and dimensions in both environments to ensure they match.
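A quick way to check the I/O signature from Python (a sketch; compare the printed names and shapes with what the C++ constructor logs):

import onnxruntime as ort

session = ort.InferenceSession("yolov5s.onnx")
for i in session.get_inputs():
    print("input: ", i.name, i.shape, i.type)
for o in session.get_outputs():
    print("output:", o.name, o.shape, o.type)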

pderrenger · Nov 09 '24

Hello, can you provide a copy of the Python version of the ORT inference code? I have some questions on my side.

ZCzzzzzz · Dec 17 '24

@ZCzzzzzz hello, the Python ONNX inference code for YOLOv5 is already available in the repository. You can refer to the export.py script, which includes ONNX export and inference examples. If you have specific questions, feel free to share more details so we can assist further.

pderrenger · Dec 17 '24

I've converted yolov5s.pt to yolov5s.onnx and now my Python code looks like this:

import os
import cv2
import numpy as np
import onnxruntime as ort
from pathlib import Path
from tqdm import tqdm
from pycocotools.coco import COCO
from pycocotools.cocoeval import COCOeval
from utils.general import coco80_to_coco91_class
import json

DATASET_PATH = '/COCO2017'
MODEL_PATH = './yolov5s.onnx'
IMG_SIZE = 640
CONF_THRESH = 0.1
IOU_THRESH = 0.6

data_paths = {
    "train_images": os.path.join(DATASET_PATH, "train2017.txt"),
    "val_images": os.path.join(DATASET_PATH, "val2017.txt"),
    "annotations_train": os.path.join(DATASET_PATH, "annotations", "instances_train2017.json"),
    "annotations_val": os.path.join(DATASET_PATH, "annotations", "instances_val2017.json"),
}

def xywh2xyxy(x):
    y = np.copy(x)
    y[..., 0] = x[..., 0] - x[..., 2] / 2
    y[..., 1] = x[..., 1] - x[..., 3] / 2
    y[..., 2] = x[..., 0] + x[..., 2] / 2
    y[..., 3] = x[..., 1] + x[..., 3] / 2
    return y

def xyxy2xywh(x):
    y = np.copy(x)
    y[..., 0] = (x[..., 0] + x[..., 2]) / 2
    y[..., 1] = (x[..., 1] + x[..., 3]) / 2
    y[..., 2] = x[..., 2] - x[..., 0]
    y[..., 3] = x[..., 3] - x[..., 1]
    return y

def non_max_suppression(prediction, conf_thres=0.25, iou_thres=0.45, max_det=300):
    xc = prediction[..., 4] > conf_thres
    output = [np.zeros((0, 6))] * prediction.shape[0]
    for xi, x in enumerate(prediction):
        x = x[xc[xi]]
        if not x.shape[0]:
            continue
        x[:, 5:] *= x[:, 4:5]
        box = xywh2xyxy(x[:, :4])
        conf = x[:, 4]
        j = np.argmax(x[:, 5:], axis=1)
        x = np.concatenate((box, conf[:, None], j[:, None]), axis=1)[conf > conf_thres]

        if not x.shape[0]:
            continue
        c = x[:, 5:6] * 4096
        boxes, scores = x[:, :4] + c, x[:, 4]
        i = cv2.dnn.NMSBoxes(boxes.tolist(), scores.tolist(), conf_thres, iou_thres)
        output[xi] = x[i].reshape(-1, 6)[:max_det]
    return output

def preprocess_image(image_path):
    img = cv2.imread(image_path)
    img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
    img = cv2.resize(img, (IMG_SIZE, IMG_SIZE), interpolation=cv2.INTER_LINEAR)
    img = img.transpose(2, 0, 1).astype(np.float32)
    img /= 255.0
    return np.expand_dims(img, axis=0)

def infer_with_onnxruntime(session, img_tensor):
    input_name = session.get_inputs()[0].name
    outputs = session.run(None, {input_name: img_tensor})
    return outputs[0]

def save_coco_results(predictions, image_ids, coco, output_file):
    results = []

    # class_map = {i: cat_id for i, cat_id in enumerate(sorted(coco.getCatIds()))}
    class_map = coco80_to_coco91_class()
    # print("Class map:", class_map)

    for preds, img_id in zip(predictions, image_ids):
        for pred in preds[0]:
            # print("Shape of pred:", pred.shape)
            box = pred[:4]
            conf = pred[4]
            # print("Value of pred[5]:", pred[5])
            cls = int(pred[5])
            if cls >= len(class_map):
                print(f"Skipping invalid class index: {cls}")
                continue

            # category_id = class_map.get(cls, None)
            category_id = class_map[cls]
            # if category_id is None:
            #     print(f"Unknown class ID: {cls}, skipping...")
            #     continue

            print(f"Image ID: {img_id}, Category ID: {category_id}, Box: {box}, Score: {conf}")
            print(f"Predicted COCO80 cls: {cls}, Mapped COCO91 category_id: {category_id}")
            box = xyxy2xywh(np.array(box).reshape(1, 4))[0]
            box = [max(0, round(x, 3)) for x in box]
            results.append({
                "image_id": int(img_id),
                "category_id": category_id,
                "bbox": box,
                "score": round(conf, 5)
            })
    with open(output_file, 'w') as f:
        json.dump(results, f)

def run_inference(images_file, annotations_file, session, dataset_name):
    coco = COCO(annotations_file)
    with open(images_file) as f:
        image_paths = [line.strip() for line in f.readlines()]
    predictions, image_ids = [], []

    for img_path in tqdm(image_paths, desc=f"inference: {dataset_name}"):
        img_tensor = preprocess_image(os.path.join(DATASET_PATH, img_path))
        preds = infer_with_onnxruntime(session, img_tensor)
        preds = non_max_suppression(preds, conf_thres=CONF_THRESH, iou_thres=IOU_THRESH)

        predictions.append(preds)
        image_ids.append(Path(img_path).stem)

    output_file = f"coco_predictions_{dataset_name}.json"
    save_coco_results(predictions, image_ids, coco, output_file)
    coco_dt = coco.loadRes(output_file)
    coco_eval = COCOeval(coco, coco_dt, iouType="bbox")
    coco_eval.evaluate()
    coco_eval.accumulate()
    coco_eval.summarize()

def main():
    session = ort.InferenceSession(MODEL_PATH, providers=["CUDAExecutionProvider"])

    run_inference(data_paths["train_images"], data_paths["annotations_train"], session, "train2017")
    run_inference(data_paths["val_images"], data_paths["annotations_val"], session, "val2017")

if __name__ == "__main__":
    main()

This code uses onnxruntime for inference, but the final result is all zeros, like this:

 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.000
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.000
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.000
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.000
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.000
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.000
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.000
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.000
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.000
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.000
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.000
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.000

ZCzzzzzz · Dec 17 '24

It seems like the ONNX model is not producing valid predictions, leading to zero evaluation metrics. Here are a few suggestions to debug the issue:

  1. Verify Model Export: Ensure the ONNX model was exported correctly using the export.py script in the YOLOv5 repo. Use the following command and test the ONNX model after export:

    python export.py --weights yolov5s.pt --include onnx
    
  2. Input Preprocessing: Confirm that the image preprocessing in your inference script matches the preprocessing used during training/export. For YOLOv5, ensure the input image is resized to (640, 640), normalized to [0, 1], and converted to channel-first format.

  3. Non-Maximum Suppression (NMS): Check the implementation of your non_max_suppression function. It seems custom, and any errors in filtering detections could result in no valid predictions.

  4. Debug Predictions: Print the raw outputs from the ONNX model (outputs[0]) before applying NMS, and verify that valid bounding boxes, confidence scores, and class predictions are being produced; a minimal sketch is shown after this list.

  5. Class Mapping: Examine the correctness of the coco80_to_coco91_class() mapping. Any mismatch could result in invalid category IDs, which might cause the COCO evaluation to fail.

  6. ONNX Runtime Providers: Ensure you are using the correct hardware provider (CUDAExecutionProvider for GPU or CPUExecutionProvider for CPU). Mismatched providers can cause inference issues.
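
For point 4 above, here is a minimal debugging sketch that prints the raw model output before NMS (the image path is a placeholder, and the expected output shape assumes a standard yolov5s export at 640x640):

import cv2
import numpy as np
import onnxruntime as ort

img = cv2.imread("test.jpg")  # placeholder path; use one of your COCO images
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
img = cv2.resize(img, (640, 640)).astype(np.float32) / 255.0
x = img.transpose(2, 0, 1)[None]

session = ort.InferenceSession("yolov5s.onnx")
out = session.run(None, {session.get_inputs()[0].name: x})[0]
print("output shape:", out.shape)                   # typically (1, 25200, 85) for yolov5s at 640
print("max objectness:", float(out[..., 4].max()))  # should exceed CONF_THRESH on a real image
print("candidates > 0.25:", int((out[..., 4] > 0.25).sum()))

If max objectness stays near zero on a real image, the export or input layout is at fault; if it is high but the COCO metrics remain zero, look downstream at NMS, box scaling, or the category mapping.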

If the issue persists after these checks, try validating the ONNX model with the detect.py script in the YOLOv5 repo:

python detect.py --weights yolov5s.onnx --source path/to/image.jpg --img 640

Let us know if further assistance is needed!

pderrenger · Dec 17 '24