
the results of original onnx and end2end onnx are different

demuxin opened this issue 3 months ago · 9 comments

@WongKinYiu Hi, I find that the results of the original ONNX and the end2end ONNX are different. Is this normal?

How to solve this issue?

demuxin avatar Mar 27 '24 06:03 demuxin

After further testing, only the models I've trained myself have this issue, but the models are exported in the same way, which is weird.

demuxin avatar Mar 27 '24 07:03 demuxin

Could you please provide more details or clarify the issue you're facing?

levipereira avatar Mar 27 '24 14:03 levipereira

I trained the yolov9 model on my own data and then converted the .pt model to an ONNX model. I found that the inference results of the ONNX model differ depending on whether or not end2end is used. Do you have any ideas for troubleshooting this?
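One way to narrow this down is to run both exports on the same preprocessed image (e.g. with onnxruntime) and diff the final detections. Below is a minimal sketch of such a comparison; the model inference itself is omitted, and the two detection lists are synthetic stand-ins for the original and end2end outputs:

```python
import numpy as np

def iou(a, b):
    """IoU of two [x1, y1, x2, y2] boxes."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter + 1e-9)

def unmatched(dets_a, dets_b, iou_thr=0.9):
    """Detections in dets_a with no same-class, high-IoU match in dets_b.
    Each detection is (box, score, class_id)."""
    return [d for d in dets_a
            if not any(iou(d[0], b) >= iou_thr and d[2] == c
                       for b, _, c in dets_b)]

# Synthetic stand-ins for the original vs end2end outputs.
original = [([10, 10, 50, 50], 0.90, 0), ([60, 60, 90, 90], 0.80, 1)]
end2end  = [([10, 10, 50, 50], 0.90, 0)]
print(unmatched(original, end2end))  # the box missing from the end2end output
```

Diffing in both directions (`unmatched(a, b)` and `unmatched(b, a)`) shows whether boxes are being dropped, added, or relabeled between the two exports.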

demuxin avatar Mar 29 '24 06:03 demuxin

Hi @levipereira @WongKinYiu, for the end2end model, can you point me to the implementation of the NMS? I've noticed that the model's results are always missing a few boxes, so there may be a difference in the NMS implementation.

[image attachment]

demuxin avatar Apr 01 '24 03:04 demuxin

Why do you want to implement NMS yourself? End2End already incorporates Efficient NMS as a built-in plugin. You can try changing the End2End parameters --topk-all, --iou-thres and --conf-thres.

Check this https://github.com/levipereira/triton-server-yolo/releases/tag/v0.0.1 https://github.com/levipereira/triton-server-yolo/?tab=readme-ov-file#evaluation-test-on-tensorrt

levipereira avatar Apr 07 '24 16:04 levipereira

Because I will use the yolov9 model on another platform that doesn't support TensorRT, I can't use the end2end model.

I run the yolov9 model with EfficientNMS_TRT for inference, and I've found that for certain boxes it doesn't select the category with the highest confidence, but rather the one with the second highest confidence.

Is this expected? What is the internal processing logic of EfficientNMS_TRT?

For example:

These are the confidences for the first six categories of three boxes, yet the output category from EfficientNMS_TRT is 5, not 0.

anchor1: 0.273078 0.000085 0.000005 0.000101 0.000071 0.266680
anchor2: 0.448257 0.000038 0.000004 0.000012 0.000069 0.255931
anchor3: 0.378077 0.000077 0.000005 0.000034 0.000093 0.254360
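For reference, a plain per-anchor argmax over the scores quoted above does pick class 0 for every anchor, while both class 0 and class 5 clear a typical confidence threshold; the 0.25 threshold below is an assumption, not taken from the thread:

```python
import numpy as np

# Confidences for the first six categories of the three anchors above.
scores = np.array([
    [0.273078, 0.000085, 0.000005, 0.000101, 0.000071, 0.266680],
    [0.448257, 0.000038, 0.000004, 0.000012, 0.000069, 0.255931],
    [0.378077, 0.000077, 0.000005, 0.000034, 0.000093, 0.254360],
])
print(scores.argmax(axis=1))         # per-anchor argmax picks class 0 everywhere
print((scores >= 0.25).sum(axis=1))  # two classes per anchor clear a 0.25 threshold
```

So each of these anchors carries two above-threshold class scores, which matters for how the plugin's filter stage treats them.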

demuxin avatar Apr 08 '24 01:04 demuxin

https://github.com/NVIDIA/TensorRT/tree/32c64a324e58f252eae4e5681f5c39dbe22ef2d5/plugin/efficientNMSPlugin

Did you check the plugin?

levipereira avatar Apr 18 '24 02:04 levipereira

Yes, I've read the source code of efficientNMSPlugin.

My understanding is that EfficientNMSFilter in the EfficientNMS plugin takes, for each anchor, the category with the highest confidence and filters it by the threshold.

This is the code of EfficientNMSFilter:

template <typename T>
__global__ void EfficientNMSFilter(EfficientNMSParameters param, const T* __restrict__ scoresInput,
    int* __restrict__ topNumData, int* __restrict__ topIndexData, int* __restrict__ topAnchorsData,
    T* __restrict__ topScoresData, int* __restrict__ topClassData)
{
    int elementIdx = blockDim.x * blockIdx.x + threadIdx.x;
    int imageIdx = blockDim.y * blockIdx.y + threadIdx.y;

    // Boundary Conditions
    if (elementIdx >= param.numScoreElements || imageIdx >= param.batchSize)
    {
        return;
    }

    // Shape of scoresInput: [batchSize, numAnchors, numClasses]
    int scoresInputIdx = imageIdx * param.numScoreElements + elementIdx;

    // For each class, check its corresponding score if it crosses the threshold, and if so select this anchor,
    // and keep track of the maximum score and the corresponding (argmax) class id
    T score = scoresInput[scoresInputIdx];
    if (gte_mp(score, (T) param.scoreThreshold))
    {
        // Unpack the class and anchor index from the element index
        int classIdx = elementIdx % param.numClasses;
        int anchorIdx = elementIdx / param.numClasses;

        // If this is a background class, ignore it.
        if (classIdx == param.backgroundClass)
        {
            return;
        }

        // Use an atomic to find an open slot where to write the selected anchor data.
        if (topNumData[imageIdx] >= param.numScoreElements)
        {
            return;
        }
        int selectedIdx = atomicAdd((unsigned int*) &topNumData[imageIdx], 1);
        if (selectedIdx >= param.numScoreElements)
        {
            topNumData[imageIdx] = param.numScoreElements;
            return;
        }

        // Shape of topScoresData / topClassData: [batchSize, numScoreElements]
        int topIdx = imageIdx * param.numScoreElements + selectedIdx;

        if (param.scoreBits > 0)
        {
            score = add_mp(score, (T) 1);
            if (gt_mp(score, (T) (2.f - 1.f / 1024.f)))
            {
                // Ensure the incremented score fits in the mantissa without changing the exponent
                score = (2.f - 1.f / 1024.f);
            }
        }

        topIndexData[topIdx] = selectedIdx;
        topAnchorsData[topIdx] = anchorIdx;
        topScoresData[topIdx] = score;
        topClassData[topIdx] = classIdx;
    }
}
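As quoted, the kernel writes out every (anchor, class) score element that crosses the threshold as its own candidate; in Python terms the selection is roughly the following (a simplified single-image sketch, ignoring the scoreBits packing and the capacity checks):

```python
import numpy as np

def efficient_nms_filter(scores, score_threshold, background_class=-1):
    """Simplified, single-image sketch of EfficientNMSFilter's selection:
    every (anchor, class) element crossing the threshold is kept as a
    separate candidate. scores has shape [numAnchors, numClasses]."""
    anchors, classes = np.nonzero(scores >= score_threshold)
    keep = classes != background_class  # drop the background class, if any
    anchors, classes = anchors[keep], classes[keep]
    return [(int(a), int(c), float(scores[a, c]))
            for a, c in zip(anchors, classes)]

# One anchor with two classes above a 0.25 threshold yields two candidates.
cands = efficient_nms_filter(np.array([[0.273078, 0.000085, 0.266680]]), 0.25)
print(cands)
```

Under this reading, one anchor can contribute several candidates with different classes, and which of them survives depends on the later sort and NMS stages rather than on a per-anchor argmax in this kernel.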

Can you explain how EfficientNMSFilter selects the category with the highest confidence for each anchor?

demuxin avatar Apr 18 '24 08:04 demuxin

https://github.com/NVIDIA/TensorRT/tree/28733f0fdccde2967fed395b06ca491af3a561a9/plugin/efficientNMSPlugin#limitations

levipereira avatar Apr 18 '24 20:04 levipereira