When converting to onnx, can the output be divided into information about classes and information about boxes?

Open leeyunhome opened this issue 4 years ago • 9 comments

Hello,

Thanks to you, I have good material to study. [image] When converting to ONNX, can the output be split into class information and box information, in the form shown on the left of the image? If possible, could you tell me how?

Thank you.

leeyunhome avatar Feb 09 '21 02:02 leeyunhome

Hello, @linghu8812

[image] The 107 seems to be my number of classes plus 5 (102 + 5). Where does the 25200 in the middle come from?

Thank you.

leeyunhome avatar Feb 09 '21 09:02 leeyunhome

((640 / 8) * (640 / 8) + (640 / 16) * (640 / 16) + (640 / 32) * (640 / 32)) * 3
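The expression above can be checked with a few lines of Python. This is only a sketch: the function name `num_predictions` and its parameters are hypothetical names chosen for illustration, and the defaults (strides 8, 16, 32 and 3 anchor boxes per grid cell) are taken from the numbers used in this thread.

```python
def num_predictions(input_size, strides=(8, 16, 32), anchors_per_cell=3):
    """Count the rows in the flattened detection output.

    Each stride yields a square grid of (input_size // stride) cells per side,
    and every grid cell emits one prediction per anchor box.
    """
    cells = sum((input_size // s) ** 2 for s in strides)
    return cells * anchors_per_cell


# 640x640 input: (80*80 + 40*40 + 20*20) * 3
print(num_predictions(640))  # 25200
```

With a 640x640 input the three grids are 80x80, 40x40 and 20x20, which gives 8400 cells and 25200 predictions in total.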

linghu8812 avatar Feb 09 '21 09:02 linghu8812

Hello,

[image]

By changing the parameters of the torch.onnx.export function, I got classes and boxes as follows.

How can the dimensions of classes and boxes be explained?

Also, when loading this ONNX model in TensorRT for inference, can I ignore feature-map outputs such as 660 and 969, or do I have to handle them separately?

Thank you.

leeyunhome avatar Feb 14 '21 23:02 leeyunhome

((640 / 8) * (640 / 8) + (640 / 16) * (640 / 16) + (640 / 32) * (640 / 32)) * 3

Hello,

I don't understand this part. Can you explain it in words? I don't understand how a tensor of shape (1, 3, 640, 640) goes in and a tensor of shape (1, 25200, 107) comes out.

Thank you.

leeyunhome avatar Feb 15 '21 00:02 leeyunhome

((640 / 8) * (640 / 8) + (640 / 16) * (640 / 16) + (640 / 32) * (640 / 32)) * 3

I haven't understood this explanation yet. Can you explain?

leeyunhome avatar Mar 12 '21 07:03 leeyunhome

@leeyunhome The network outputs three feature maps; 640 / 8 is the width (or height) of one of those output feature maps.

linghu8812 avatar Mar 12 '21 08:03 linghu8812

Hello

Aren't 8, 16, and 32 the stride sizes?

I still don't fully understand.

Could you please let me know if there is any material to read to understand this? Thank you.

leeyunhome avatar Mar 12 '21 09:03 leeyunhome

Hi, I have a similar problem with Scaled YOLOv4. I run the model as a TensorRT plan in Triton Inference Server, and the model's output has shape [1, 65856, 85].

platform: "tensorrt_plan"
max_batch_size: 1
input {
  name: "images"
  data_type: TYPE_FP32
  dims: 3
  dims: 896
  dims: 896
}
output {
  name: "output"
  data_type: TYPE_FP32
  dims: 65856
  dims: 85
}
default_model_filename: "model.plan"

How can I get the boxes and classes? Thank you.
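One possible way to pull boxes and classes out of such an output is sketched below with NumPy. It rests on assumptions that are not confirmed in this thread: that each of the 85 values in a row is laid out as (cx, cy, w, h, objectness, 80 class scores), as in YOLOv5-style exports, and that the values are already decoded and passed through a sigmoid. The function name `split_predictions` and the threshold value are hypothetical. As a sanity check on the shape, (112 * 112 + 56 * 56 + 28 * 28) * 4 = 65856, which would match strides 8, 16, 32 on an 896x896 input with 4 anchors per cell.

```python
import numpy as np


def split_predictions(output, conf_threshold=0.25):
    """Split a (batch, num_predictions, 5 + num_classes) array.

    Assumed row layout (not confirmed by the thread):
    [cx, cy, w, h, objectness, class_0, ..., class_{n-1}]
    """
    boxes = output[..., :4]            # assumed (cx, cy, w, h)
    objectness = output[..., 4:5]      # kept 2-D so it broadcasts over classes
    class_scores = output[..., 5:]

    scores = objectness * class_scores  # per-class confidence
    class_ids = scores.argmax(axis=-1)  # best class for every row
    best_scores = scores.max(axis=-1)

    keep = best_scores > conf_threshold  # drop low-confidence rows
    return boxes[keep], best_scores[keep], class_ids[keep]


# Dummy output with a single confident prediction at row 10.
output = np.zeros((1, 65856, 85), dtype=np.float32)
output[0, 10, :4] = [100.0, 120.0, 40.0, 60.0]
output[0, 10, 4] = 0.9        # objectness
output[0, 10, 5 + 17] = 0.8   # score for class 17

boxes, scores, class_ids = split_predictions(output)
print(boxes.shape, class_ids)
```

The surviving boxes would normally still need non-maximum suppression before they are usable as final detections.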

sakulh avatar Mar 18 '21 15:03 sakulh

@leeyunhome Yes, 8, 16, and 32 are the strides. Please go through the paper rather than asking here; issues should mostly concern the code that is implemented. If you have questions about the logic behind the implementation, I suggest reading the papers or blog posts on Scaled-YOLOv4.

((640 / 8) * (640 / 8) + (640 / 16) * (640 / 16) + (640 / 32) * (640 / 32)) * 3

For each stride, a grid of cells is laid over the input, and you get 3 outputs (one per anchor box) for every cell in that grid. You can watch Aladdin Persson's YOLOv1 video to understand what a grid cell is and read the YOLOv2 paper to understand what anchor boxes are. Finally, if you have questions about how all these outputs are concatenated, you definitely have to read the Scaled-YOLOv4 paper.
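To make the grid-cell and anchor idea concrete, the sketch below maps a row index of the flattened output back to its stride, anchor, and grid cell. The ordering is an assumption: it supposes the three levels are concatenated in stride order 8, 16, 32, and that rows within a level are ordered by anchor, then grid row, then grid column, as in YOLOv5-style heads. The exported model in this repository may use a different order, and `locate_row` is a hypothetical helper name.

```python
def locate_row(index, input_size=640, strides=(8, 16, 32), anchors_per_cell=3):
    """Return (stride, anchor, grid_row, grid_col) for a flattened output row.

    Assumes levels are concatenated in the order given by `strides` and that
    rows inside a level are ordered anchor-major, then row, then column.
    """
    if index < 0:
        raise IndexError("row index must be non-negative")
    offset = 0
    for stride in strides:
        grid = input_size // stride                 # cells per side at this stride
        level_rows = anchors_per_cell * grid * grid
        if index < offset + level_rows:
            anchor, cell = divmod(index - offset, grid * grid)
            grid_row, grid_col = divmod(cell, grid)
            return stride, anchor, grid_row, grid_col
        offset += level_rows
    raise IndexError("row index is beyond the last prediction")


print(locate_row(0))      # (8, 0, 0, 0)
print(locate_row(19200))  # (16, 0, 0, 0)
print(locate_row(25199))  # (32, 2, 19, 19)
```

Under these assumptions, the first 19200 rows belong to the 80x80 grid, the next 4800 to the 40x40 grid, and the last 1200 to the 20x20 grid.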

bobbilichandu avatar Mar 18 '21 15:03 bobbilichandu