
TensorRT INT8 calibration python API

Open HarrySm opened this issue 2 years ago • 9 comments

Hello, could you please help me here? It seems that the thread has been abandoned. Please have a look below:

https://forums.developer.nvidia.com/t/tensorrt-int8-calibration-python-api/227297

Thank you in advance.

Best regards, Harry

HarrySm avatar Sep 13 '22 09:09 HarrySm

Can I use the same scripts to first generate a quantized, INT8-calibrated engine and then run the validation for any classification model, for example resnet18, squeezenet, etc.?

I'm not sure about that, but different models often have different pre-processing, so just make sure you generate the correct inputs. I would recommend using https://github.com/NVIDIA/TensorRT/tree/main/tools/Polygraphy/examples/cli/convert/01_int8_calibration_in_tensorrt. All you need to do is prepare a data_loader.py that does the correct pre-processing.
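For reference, here is a minimal data_loader.py sketch along the lines of that Polygraphy example. The input name, shape, calibration image directory, and ImageNet-style normalization below are placeholders/assumptions; replace them with your model's real pre-processing.

import glob
import numpy as np
from PIL import Image

# Assumptions (replace with your model's real values):
INPUT_NAME = "input"                        # ONNX input tensor name
INPUT_SHAPE = (1, 3, 224, 224)              # NCHW
CALIB_DIR = "/path/to/calibration/images"   # a few hundred representative images
MEAN = np.array([0.485, 0.456, 0.406], dtype=np.float32)
STD = np.array([0.229, 0.224, 0.225], dtype=np.float32)

def preprocess(path):
    # Must match the pre-processing used for training and inference.
    img = Image.open(path).convert("RGB").resize((INPUT_SHAPE[3], INPUT_SHAPE[2]))
    x = np.asarray(img, dtype=np.float32) / 255.0
    x = (x - MEAN) / STD                    # broadcast over HWC
    x = x.transpose(2, 0, 1)[np.newaxis]    # HWC -> NCHW, add batch dim
    return np.ascontiguousarray(x)

def load_data():
    # Polygraphy calls this function and feeds each yielded dict as one calibration batch.
    for path in sorted(glob.glob(CALIB_DIR + "/*.jpg"))[:500]:
        yield {INPUT_NAME: preprocess(path)}

It would then be used roughly as in that example, e.g. polygraphy convert model.onnx --int8 --data-loader-script ./data_loader.py --calibration-cache model_calib.cache -o model_int8.engine (check the example's README for the exact command).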

I would like to do the exact same thing for detection models using this repo here, which is dedicated to EfficientDet. Is it possible to use it for others, for example yolov5?

Same as above, just make sure your pre-processing is correct.

zerollzeng avatar Sep 13 '22 10:09 zerollzeng

Is it normal to have a drop of 3% in accuracy from full-precision FP32 to INT8?

It might be expected since the accuracy didn't drop much. You can see if it helps to increase the number of images used for calibration.

zerollzeng avatar Sep 13 '22 10:09 zerollzeng

Thank you very much @zerollzeng for your reply. I will try the data_loader.py approach and see how it works for getting the correct pre-processing; I will let you know about the results.

HarrySm avatar Sep 13 '22 12:09 HarrySm

Hello @zerollzeng ,

I tried to run the object detection samples, which do not need any image pre-processing. It works fine with EfficientDet-D0, and the final output is below. I am now working out how to get the Top1 accuracy from the mAP.

However, using the yolov5n model ends with an error. Please have a look at the results for EfficientDet and the error for yolov5.

NOTE:

I am using eval_coco.py to run the validation and get this output.

Thanks.

EfficientDet results:

loading annotations into memory...
Done (t=1.86s)
creating index...
index created!
Loading and preparing results...
Converting ndarray to lists...
(495840, 7)
0/495840
DONE (t=10.34s)
creating index...
index created!
Running per image evaluation...
Evaluate annotation type *bbox*
DONE (t=133.74s).
Accumulating evaluation results...
DONE (t=36.88s).
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.311
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.482
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.328
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.110
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.360
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.506
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.274
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.424
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.449
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.174
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.531
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.671

yolov5 error:

Traceback (most recent call last):
  File "/path/to/TensorRT/samples/python/efficientdet/eval_coco.py", line 79, in <module>
    main(args)
  File "/path/to/TensorRT/samples/python/efficientdet/eval_coco.py", line 42, in main
    detections = trt_infer.infer(batch, scales, args.nms_threshold)
  File "/path/to/TensorRT/samples/python/efficientdet/infer.py", line 123, in infer
    boxes = outputs[1]
IndexError: list index out of range

HarrySm avatar Sep 13 '22 15:09 HarrySm

I am using eval_coco.py to run the validation and get this output.

Maybe you just can't apply efficientdet's eval scripts to yolov5 directly. Please solve it on your own because it's out of TensorRT's scope :-)

zerollzeng avatar Sep 14 '22 05:09 zerollzeng

Maybe you just can't apply efficientdet's eval scripts to yolov5 directly. Please solve it on your own because it's out of TensorRT's scope :-)

Hello @zerollzeng ,

I see, so it is only for EfficientDet because it uses AutoML, which only has scripts for EfficientDet.

However, I want to solve it on my own, and it is not clear how to do that with TensorRT; if you have a general script, that would be great. I have already read the online documentation, and it is not clear how to implement it, especially the IInt8EntropyCalibrator2.

The online documentation only lists the class and the methods to use, even here as well.

Question:

So I need a script to quantize yolov5 with INT8 calibration, then run the validation on the COCO dataset as the EfficientDet scripts do.

Thank you :)

HarrySm avatar Sep 14 '22 07:09 HarrySm

Hello @zerollzeng ,

I have another question about your suggestion here: I am not sure, but it seems to me that this script will generate a fake calibration cache for me.

Question 1:

So what is the difference between a fake calibration cache and a real calibration cache?

Question 2:

here

Question: So I need a script to quantize yolov5 with INT8 calibration, then run the validation on the COCO dataset as the EfficientDet scripts do.

I have found that maybe we can generate a calibrated yolov5 engine using the EfficientDet build_engine.py script. Could you please confirm that?

NOTE:

The main reason I am calibrating is that I am quantizing my network to INT8, and after that I will run the validation on the validation dataset to get the Top1 accuracy. Does the fake calibration affect my accuracy results?

HarrySm avatar Sep 14 '22 16:09 HarrySm

So what is the difference between a fake calibration cache and a real calibration cache?

The fake calibration cache is only used to test performance; to get the accuracy metric, you need the REAL calibration cache.

I have found that maybe we can generate a calibrated yolov5 engine using the EfficientDet build_engine.py script. Could you please confirm that?

I don't know the answer because I haven't done this before. But calibration is just about feeding the real inputs and selecting the right calibration algorithm (in your case EntropyCalibrator2 is the best choice for CNNs), so it doesn't matter how you implement the calibration interface. You can write it on your own, borrow someone else's code, or use Polygraphy; just make sure the inputs you feed for calibration are the same as those you feed for training and inference.
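To make the interface part concrete, here is a rough sketch of an entropy calibrator with the TensorRT Python API, assuming a single network input, batch size 1, and pycuda for the device buffer; the EfficientDet build_engine.py sample implements the same interface in more detail.

import os
import numpy as np
import pycuda.driver as cuda
import pycuda.autoinit  # creates a CUDA context
import tensorrt as trt

class EntropyCalibrator(trt.IInt8EntropyCalibrator2):
    """Feeds pre-processed batches to TensorRT during INT8 calibration."""

    def __init__(self, batch_generator, cache_file="calibration.cache"):
        trt.IInt8EntropyCalibrator2.__init__(self)
        self.batches = batch_generator     # yields pre-processed NCHW float32 arrays, all the same shape
        self.cache_file = cache_file
        self.device_input = None

    def get_batch_size(self):
        return 1                           # must match the batch dimension of the yielded arrays

    def get_batch(self, names):
        try:
            batch = np.ascontiguousarray(next(self.batches))
        except StopIteration:
            return None                    # tells TensorRT the calibration data is exhausted
        if self.device_input is None:
            self.device_input = cuda.mem_alloc(batch.nbytes)
        cuda.memcpy_htod(self.device_input, batch)
        return [int(self.device_input)]    # one device pointer per network input

    def read_calibration_cache(self):
        # Reusing an existing cache lets the engine be rebuilt without re-calibrating.
        if os.path.exists(self.cache_file):
            with open(self.cache_file, "rb") as f:
                return f.read()
        return None

    def write_calibration_cache(self, cache):
        with open(self.cache_file, "wb") as f:
            f.write(cache)

The calibrator is then attached to the builder config with config.set_flag(trt.BuilderFlag.INT8) and config.int8_calibrator = EntropyCalibrator(my_batches) before building the engine.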

zerollzeng avatar Sep 14 '22 16:09 zerollzeng

Thank you very much @zerollzeng for the explanations.

The EfficientDet script seems suitable for yolov5 calibration: it takes COCO images and the ONNX model as input, then generates a TensorRT engine.

However, for image classification I have found a script from Ryan McCormick that does the exact same thing as the EfficientDet one, except that it takes ImageNet as input and generates a real calibration cache. It will not generate the TensorRT engine itself because of a 'core dumped' error, so I will generate the engine using trtexec and the real calibration file generated from Ryan McCormick's script.
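For reference, the trtexec call I am planning to use looks roughly like this; the file names are placeholders, and the flags should be double-checked against trtexec --help for the installed TensorRT version.

trtexec --onnx=resnet18.onnx --int8 --calib=calibration.cache --saveEngine=resnet18_int8.engine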

I will let you know about my result :)

HarrySm avatar Sep 15 '22 07:09 HarrySm

Closing since there has been no activity for more than 14 days. Please reopen if you still have questions, thanks!

ttyio avatar Dec 12 '22 07:12 ttyio