Add YOLOv7 quantization example
We used PaddleSlim ACT (Auto Compression Toolkit) to quantize and compress YOLOv7; the results on a T4 GPU are as follows:
| Model | Base mAP<sup>val</sup> 0.5:0.95 | Quant mAP<sup>val</sup> 0.5:0.95 | Latency FP32 | Latency FP16 | Latency INT8 |
| --- | --- | --- | --- | --- | --- |
| YOLOv7 | 51.2 | 50.9 | 26.84 ms | 7.44 ms | 4.55 ms |
| YOLOv7-Tiny | 37.3 | 37.0 | 5.06 ms | 2.32 ms | 1.68 ms |
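For reference, an ACT run is driven by a small script plus a strategy config. The sketch below is a minimal, untested outline based on PaddleSlim's auto-compression examples, not the exact script in this PR; the dataloader and config values are placeholders, and paddleslim==2.3.4 (where an ONNX input is translated to Paddle via x2paddle internally) is assumed:

```python
# Minimal ACT sketch (assumptions: paddleslim==2.3.4, an ONNX input model,
# and config keys mirroring ACT's YAML examples).
from paddleslim.auto_compression import AutoCompression

def train_dataloader():
    # Placeholder: yield {input_name: np.ndarray} feed dicts built from
    # your calibration/training images.
    yield from ()

ac = AutoCompression(
    model_dir="yolov7-tiny.onnx",        # ACT translates ONNX -> Paddle first
    save_dir="./yolov7_tiny_quant",
    config={
        "Distillation": {"alpha": 1.0, "loss": "soft_label"},
        "Quantization": {"use_pact": True, "activation_bits": 8, "weight_bits": 8},
        "TrainConfig": {"train_iter": 3000, "learning_rate": 0.00003},
    },
    train_dataloader=train_dataloader(),
)
ac.compress()  # writes the quantized inference model into save_dir
```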
Please @WongKinYiu @AlexeyAB review, thx.
@nemonameless @HeungJunKim @Errol-golang @NicholasZolton @akashAD98 @adujardin @eshoyuan @UNeedCryDear Hi, this PR is about source-free compression training on YOLOv7. Anyone interested can try it!
@yghstill Hi. Can you guide me about the compression of the yolo-w6-pose model?
@pytholic I will try to compress yolov7-w6-pose.
@yghstill Thank you, will look forward to it!
@yghstill I was trying to do it myself, but it turns out that Paddle does not support Ubuntu 22.04 at the moment.
Hi @pytholic, Paddle supports Ubuntu 22.04 on the develop branch. You can install PaddlePaddle by referring to this link (https://www.paddlepaddle.org.cn/en), e.g.: `python -m pip install paddlepaddle-gpu==0.0.0.post112 -f https://www.paddlepaddle.org.cn/whl/linux/gpu/develop.html`
@WongKinYiu Hi KinYiu, do you have any suggestions on this PR adding automatic compression technology to YOLOv7? https://github.com/WongKinYiu/yolov7/pull/612
I think it's really great; I'm just not familiar with merging PRs in git, so let's wait for Alexey to check it.
Thanks very much for your reply; looking forward to Alexey's suggestions. @AlexeyAB
Hi, amazing work! However, I tried experimenting myself and didn't manage to make it work (different problems at different times, e.g. a Paddle class that expects 2 parameters but is given only 1, probably because a default value was removed, and so on). When I did get it to work, it was only for the version without the End2End class attached when converting to ONNX. So could you:
- modify/update the tutorial in light of these problems;
- specify whether it is only possible to run it with no End2End class attached, or explain how to make it work when the ONNX has that class integrated? Eager to test the results with the End2End class!
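(In case it helps others with the same question: "End2End" here refers to this repo's export.py attaching the NMS head to the exported ONNX graph. As a hedged sketch using export.py's existing flags, the two export variants would be:

```bash
# Plain detection graph (no End2End/NMS wrapper) - the variant that
# reportedly survives the Paddle conversion:
python export.py --weights yolov7-tiny.pt --grid --simplify --img-size 640 640

# With the End2End (NMS) head attached - the variant being asked about:
python export.py --weights yolov7-tiny.pt --grid --end2end --simplify \
    --topk-all 100 --iou-thres 0.65 --conf-thres 0.35 --img-size 640 640
```

Whether the quantization pipeline can consume the second graph is exactly the open question.)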
Hi, is it possible to get a quantized yolov7-tiny model?
I was not able to reproduce the results; I run into these errors:

```
2023-02-16 18:10:37,588-INFO: Now translating model from onnx to paddle.
2023-02-16 18:10:37,588-WARNING: __init__() missing 2 required positional arguments: 'input_shape_dict' and 'enable_onnx_checker'
2023-02-16 18:10:37,588-ERROR: x2paddle threw an exception, you can ask for help at: https://github.com/PaddlePaddle/X2Paddle/issues
```
@yghstill You can easily reproduce this with this Dockerfile:

```dockerfile
FROM paddlepaddle/paddle:2.3.2-gpu-cuda11.2-cudnn8
ARG DEBIAN_FRONTEND=noninteractive
RUN apt update && apt upgrade -y
RUN apt-get update && apt-get install ffmpeg libsm6 libxext6 -y
COPY calib_images /workspace/calib_images
# basically a script version of your ipynb
COPY quantize.py /workspace/quantize.py
WORKDIR /workspace
RUN wget https://paddle-slim-models.bj.bcebos.com/act/yolov7-tiny.onnx
# opencv-python and paddleslim are not part of the default container
RUN python3 -m pip install opencv-python paddleslim==2.3.4
ENTRYPOINT python3 /workspace/quantize.py
```
Update: I found that the error is in the ONNX decoder at /usr/local/lib/python3.7/dist-packages/x2paddle/decoder/onnx_decoder.py, more precisely in the class

```python
class ONNXDecoder(object):
    def __init__(self, onnx_model, input_shape_dict, enable_onnx_checker):
```

One can replace this with

```python
class ONNXDecoder(object):
    def __init__(self, onnx_model, input_shape_dict=None, enable_onnx_checker=False):
```

and it then runs through as described in the Python file. This can be scripted with sed:

```bash
sed -i 's/def __init__(self, onnx_model, input_shape_dict, enable_onnx_checker)/def __init__(self, onnx_model, input_shape_dict=None, enable_onnx_checker=False)/' /usr/local/lib/python3.7/dist-packages/x2paddle/decoder/onnx_decoder.py
```

This fix should be merged into the main branch, IMHO.
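(An alternative to editing site-packages, sketched here under the assumption that the installed ONNXDecoder has exactly the three-argument signature shown above, is to shim the defaults at runtime from your own script before PaddleSlim invokes x2paddle:

```python
# Hypothetical runtime shim - same effect as the sed patch, but without
# modifying the installed x2paddle files. Run this before AutoCompression.
from x2paddle.decoder import onnx_decoder

_orig_init = onnx_decoder.ONNXDecoder.__init__

def _patched_init(self, onnx_model, input_shape_dict=None, enable_onnx_checker=False):
    # Forward with defaults filled in, matching the sed-patched signature.
    _orig_init(self, onnx_model, input_shape_dict, enable_onnx_checker)

onnx_decoder.ONNXDecoder.__init__ = _patched_init
```
)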
Nevertheless, I cannot save the engine once the model is converted and inference has run; with YOUR yolov7-tiny.onnx it just breaks with the error message below. The command I used (on the docker image nvcr.io/nvidia/tensorrt:22.03-py3) was:

```bash
trtexec --onnx=/workspace/optimization/output/yolov7-tiny+x2paddle/ONNX/quant_model.onnx \
    --workspace=1024 \
    --calib=/workspace/optimization/output/yolov7-tiny+x2paddle/ONNX/calibration.cache \
    --int8 \
    --verbose \
    --saveEngine=/workspace/yolov7+x2paddle/ONNX/yolo_quantized.plan
```

```
[02/17/2023-15:56:00] [I] [TRT] [MemUsageChange] TensorRT-managed allocation in engine deserialization: CPU +0, GPU +7, now: CPU 0, GPU 7 (MiB)
[02/17/2023-15:56:00] [E] Saving engine to file failed.
[02/17/2023-15:56:00] [E] Engine set up failed
```
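(One hedged guess about "Saving engine to file failed": trtexec reports this when it cannot open the output path, and the --saveEngine directory above (/workspace/yolov7+x2paddle/ONNX/) differs from the --onnx directory and may not exist inside the container. Creating it first would rule that out:

```bash
mkdir -p /workspace/yolov7+x2paddle/ONNX
```
)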
@yghstill Could you maybe provide a Dockerfile that runs and calibrates a model to INT8 and creates a TRT engine from it?
@yghstill I have just tried this work; it is interesting. Now I want to convert the quantized model from Paddle to PyTorch and ONNX. How can I do that?
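(Not authoritative, but for the ONNX half: ACT's YOLO examples already write an ONNX copy of the quantized model, e.g. the quant_model.onnx used with trtexec above. If you only have the Paddle inference model (model.pdmodel/model.pdiparams), the usual route is the paddle2onnx CLI; the paths below are placeholders:

```bash
# Sketch: convert a quantized Paddle inference model to ONNX.
# (A --deploy_backend option exists in newer paddle2onnx releases for
# quantized models; check paddle2onnx --help for your version.)
paddle2onnx --model_dir ./yolov7_tiny_quant \
    --model_filename model.pdmodel \
    --params_filename model.pdiparams \
    --opset_version 13 \
    --save_file quant_model.onnx
```

There is no direct Paddle-to-PyTorch converter that I know of, so ONNX is the practical bridge for the PyTorch side as well.)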