
Add YOLOv7 quantization example

Open yghstill opened this issue 2 years ago • 9 comments

We used PaddleSlim ACT(Auto Compression Toolkit) to quantize and compress YOLOv7, and the results on T4 are as follows:

| Model | Base mAPval 0.5:0.95 | Quant mAPval 0.5:0.95 | Latency FP32 | Latency FP16 | Latency INT8 |
| --- | --- | --- | --- | --- | --- |
| YOLOv7 | 51.2 | 50.9 | 26.84ms | 7.44ms | 4.55ms |
| YOLOv7-Tiny | 37.3 | 37.0 | 5.06ms | 2.32ms | 1.68ms |
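For a quick sanity check of what the T4 table implies, the speedups and accuracy drop for the full YOLOv7 row can be computed directly from the numbers above:

```python
# Speedup and accuracy drop implied by the T4 latency table above (YOLOv7 row).
fp32, fp16, int8 = 26.84, 7.44, 4.55   # latencies in ms
base_map, quant_map = 51.2, 50.9       # mAP 0.5:0.95 before/after quantization

speedup_vs_fp32 = fp32 / int8
speedup_vs_fp16 = fp16 / int8
map_drop = base_map - quant_map

print(f"INT8 vs FP32: {speedup_vs_fp32:.1f}x faster")  # ~5.9x
print(f"INT8 vs FP16: {speedup_vs_fp16:.1f}x faster")  # ~1.6x
print(f"mAP drop: {map_drop:.1f}")                     # 0.3
```

So quantization trades a 0.3-point mAP drop for roughly a 5.9x speedup over FP32 on this GPU.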

Please @WongKinYiu @AlexeyAB review, thx.

yghstill avatar Aug 24 '22 09:08 yghstill

@nemonameless @HeungJunKim @Errol-golang @NicholasZolton @akashAD98 @adujardin @eshoyuan @UNeedCryDear hi, this PR is about source-free compression training on YOLOv7. Anyone interested can try it!

leiqing1 avatar Sep 02 '22 15:09 leiqing1

@yghstill Hi. Can you guide me on the compression of the yolo-w6-pose model?

pytholic avatar Sep 07 '22 00:09 pytholic

> @yghstill Hi. Can you guide me on the compression of the yolo-w6-pose model?

@pytholic I will try to compress yolov7-w6-pose.

yghstill avatar Sep 07 '22 03:09 yghstill

> @yghstill Hi. Can you guide me on the compression of the yolo-w6-pose model?
>
> @pytholic I will try to compress yolov7-w6-pose.

@yghstill Thank you, will look forward to it!

pytholic avatar Sep 07 '22 04:09 pytholic

@yghstill I was trying to do it myself, but turns out that Paddle does not support Ubuntu 22.04 at the moment.

pytholic avatar Sep 07 '22 06:09 pytholic

hi @pytholic Paddle supports Ubuntu 22.04 in the develop branch. You can install paddlepaddle by referring to this link (https://www.paddlepaddle.org.cn/en), e.g.:

```shell
python -m pip install paddlepaddle-gpu==0.0.0.post112 -f https://www.paddlepaddle.org.cn/whl/linux/gpu/develop.html
```

leiqing1 avatar Sep 07 '22 09:09 leiqing1

@WongKinYiu hi KinYiu, do you have any suggestions on this PR about automatic compression for YOLOv7? https://github.com/WongKinYiu/yolov7/pull/612

leiqing1 avatar Sep 08 '22 05:09 leiqing1

I think it is really great, but I am not familiar with merging PRs in git, so let's wait for Alexey to check it.

WongKinYiu avatar Sep 08 '22 08:09 WongKinYiu

> I think it is really great, but I am not familiar with merging PRs in git, so let's wait for Alexey to check it.

Much appreciated for your reply; looking forward to Alexey's suggestions. @AlexeyAB

leiqing1 avatar Sep 09 '22 03:09 leiqing1

Hi, amazing work! However, I tried experimenting myself and didn't manage to make it work (different problems at different times, such as a Paddle class that expects 2 parameters but is only given 1, probably because a default value was removed, and so on). When I did manage to make it work when converting to ONNX, it was only with the version without the End2End class attached. So could you:

- modify/update the tutorial in light of these problems
- specify whether it is only possible to run it without the End2End class attached, or how to make it work when the ONNX model has that class integrated? Eager to test the results with the End2End class!

drasgo avatar Oct 13 '22 10:10 drasgo

Hi, is it possible to get quantized yolov7 tiny model?

Bombex avatar Oct 22 '22 08:10 Bombex

I was not able to reproduce the results, running into the following errors:

```
2023-02-16 18:10:37,588-INFO: Now translating model from onnx to paddle.
2023-02-16 18:10:37,588-WARNING: __init__() missing 2 required positional arguments: 'input_shape_dict' and 'enable_onnx_checker'
2023-02-16 18:10:37,588-ERROR: x2paddle threw an exception, you can ask for help at: https://github.com/PaddlePaddle/X2Paddle/issues
```

@yghstill You can easily reproduce this with this Dockerfile:

```dockerfile
FROM paddlepaddle/paddle:2.3.2-gpu-cuda11.2-cudnn8

ARG DEBIAN_FRONTEND=noninteractive
RUN apt update && apt upgrade -y
RUN apt-get update && apt-get install ffmpeg libsm6 libxext6 -y

COPY calib_images /workspace/calib_images
# quantize.py is basically a script version of your ipynb
COPY quantize.py /workspace/quantize.py

WORKDIR /workspace
RUN wget https://paddle-slim-models.bj.bcebos.com/act/yolov7-tiny.onnx
# opencv-python and paddleslim are not part of the default container
RUN python3 -m pip install opencv-python paddleslim==2.3.4

ENTRYPOINT python3 /workspace/quantize.py
```

Update:

So what I found was that the error is in the ONNX decoder in /usr/local/lib/python3.7/dist-packages/x2paddle/decoder/onnx_decoder.py, more precisely in the class

```python
class ONNXDecoder(object):
    def __init__(self, onnx_model, input_shape_dict, enable_onnx_checker):
```

One can replace this with

```python
class ONNXDecoder(object):
    def __init__(self, onnx_model, input_shape_dict=None, enable_onnx_checker=False):
```

and it runs through as described in the Python file. One can script this with sed:

```shell
sed -i 's/def __init__(self, onnx_model, input_shape_dict, enable_onnx_checker)/def __init__(self, onnx_model, input_shape_dict=None, enable_onnx_checker=False)/' /usr/local/lib/python3.7/dist-packages/x2paddle/decoder/onnx_decoder.py
```

This fix should be merged into the main branch, IMHO.
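For environments without sed, the same one-line patch can be applied from Python. This is a sketch: `patch_decoder` is a hypothetical helper name, and the old/new signatures are exactly the ones quoted above; it assumes the x2paddle version installed still has the unpatched signature.

```python
# Apply the same signature patch as the sed one-liner, but from Python.
OLD = "def __init__(self, onnx_model, input_shape_dict, enable_onnx_checker)"
NEW = "def __init__(self, onnx_model, input_shape_dict=None, enable_onnx_checker=False)"

def patch_decoder(path):
    """Rewrite onnx_decoder.py in place, giving both arguments defaults."""
    with open(path) as f:
        src = f.read()
    if OLD not in src:
        return False  # already patched, or a different x2paddle version
    with open(path, "w") as f:
        f.write(src.replace(OLD, NEW))
    return True
```

Usage would be `patch_decoder("/usr/local/lib/python3.7/dist-packages/x2paddle/decoder/onnx_decoder.py")`; it returns False rather than failing if the signature is not found, so it is safe to run twice.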

Nevertheless, I cannot save the engine once it is converted and has run inference; it just breaks with the error message below when using YOUR yolov7-tiny.onnx. The command I used was (on docker image nvcr.io/nvidia/tensorrt:22.03-py3):

```shell
trtexec --onnx=/workspace/optimization/output/yolov7-tiny+x2paddle/ONNX/quant_model.onnx \
  --workspace=1024 \
  --calib=/workspace/optimization/output/yolov7-tiny+x2paddle/ONNX/calibration.cache \
  --int8 \
  --verbose \
  --saveEngine=/workspace/yolov7+x2paddle/ONNX/yolo_quantized.plan
```

```
[02/17/2023-15:56:00] [I] [TRT] [MemUsageChange] TensorRT-managed allocation in engine deserialization: CPU +0, GPU +7, now: CPU 0, GPU 7 (MiB)
[02/17/2023-15:56:00] [E] Saving engine to file failed.
[02/17/2023-15:56:00] [E] Engine set up failed
```
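One plausible cause of "Saving engine to file failed" is that trtexec cannot open the output path: note that the `--saveEngine` directory (`/workspace/yolov7+x2paddle/ONNX/`) differs from the input path under `/workspace/optimization/output/…` and may not exist inside the container. A small pre-flight check (the helper name `ensure_engine_dir` is hypothetical) can rule this out before invoking trtexec:

```python
import os

def ensure_engine_dir(engine_path):
    """Create the parent directory for trtexec's --saveEngine if it is missing."""
    parent = os.path.dirname(engine_path)
    if parent and not os.path.isdir(parent):
        os.makedirs(parent, exist_ok=True)
    return os.path.isdir(parent)

# e.g. for the command above:
# ensure_engine_dir("/workspace/yolov7+x2paddle/ONNX/yolo_quantized.plan")
```

Equivalently, a `mkdir -p /workspace/yolov7+x2paddle/ONNX` before the trtexec call would do the same from the shell.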

@yghstill can you maybe provide a Dockerfile that runs and calibrates a model to INT8 and creates a TensorRT engine from it?

dnns92 avatar Feb 16 '23 18:02 dnns92

@yghstill I have just tried this work; it is interesting. Now I want to convert the quantized model from Paddle to PyTorch and ONNX. How can I do that?

viethoang303 avatar Oct 24 '23 07:10 viethoang303