Add YOLOv7 quantization example
We used PaddleSlim ACT (Auto Compression Toolkit) to quantize and compress YOLOv7; the results on a T4 GPU are as follows:
| Model | Base mAP<sup>val</sup> 0.5:0.95 | Quant mAP<sup>val</sup> 0.5:0.95 | Latency FP32 | Latency FP16 | Latency INT8 |
| --- | --- | --- | --- | --- | --- |
| YOLOv7 | 51.2 | 50.9 | 26.84 ms | 7.44 ms | 4.55 ms |
| YOLOv7-Tiny | 37.3 | 37.0 | 5.06 ms | 2.32 ms | 1.68 ms |
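For reference, an ACT run is driven by a small script plus a strategy config. The sketch below is a minimal, untested outline based on PaddleSlim's auto-compression examples, not the exact script in this PR; the dataloader and config values are placeholders, and paddleslim==2.3.4 (where an ONNX input is translated to Paddle via x2paddle internally) is assumed:

```python
# Minimal ACT sketch (assumptions: paddleslim==2.3.4, an ONNX input model,
# and config keys mirroring ACT's YAML examples).
from paddleslim.auto_compression import AutoCompression

def train_dataloader():
    # Placeholder: yield {input_name: np.ndarray} feed dicts built from
    # your calibration/training images.
    yield from ()

ac = AutoCompression(
    model_dir="yolov7-tiny.onnx",        # ACT translates ONNX -> Paddle first
    save_dir="./yolov7_tiny_quant",
    config={
        "Distillation": {"alpha": 1.0, "loss": "soft_label"},
        "Quantization": {"use_pact": True, "activation_bits": 8, "weight_bits": 8},
        "TrainConfig": {"train_iter": 3000, "learning_rate": 0.00003},
    },
    train_dataloader=train_dataloader(),
)
ac.compress()  # writes the quantized inference model into save_dir
```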
Please @WongKinYiu @AlexeyAB review, thx.
@nemonameless @HeungJunKim @Errol-golang @NicholasZolton @akashAD98 @adujardin @eshoyuan @UNeedCryDear Hi, this PR is about source-free compression training on YOLOv7. Anyone interested can try it!
@yghstill Hi. Can you guide me about the compression of the yolo-w6-pose model?
@pytholic I will try to compress yolov7-w6-pose.
@yghstill Thank you, will look forward to it!
@yghstill I was trying to do it myself, but it turns out that Paddle does not support Ubuntu 22.04 at the moment.
Hi @pytholic, Paddle supports Ubuntu 22.04 on the develop branch. You can install PaddlePaddle by referring to this link (https://www.paddlepaddle.org.cn/en), e.g.: `python -m pip install paddlepaddle-gpu==0.0.0.post112 -f https://www.paddlepaddle.org.cn/whl/linux/gpu/develop.html`
@WongKinYiu Hi KinYiu, do you have any suggestions on this PR adding automatic compression technology to YOLOv7? https://github.com/WongKinYiu/yolov7/pull/612
I think it's really great; I'm just not familiar with merging PRs in git, so let's wait for Alexey to check it.
Thanks very much for your reply; looking forward to Alexey's suggestions. @AlexeyAB
Hi, amazing work! However, I tried experimenting myself and didn't manage to make it work (different problems at different times, e.g. a Paddle class that expects 2 parameters but is given only 1, probably because a default value was removed, and so on). When I did get it to work, it was only for the version without the End2End class attached when converting to ONNX. So could you:
- modify/update the tutorial in light of these problems;
- specify whether it is only possible to run it with no End2End class attached, or explain how to make it work when the ONNX has that class integrated? Eager to test the results with the End2End class!
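(In case it helps others with the same question: "End2End" here refers to this repo's export.py attaching the NMS head to the exported ONNX graph. As a hedged sketch using export.py's existing flags, the two export variants would be:

```bash
# Plain detection graph (no End2End/NMS wrapper) - the variant that
# reportedly survives the Paddle conversion:
python export.py --weights yolov7-tiny.pt --grid --simplify --img-size 640 640

# With the End2End (NMS) head attached - the variant being asked about:
python export.py --weights yolov7-tiny.pt --grid --end2end --simplify \
    --topk-all 100 --iou-thres 0.65 --conf-thres 0.35 --img-size 640 640
```

Whether the quantization pipeline can consume the second graph is exactly the open question.)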
Hi, is it possible to get a quantized yolov7-tiny model?
I was not able to reproduce the results; I run into these errors:

```
2023-02-16 18:10:37,588-INFO: Now translating model from onnx to paddle.
2023-02-16 18:10:37,588-WARNING: __init__() missing 2 required positional arguments: 'input_shape_dict' and 'enable_onnx_checker'
2023-02-16 18:10:37,588-ERROR: x2paddle threw an exception, you can ask for help at: https://github.com/PaddlePaddle/X2Paddle/issues
```
@yghstill You can easily reproduce this with this Dockerfile:

```dockerfile
FROM paddlepaddle/paddle:2.3.2-gpu-cuda11.2-cudnn8
ARG DEBIAN_FRONTEND=noninteractive
RUN apt update && apt upgrade -y
RUN apt-get update && apt-get install ffmpeg libsm6 libxext6 -y
COPY calib_images /workspace/calib_images
# basically a script version of your ipynb
COPY quantize.py /workspace/quantize.py
WORKDIR /workspace
RUN wget https://paddle-slim-models.bj.bcebos.com/act/yolov7-tiny.onnx
# opencv-python and paddleslim are not part of the default container
RUN python3 -m pip install opencv-python paddleslim==2.3.4
ENTRYPOINT python3 /workspace/quantize.py
```
Update: I found that the error is in the ONNX decoder at /usr/local/lib/python3.7/dist-packages/x2paddle/decoder/onnx_decoder.py, more precisely in the class

```python
class ONNXDecoder(object):
    def __init__(self, onnx_model, input_shape_dict, enable_onnx_checker):
```

One can replace this with

```python
class ONNXDecoder(object):
    def __init__(self, onnx_model, input_shape_dict=None, enable_onnx_checker=False):
```

and it then runs through as described in the Python file. This can be scripted with sed:

```bash
sed -i 's/def __init__(self, onnx_model, input_shape_dict, enable_onnx_checker)/def __init__(self, onnx_model, input_shape_dict=None, enable_onnx_checker=False)/' /usr/local/lib/python3.7/dist-packages/x2paddle/decoder/onnx_decoder.py
```

This fix should be merged into the main branch, IMHO.
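(An alternative to editing site-packages, sketched here under the assumption that the installed ONNXDecoder has exactly the three-argument signature shown above, is to shim the defaults at runtime from your own script before PaddleSlim invokes x2paddle:

```python
# Hypothetical runtime shim - same effect as the sed patch, but without
# modifying the installed x2paddle files. Run this before AutoCompression.
from x2paddle.decoder import onnx_decoder

_orig_init = onnx_decoder.ONNXDecoder.__init__

def _patched_init(self, onnx_model, input_shape_dict=None, enable_onnx_checker=False):
    # Forward with defaults filled in, matching the sed-patched signature.
    _orig_init(self, onnx_model, input_shape_dict, enable_onnx_checker)

onnx_decoder.ONNXDecoder.__init__ = _patched_init
```
)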
Nevertheless, I cannot save the engine once the model is converted and inference has run; with YOUR yolov7-tiny.onnx it just breaks with the error message below. The command I used (on the docker image nvcr.io/nvidia/tensorrt:22.03-py3) was:

```bash
trtexec --onnx=/workspace/optimization/output/yolov7-tiny+x2paddle/ONNX/quant_model.onnx \
    --workspace=1024 \
    --calib=/workspace/optimization/output/yolov7-tiny+x2paddle/ONNX/calibration.cache \
    --int8 \
    --verbose \
    --saveEngine=/workspace/yolov7+x2paddle/ONNX/yolo_quantized.plan
```

```
[02/17/2023-15:56:00] [I] [TRT] [MemUsageChange] TensorRT-managed allocation in engine deserialization: CPU +0, GPU +7, now: CPU 0, GPU 7 (MiB)
[02/17/2023-15:56:00] [E] Saving engine to file failed.
[02/17/2023-15:56:00] [E] Engine set up failed
```
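(One hedged guess about "Saving engine to file failed": trtexec reports this when it cannot open the output path, and the --saveEngine directory above (/workspace/yolov7+x2paddle/ONNX/) differs from the --onnx directory and may not exist inside the container. Creating it first would rule that out:

```bash
mkdir -p /workspace/yolov7+x2paddle/ONNX
```
)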
@yghstill Could you maybe provide a Dockerfile that runs and calibrates a model to INT8 and creates a TRT engine from it?
@yghstill I have just tried this work; it is interesting. Now I want to convert the quantized model from Paddle to PyTorch and ONNX. How can I do that?
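(Not authoritative, but for the ONNX half: ACT's YOLO examples already write an ONNX copy of the quantized model, e.g. the quant_model.onnx used with trtexec above. If you only have the Paddle inference model (model.pdmodel/model.pdiparams), the usual route is the paddle2onnx CLI; the paths below are placeholders:

```bash
# Sketch: convert a quantized Paddle inference model to ONNX.
# (A --deploy_backend option exists in newer paddle2onnx releases for
# quantized models; check paddle2onnx --help for your version.)
paddle2onnx --model_dir ./yolov7_tiny_quant \
    --model_filename model.pdmodel \
    --params_filename model.pdiparams \
    --opset_version 13 \
    --save_file quant_model.onnx
```

There is no direct Paddle-to-PyTorch converter that I know of, so ONNX is the practical bridge for the PyTorch side as well.)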