
some strange error when using pytorch_quantization for googlenet

Open Dsqds opened this issue 3 years ago • 5 comments

Description

When converting a googlenet model modified with pytorch_quantization from ONNX to a TensorRT engine, the build reports the error below. Does anyone have experience with this, or know of tools that can help?

TensorRT ONNX parser error: In node 185 (QuantDequantLinearHelper): INVALID_NODE: Assertion failed: scaleAllPositive && "Scale coefficients must all be positive"
3-onnx2trtint8.py:25: DeprecationWarning: Use build_serialized_network instead.
  engine = builder.build_engine(network, config=config)
[08/23/2022-05:58:53] [TRT] [E] 4: [network.cpp::validate::2671] Error Code 4: Internal Error (Network must have at least one output)
Traceback (most recent call last):
  File "3-onnx2trtint8.py", line 39, in <module>
    f.write(engine.serialize())
AttributeError: 'NoneType' object has no attribute 'serialize'

Environment

docker-image: nvcr.io/nvidia/tensorrt:22.07-py3
TensorRT Version: 8.4.1
NVIDIA GPU: GeForce RTX 2060 SUPER
NVIDIA Driver Version: 460.91.03
CUDA Version: container: 11.7
CUDNN Version: 8.4.1
Operating System: ubuntu
Python Version (if applicable): 3.8.10
Tensorflow Version (if applicable):
PyTorch Version (if applicable): 1.11.0
Baremetal or Container (if so, version):

Relevant Files

My code is on the "master" branch: https://github.com/Dsqds/pytorch-cifar100

The pth model and the ONNX model:
https://drive.google.com/file/d/1MKD_5Wcp2URufLScVEr2yk-cRAqdJlRg/view?usp=sharing
https://drive.google.com/file/d/1W9k9Qp88sVEJgGb9022meIGoddxo50O0/view?usp=sharing

Steps To Reproduce

Using the Docker image above, the ONNX model I provided, and 3-onnx2trtint8.py from my code (with the file path changed), the error can be reproduced.

Dsqds avatar Aug 23 '22 06:08 Dsqds

The conversion fails here because the model has a zero scale. [image]

zerollzeng avatar Aug 23 '22 09:08 zerollzeng

I would guess it's a bug in our pytorch-quantization tool. A zero or negative scale should never be generated. cc @netaz @ttyio
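
For anyone who wants to check their own model before export, here is a minimal sketch of a scale sanity check. It assumes that each Q/DQ scale is derived as amax / 127 for 8-bit symmetric quantization (an assumption about the export path, not something confirmed in this thread), and that you have already collected one amax value per quantizer as a plain float. The function name and the layer names in the usage line are hypothetical.

```python
import math


def find_invalid_scales(amax_by_quantizer, num_bits=8):
    """Return {name: scale} for quantizers whose scale is not strictly positive.

    Assumption: scale = amax / (2**(num_bits - 1) - 1), i.e. amax / 127 for int8.
    """
    max_bound = 2 ** (num_bits - 1) - 1
    invalid = {}
    for name, amax in amax_by_quantizer.items():
        scale = amax / max_bound
        # "not (scale > 0)" also catches NaN, which compares False to everything.
        if not (scale > 0.0) or math.isinf(scale):
            invalid[name] = scale
    return invalid


# Hypothetical quantizer names and amax values, for illustration only.
bad = find_invalid_scales({"conv1.input": 2.54, "branch.dead": 0.0})
print(sorted(bad))
```

An amax of exactly zero usually means the quantizer never saw non-zero data, which is consistent with the parser assertion "Scale coefficients must all be positive".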

zerollzeng avatar Aug 23 '22 09:08 zerollzeng

@Dsqds how did you generate the onnx, is it calibrated? thanks!

ttyio avatar Aug 24 '22 01:08 ttyio

> @Dsqds how did you generate the onnx, is it calibrated? thanks!

I used 3-pytorch_quantization2onnx.py from my code to generate the ONNX from the pth file I provided, and I think it is calibrated. The googlenet_quantization model structure is in model/googlenet_quantization.py. The pth file was generated by 1-train.py; training uses only cifar100 and takes about 1 hour.

Dsqds avatar Aug 24 '22 03:08 Dsqds

@Dsqds, maybe I am missing something here, but I checked the code at https://github.com/Dsqds/pytorch-cifar100/blob/master/3-pytorch_quantization2onnx.py and calibration is not there.

Could you follow https://github.com/NVIDIA/TensorRT/blob/main/tools/pytorch-quantization/examples/torchvision/classification_flow.py#L357?

In that code we call enable_calib before running the calibration batches, and call load_calib_amax to compute the amax values. There are more details in this tutorial: https://docs.nvidia.com/deeplearning/tensorrt/pytorch-quantization-toolkit/docs/tutorials/quant_resnet50.html. Hope it helps, thanks!
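
The flow described above can be sketched roughly as follows. This is a simplified outline, not the code from the linked example. It finds quantizers by duck typing (any module exposing enable_calib) so that the sketch has no import-time dependency, whereas the tutorial checks isinstance(module, quant_nn.TensorQuantizer). The function names, the batches argument, and the class-name check for MaxCalibrator are assumptions made for illustration.

```python
def collect_stats(model, batches, num_batches):
    """Run calibration batches with calibrators enabled and quantization off.

    Assumption: quantizer modules expose enable_calib/disable_calib,
    enable_quant/disable_quant, enable/disable and a _calibrator attribute.
    Call this under torch.no_grad() in real use.
    """
    quantizers = [m for _, m in model.named_modules() if hasattr(m, "enable_calib")]
    for q in quantizers:
        if q._calibrator is not None:
            q.disable_quant()   # collect statistics on unquantized activations
            q.enable_calib()
        else:
            q.disable()
    for i, batch in enumerate(batches):
        if i >= num_batches:
            break
        model(batch)
    for q in quantizers:
        if q._calibrator is not None:
            q.enable_quant()
            q.disable_calib()
        else:
            q.enable()


def compute_amax(model, **histogram_kwargs):
    """Turn collected statistics into amax values via load_calib_amax."""
    for _, m in model.named_modules():
        calibrator = getattr(m, "_calibrator", None)
        if calibrator is None:
            continue
        # Max calibrators take no arguments; histogram calibrators need
        # options such as method="percentile" (passed by the caller).
        if type(calibrator).__name__ == "MaxCalibrator":
            m.load_calib_amax()
        else:
            m.load_calib_amax(**histogram_kwargs)
```

A typical call order under these assumptions would be collect_stats(model, data_loader_batches, num_batches=2) followed by compute_amax(model, method="percentile", percentile=99.99), and only then the ONNX export. Skipping this step leaves amax unset or stale, which is one plausible way to end up with a non-positive scale.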

ttyio avatar Aug 24 '22 03:08 ttyio

Closing since there has been no activity for more than 3 weeks. Please reopen if you still have questions, thanks!

ttyio avatar Nov 01 '22 02:11 ttyio

How can this be solved?

jamh00 avatar Sep 14 '23 14:09 jamh00

how to solve

jamh00 avatar Sep 15 '23 08:09 jamh00