Does TensorRT have a plugin for ONNX's DynamicQuantizeLinear?

Description

I used onnxruntime.quantization.quantize_dynamic to quantize my model, which inserted a number of DynamicQuantizeLinear nodes into the graph. When I later use the TensorRT Python API to compile it, the build fails with: [06/24/2024-22:46:00] [TRT] [E] 3: getPluginCreator could not find plugin: DynamicQuantizeLinear version: 1

Is there an existing plugin for DynamicQuantizeLinear?
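For anyone scanning a longer build log for the same failure, here is a minimal sketch that pulls the missing plugin names out of the log text. The regular expression is modeled only on the single error line quoted above, so treat it as an assumption that other TensorRT versions word the message the same way; `find_missing_plugins` is a hypothetical helper name, not part of any TensorRT API.

```python
import re

# Pattern modeled on the one error line quoted in this issue (assumption:
# other TensorRT versions may format this message differently).
MISSING_PLUGIN = re.compile(
    r"getPluginCreator could not find plugin: (?P<name>\S+) version: (?P<version>\S+)"
)


def find_missing_plugins(log_text):
    """Return a sorted, de-duplicated list of (plugin_name, version) pairs."""
    found = {
        (m.group("name"), m.group("version"))
        for m in MISSING_PLUGIN.finditer(log_text)
    }
    return sorted(found)


# The log line from this issue, split across two string literals for width.
log = (
    "[06/24/2024-22:46:00] [TRT] [E] 3: getPluginCreator could not find "
    "plugin: DynamicQuantizeLinear version: 1"
)
print(find_missing_plugins(log))  # [('DynamicQuantizeLinear', '1')]
```

Each reported name is an op that the ONNX parser could not map to a built-in layer and then failed to resolve as a plugin.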

Environment

TensorRT Version: 8.6.1.6

NVIDIA GPU: A4000

NVIDIA Driver Version: 535.183.01

CUDA Version: 12.2

CUDNN Version: N/A


Jun 25 '24 00:06 myunuro

No, I don't think so. @myunuro You need to use the TensorRT pytorch_quantization toolkit for QAT.

Jun 25 '24 06:06 lix19937

Thanks for the quick reply! Is there an equivalent for TensorFlow models? Basically, we are thinking of ONNX as a convergence point for both PyTorch and TensorFlow.

Jun 25 '24 17:06 myunuro

Ref https://github.com/NVIDIA/TensorRT/tree/release/10.1/tools/tensorflow-quantization @myunuro

Jun 26 '24 01:06 lix19937

Closing since there is already an answer. Thanks, all!

Aug 07 '24 05:08 ttyio

Hi. I would like to ask you to reopen this, because I can't see how the tensorflow-quantization toolkit referenced here:

Ref https://github.com/NVIDIA/TensorRT/tree/release/10.1/tools/tensorflow-quantization @myunuro

can be a follow-up to this:

No, I don't think so. @myunuro You need to use the TensorRT pytorch_quantization toolkit for QAT.

I am using onnxruntime with PyTorch to do post-training quantization, and I face the same issue as the OP.
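To confirm it really is the same issue, a quick check is to count the op types in the quantized model and see whether DynamicQuantizeLinear shows up. This is only a sketch: `op_histogram`, `has_dynamic_quantize`, and `load_and_count` are hypothetical helper names, the file path in the usage comment is a placeholder, and only the top-level graph is inspected (nodes inside control-flow subgraphs are not counted).

```python
from collections import Counter


def op_histogram(model):
    """Count nodes per op_type in the top-level graph of an ONNX ModelProto."""
    return Counter(node.op_type for node in model.graph.node)


def has_dynamic_quantize(model):
    """True if the top-level graph contains at least one DynamicQuantizeLinear node."""
    return op_histogram(model)["DynamicQuantizeLinear"] > 0


def load_and_count(path):
    """Load an ONNX file from disk and return its op histogram."""
    import onnx  # imported lazily; requires the `onnx` package to be installed

    return op_histogram(onnx.load(path))


# Usage (placeholder path, not a file from this issue):
# print(load_and_count("model_quantized.onnx").most_common())
```

If the histogram lists DynamicQuantizeLinear, the TensorRT build can be expected to hit the same getPluginCreator error reported above.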

Aug 12 '24 14:08 shahdloo