Does TensorRT have a plugin for ONNX's DynamicQuantizeLinear?
Description
I used onnxruntime.quantization.quantize_dynamic to quantize my model, which inserted a number of DynamicQuantizeLinear nodes into the graph. When I later use the TensorRT Python API to build an engine from it, the build fails with: [06/24/2024-22:46:00] [TRT] [E] 3: getPluginCreator could not find plugin: DynamicQuantizeLinear version: 1.
Is there an existing plugin for DynamicQuantizeLinear?
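For context, here is a minimal pure-Python sketch of what the ONNX DynamicQuantizeLinear operator computes, following the formulas in the ONNX operator specification (uint8 output, range widened to include zero, round-half-to-even). It is only a reference illustration, not code from onnxruntime or TensorRT. The function name and the handling of an all-zero input are assumptions made for this sketch. The key point is that the scale and zero point are derived from the input tensor at run time, which is what makes the op "dynamic".

```python
def dynamic_quantize_linear(x):
    """Reference uint8 DynamicQuantizeLinear for a flat list of floats.

    Returns (quantized values, scale, zero_point).
    """
    qmin, qmax = 0, 255

    # The range is widened to include 0 so that 0.0 is exactly representable.
    x_min = min(0.0, min(x))
    x_max = max(0.0, max(x))

    scale = (x_max - x_min) / (qmax - qmin)
    if scale == 0.0:
        # Assumption of this sketch: an all-zero input maps to all zeros.
        return [0] * len(x), 0.0, 0

    # Python's round() rounds halves to even, matching the ONNX spec.
    zero_point = min(qmax, max(qmin, round(qmin - x_min / scale)))

    # Quantize each element and saturate to the uint8 range.
    y = [min(qmax, max(qmin, round(v / scale) + zero_point)) for v in x]
    return y, scale, zero_point


y, scale, zero_point = dynamic_quantize_linear([-1.0, 0.0, 1.0, 3.0])
print(y, zero_point)  # [0, 64, 128, 255] 64
```

Because both outputs besides the quantized tensor depend on the data seen at inference time, a runtime has to implement this op explicitly; the error above indicates the TensorRT parser found no built-in layer or registered plugin for it.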
Environment
TensorRT Version: 8.6.1.6
NVIDIA GPU: A4000
NVIDIA Driver Version: 535.183.01
CUDA Version: 12.2
CUDNN Version: N/A
No, I don't think so. @myunuro You need to use TensorRT's pytorch_quantization toolkit to do QAT (quantization-aware training).
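To illustrate the difference from the dynamic op, here is a rough sketch of the fake quantization (quantize, then dequantize) idea that QAT-style toolkits rely on. It does not use the pytorch_quantization API: the function name, the `amax` parameter, and the symmetric [-127, 127] range are assumptions chosen for illustration. The scale is fixed ahead of time from a calibrated or learned range rather than computed from each input, which is the form TensorRT consumes as QuantizeLinear/DequantizeLinear pairs.

```python
def fake_quantize(x, amax, num_bits=8):
    """Symmetric per-tensor quantize-then-dequantize with a fixed range.

    x is a flat list of floats; amax is the calibrated absolute maximum.
    """
    if amax <= 0:
        raise ValueError("amax must be positive")

    bound = 2 ** (num_bits - 1) - 1  # 127 for int8 (assumed symmetric range)
    scale = amax / bound             # fixed before inference, not per input

    out = []
    for v in x:
        q = max(-bound, min(bound, round(v / scale)))  # quantize and clamp
        out.append(q * scale)                          # dequantize
    return out


print(fake_quantize([0.4, 1.6, -200.0, 300.0], amax=127.0))
# [0.0, 2.0, -127.0, 127.0]
```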
Thanks for the quick reply! Is there an equivalent for TensorFlow models? Basically, we are thinking of ONNX as a convergence point for both PyTorch and TensorFlow.
Ref https://github.com/NVIDIA/TensorRT/tree/release/10.1/tools/tensorflow-quantization @myunuro
Closing since there is already an answer. Thanks, all!
Hi. Could you please reopen this? I can't see how the tensorflow-quantization tool referenced here:
Ref https://github.com/NVIDIA/TensorRT/tree/release/10.1/tools/tensorflow-quantization @myunuro
can be a follow-up to this:
No, I don't think so. @myunuro You need to use TensorRT's pytorch_quantization toolkit to do QAT (quantization-aware training).
I am using onnxruntime on a PyTorch model to do post-training quantization, and I face the same issue as the OP.