onnxruntime
Can onnxruntime.quantization.quantize_dynamic() work with onnx-trt?
Describe the issue
Hi,
I'd like to try quantize_dynamic() on our models, but I noticed that it inserts DynamicQuantizeLinear nodes into the graph, and onnx-trt doesn't support that operator yet. Is there an official ONNX TensorRT plugin that supports it, or is there another workaround?
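For reference, here is a minimal sketch of how the inserted nodes could be confirmed. The helper names, the file paths in the usage note, and the set of "unsupported" ops are illustrative assumptions rather than part of onnxruntime or onnx-trt. Only `quantize_dynamic`, `QuantType`, and `onnx.load` are real library calls.

```python
from collections import Counter

# Assumption: ops the target parser is believed not to support.
# Adjust this set for the onnx-trt / TensorRT version in use.
ASSUMED_UNSUPPORTED = {"DynamicQuantizeLinear"}


def quantize_and_count_ops(fp32_path, int8_path):
    """Dynamically quantize a model and count the op types in the result."""
    # Imported lazily so flag_unsupported() works without these packages.
    import onnx
    from onnxruntime.quantization import QuantType, quantize_dynamic

    quantize_dynamic(fp32_path, int8_path, weight_type=QuantType.QInt8)
    model = onnx.load(int8_path)
    # Only top-level nodes are counted; subgraphs (If/Loop bodies) are not walked.
    return Counter(node.op_type for node in model.graph.node)


def flag_unsupported(op_counts, unsupported=ASSUMED_UNSUPPORTED):
    """Return the subset of op counts whose op type is in `unsupported`."""
    return {op: n for op, n in op_counts.items() if op in unsupported}
```

Something like `flag_unsupported(quantize_and_count_ops("model.onnx", "model.int8.onnx"))` (hypothetical paths) should then show how many DynamicQuantizeLinear nodes the quantized graph contains.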
Thanks,
To reproduce
N/A
Urgency
No response
Platform
Linux
OS Version
20.04
ONNX Runtime Installation
Built from Source
ONNX Runtime Version or Commit ID
1.17.0
ONNX Runtime API
Python
Architecture
X86
Execution Provider
CUDA
Execution Provider Library Version
CUDA 12.2