TensorRT
✨[Feature] Is there a plan to support converting quantized PT2 models to TRT?
When I use torch-tensorrt 2.4.0 to convert a quantized PT2 model to TRT, I get the error below.
I wonder whether this will be supported in the future? Or can I do this with the current version (maybe by converting the torch.ops.quantized_decomposed.dequantize_per_tensor.default operator to another quantization operator)?
@lanluo-nvidia Can you take a look at this post-FP4?
@MaltoseFlower do you have any example code so I can look into this further?
Meanwhile, here is an example of FP8/INT8 PTQ for your reference: https://github.com/pytorch/TensorRT/blob/main/examples/dynamo/vgg16_ptq.py