TensorRT
✨[Feature] Is there a plan to support converting quantized PT2 models to TRT?
When I use torch-tensorrt 2.4.0 to convert a quantized PT2 model to TRT, I get the error below.
I wonder whether this will be supported in the future? Or can I do this with the current version (maybe by converting the torch.ops.quantized_decomposed.dequantize_per_tensor.default operator to another quantization operator)?
@lanluo-nvidia Can you take a look at this post-FP4?
@MaltoseFlower do you have any example code so I can look into this further?
Meanwhile, here is an example of FP8/INT8 PTQ for your reference: https://github.com/pytorch/TensorRT/blob/main/examples/dynamo/vgg16_ptq.py