
Are torch.nn.functional methods automatically quantized by pytorch-quantization?

AndyWanna opened this issue 1 year ago · 4 comments

Description

I am trying to quantize a PyTorch model to INT8 to run with TensorRT. I have read these docs and am still unclear on whether I have to write custom quantization implementations for torch.nn.functional methods, namely F.conv2d(), F.relu(), F.max_pool2d(), and F.interpolate(). I use all of these during inference and am concerned they are computed in FP32.

The forward pass looks like this:

    def forward(self, x):
        conv_output = self.conv_features(x)  # nn.Conv2d, nn.ReLU, nn.BatchNorm2d, nn.MaxPool2d
        distances = self._l2_convolution(conv_output)  # F.conv2d, F.relu
        similarities = self.distance_2_similarity(distances)  # torch.log()

        # Global min-pool over the spatial dims (min = -max of the negated tensor).
        min_similarities = -F.max_pool2d(-similarities,
                                         kernel_size=(similarities.size()[2],
                                                      similarities.size()[3]))

        sim_score = min_similarities.view(-1, self.num_prototypes)

        upsampled_activation_pattern = F.interpolate(similarities, size=self.img_size, mode='bicubic')

        logits = self.last_layer(sim_score)  # nn.Linear
        return logits, sim_score, upsampled_activation_pattern

Following this example, I perform quantization:

from pytorch_quantization import nn as quant_nn
from pytorch_quantization import calib
from pytorch_quantization.tensor_quant import QuantDescriptor
from pytorch_quantization import quant_modules

quant_desc_input = QuantDescriptor(calib_method='histogram')
quant_nn.QuantConv2d.set_default_quant_desc_input(quant_desc_input)
quant_nn.QuantLinear.set_default_quant_desc_input(quant_desc_input)
quant_nn.QuantMaxPool2d.set_default_quant_desc_input(quant_desc_input)
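
For completeness, the linked example continues with a calibration pass before export. A minimal sketch of that step, assuming a calib_loader that yields input batches (the loader name and the helper function are illustrative, not from my code):

    import torch
    from pytorch_quantization import calib
    from pytorch_quantization import nn as quant_nn

    def calibrate(model, calib_loader):
        # Put every TensorQuantizer into statistics-collection mode.
        for module in model.modules():
            if isinstance(module, quant_nn.TensorQuantizer):
                if module._calibrator is not None:
                    module.disable_quant()
                    module.enable_calib()
                else:
                    module.disable()

        # Feed calibration data through the model.
        with torch.no_grad():
            for batch in calib_loader:
                model(batch)

        # Switch back to quantization and load the computed amax values.
        for module in model.modules():
            if isinstance(module, quant_nn.TensorQuantizer):
                if module._calibrator is not None:
                    module.enable_quant()
                    module.disable_calib()
                    if isinstance(module._calibrator, calib.MaxCalibrator):
                        module.load_calib_amax()
                    else:
                        # Histogram-based calibrators need a method argument.
                        module.load_calib_amax("percentile", percentile=99.99)
                else:
                    module.enable()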

This is the command I use to convert to ONNX:

    torch.onnx.export(model, dummy_input, onnx_filename, verbose=False, opset_version=13, do_constant_folding=True)
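
One detail worth noting (it is not shown in my snippet above): per the pytorch-quantization documentation, TensorQuantizer nodes are only exported as ONNX QuantizeLinear/DequantizeLinear ops when fake-quant mode is enabled before the export call:

    import torch
    from pytorch_quantization import nn as quant_nn

    # Export TensorQuantizer nodes as QuantizeLinear/DequantizeLinear ONNX ops.
    quant_nn.TensorQuantizer.use_fb_fake_quant = True

    torch.onnx.export(model, dummy_input, onnx_filename, verbose=False,
                      opset_version=13, do_constant_folding=True)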

This is the command I use to convert to a TensorRT engine:

    trtexec --onnx=<file_name> --saveEngine=<file_name> --explicitBatch --int8

Thank you.

AndyWanna · Feb 26 '24 20:02

add quant_modules.initialize()

Data-Iab · Feb 27 '24 13:02
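
To clarify what that call does: quant_modules.initialize() monkey-patches the torch.nn module classes, so it has to run before the model is constructed, and it does not touch torch.nn.functional calls. A minimal sketch (MyModel is a placeholder):

    from pytorch_quantization import quant_modules

    quant_modules.initialize()  # swaps nn.Conv2d, nn.Linear, etc. for quant_nn versions

    model = MyModel()  # layers created from here on use the quantized replacements
    # Note: F.conv2d / F.max_pool2d calls inside forward() are NOT patched by this.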

I have that in my code; does adding it quantize the functional methods? It is definitely quantizing the nn.Module layers correctly.

AndyWanna · Feb 27 '24 13:02

Please check our sample (https://github.com/NVIDIA/TensorRT/tree/release/8.6/tools/pytorch-quantization/examples) and documentation.

zerollzeng · Mar 01 '24 06:03

Like https://docs.nvidia.com/deeplearning/tensorrt/pytorch-quantization-toolkit/docs/index.html#document-tutorials/creating_custom_quantized_modules

zerollzeng · Mar 01 '24 06:03
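
Following the pattern on that page, a functional call can be quantized by routing its inputs through TensorQuantizer instances inside a small wrapper module. A sketch applied to the _l2_convolution step from the issue, with the prototypes argument being an assumed name:

    import torch
    import torch.nn.functional as F
    from pytorch_quantization.nn import TensorQuantizer
    from pytorch_quantization.tensor_quant import QuantDescriptor

    class QuantL2Convolution(torch.nn.Module):
        """Sketch of a custom quantized module wrapping an F.conv2d call."""

        def __init__(self):
            super().__init__()
            # Quantize the activations entering the functional op; use
            # per-channel (axis=0) quantization for the weight-like tensor.
            self.input_quantizer = TensorQuantizer(
                QuantDescriptor(num_bits=8, calib_method="histogram"))
            self.weight_quantizer = TensorQuantizer(
                QuantDescriptor(num_bits=8, axis=(0,)))

        def forward(self, x, prototypes):
            # Both tensors pass through fake-quant nodes before the functional
            # call, so the conv is surrounded by Q/DQ nodes in the exported ONNX.
            return F.conv2d(self.input_quantizer(x),
                            self.weight_quantizer(prototypes))

Without a wrapper like this, functional calls bypass the quant_nn replacements and carry no Q/DQ nodes, so TensorRT is free to run them in higher precision.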

Closing since there has been no activity for more than 3 weeks. Please reopen if you still have questions, thanks!

ttyio · Mar 26 '24 17:03