                        Are torch.nn.functional methods automatically quantized by pytorch-quantization?
Description
I am trying to quantize a PyTorch model to INT8 to run with TensorRT. I have read these docs, and am still unclear on whether I have to write custom quantized implementations for torch.nn.functional methods, namely F.conv2d(), F.relu(), F.max_pool2d(), and F.interpolate(). I use all of these during inference and am concerned they are computed in FP32.
The forward-pass looks like this:
def forward(self, x):
        conv_output = self.conv_features(x)  # nn.Conv2d, nn.ReLU, nn.BatchNorm2d, nn.MaxPool2d
        distances = self._l2_convolution(conv_output)  # F.conv2d, F.relu
        similarities = self.distance_2_similarity(distances)  # torch.log()

        # global min-pool over the spatial dimensions
        min_similarities = -F.max_pool2d(-similarities,
                                         kernel_size=(similarities.size()[2],
                                                      similarities.size()[3]))

        sim_score = min_similarities.view(-1, self.num_prototypes)
        upsampled_activation_pattern = F.interpolate(similarities, size=self.img_size, mode='bicubic')
        logits = self.last_layer(sim_score)  # nn.Linear
        return logits, sim_score, upsampled_activation_pattern
Following this example, I set up quantization:
from pytorch_quantization import nn as quant_nn
from pytorch_quantization import calib
from pytorch_quantization.tensor_quant import QuantDescriptor
from pytorch_quantization import quant_modules

quant_desc_input = QuantDescriptor(calib_method='histogram')
quant_nn.QuantConv2d.set_default_quant_desc_input(quant_desc_input)
quant_nn.QuantLinear.set_default_quant_desc_input(quant_desc_input)
quant_nn.QuantMaxPool2d.set_default_quant_desc_input(quant_desc_input)

quant_modules.initialize()  # monkey-patch torch.nn modules with their quantized versions
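For completeness, the linked example then runs a calibration pass before export. A minimal sketch of that workflow (the calibrate name, data_loader, and num_batches below are placeholders; the percentile setting is just the tutorial's default):

import torch
from pytorch_quantization import nn as quant_nn
from pytorch_quantization import calib

def calibrate(model, data_loader, num_batches=4):
    # Put every TensorQuantizer into calibration mode
    for module in model.modules():
        if isinstance(module, quant_nn.TensorQuantizer):
            if module._calibrator is not None:
                module.disable_quant()
                module.enable_calib()
            else:
                module.disable()

    # Run a few batches of representative data through the model
    with torch.no_grad():
        for i, (images, _) in enumerate(data_loader):
            model(images)
            if i + 1 >= num_batches:
                break

    # Compute amax from the collected statistics and re-enable quantization
    for module in model.modules():
        if isinstance(module, quant_nn.TensorQuantizer):
            if module._calibrator is not None:
                if isinstance(module._calibrator, calib.MaxCalibrator):
                    module.load_calib_amax()
                else:
                    module.load_calib_amax("percentile", percentile=99.99)
                module.enable_quant()
                module.disable_calib()
            else:
                module.enable()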
This is the command I use to convert to ONNX:
torch.onnx.export(model, dummy_input, onnx_filename, verbose=False, opset_version=13, do_constant_folding=True)
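One detail from the linked example worth double-checking before export (the file name and input shape below are placeholders): TensorQuantizer should be switched to fake-quant mode so that QuantizeLinear/DequantizeLinear nodes end up in the ONNX graph, and opset 13 is needed for per-channel Q/DQ:

import torch
from pytorch_quantization import nn as quant_nn

# Export Q/DQ (QuantizeLinear/DequantizeLinear) nodes in the ONNX graph
quant_nn.TensorQuantizer.use_fb_fake_quant = True

dummy_input = torch.randn(1, 3, 224, 224)  # placeholder input shape
torch.onnx.export(model, dummy_input, "model_quant.onnx",
                  opset_version=13, do_constant_folding=True)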
This is the command I use to convert to a TensorRT engine:
trtexec --onnx=<file_name> --saveEngine=<file_name> --explicitBatch --int8
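As a side note on the trtexec invocation: with an ONNX graph that already carries explicit Q/DQ nodes, --int8 is what allows the builder to pick INT8 kernels, and on TensorRT 8.x explicit batch is already the default for ONNX models, so --explicitBatch is redundant. Adding --verbose prints detailed builder logging that can help confirm which precisions were chosen (file names below are placeholders):

trtexec --onnx=model_quant.onnx --saveEngine=model_quant.engine --int8 --verbose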
Thank you.
Add quant_modules.initialize().
I have that in my code above. Does adding it quantize the functional methods? It is definitely quantizing the nn modules correctly.
Please check our samples (https://github.com/NVIDIA/TensorRT/tree/release/8.6/tools/pytorch-quantization/examples) and documentation, in particular https://docs.nvidia.com/deeplearning/tensorrt/pytorch-quantization-toolkit/docs/index.html#document-tutorials/creating_custom_quantized_modules. quant_modules.initialize() only replaces torch.nn module classes (nn.Conv2d, nn.Linear, etc.) with their quantized counterparts; direct torch.nn.functional calls are not intercepted, so those need quantizers added manually as described in that tutorial.
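To make that concrete for this model, here is a minimal sketch of the custom-module pattern from that tutorial applied to the functional _l2_convolution step (QuantL2Convolution, prototype_shape, and the chosen descriptors are hypothetical, not part of the toolkit):

import torch
import torch.nn.functional as F
from pytorch_quantization.nn import TensorQuantizer
from pytorch_quantization.tensor_quant import QuantDescriptor

class QuantL2Convolution(torch.nn.Module):
    # Hypothetical replacement for the functional F.conv2d/F.relu step:
    # TensorQuantizers in front of each input give the op Q/DQ nodes on export.
    def __init__(self, prototype_shape):
        super().__init__()
        self.prototypes = torch.nn.Parameter(torch.randn(prototype_shape))
        self._input_quantizer = TensorQuantizer(QuantDescriptor(num_bits=8, calib_method='histogram'))
        self._weight_quantizer = TensorQuantizer(QuantDescriptor(num_bits=8, axis=0))

    def forward(self, x):
        x_q = self._input_quantizer(x)
        w_q = self._weight_quantizer(self.prototypes)
        return F.relu(F.conv2d(x_q, w_q))

After swapping a module like this in for the functional call, calibration and ONNX export proceed the same way as for the built-in quantized modules.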
Closing since there has been no activity for more than 3 weeks. Please reopen if you still have questions, thanks!