Concat scales not being grouped
I am trying to quantize a PyTorch model using NNCF. The output of my model is a concatenation of two tensors.
To quantize my outputs I set:
advanced_parameters = AdvancedQuantizationParameters(quantize_outputs=True)
When I quantize the model I get a separate quantizer for each input:
ModuleDict(
  (/nncf_model_input_0|OUTPUT): AsymmetricQuantizer(bit=8, ch=False)
  (/nncf_model_input_1|OUTPUT): AsymmetricQuantizer(bit=8, ch=False)
)
Based on what I saw in NNCF, I would expect to get something like this:
ModuleDict(
  (/nncf_model_input_0|OUTPUT;/nncf_model_input_1|OUTPUT): AsymmetricQuantizer(bit=8, ch=False)
)
I am guessing this is an edge case that comes up due to AdvancedQuantizationParameters.
NNCF version: 2.6.0
Run the following to reproduce:
import nncf
import torch
import numpy as np
from nncf.quantization.advanced_parameters import AdvancedQuantizationParameters


class DummyDataset(torch.utils.data.Dataset):
    """Generates random inputs with the given shapes and names."""

    def __init__(self, input_shapes, input_names):
        self.input_shapes = input_shapes
        self.input_names = input_names

    def __len__(self):
        return 1

    def __getitem__(self, index):
        return {
            self.input_names[i]: np.random.rand(*input_shape)
            for i, input_shape in enumerate(self.input_shapes)
        }


class DummyModel(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.conv = torch.nn.Conv2d(6, 6, 3, 1, 1)

    def forward(self, x, y):
        x_cat_y = torch.cat((x, y), dim=1)
        return x_cat_y
        # return self.conv(x_cat_y)  # use this to verify that quantizers get grouped if concat isn't the output


def quantize_model(model, input_shapes, input_names):
    def transform_fn(data):
        # Drop the batch dimension added by the DataLoader and cast to float32.
        return tuple(data[key][0].to(torch.float32) for key in data)

    dummy_dl = torch.utils.data.DataLoader(DummyDataset(input_shapes, input_names))
    calibration_dataset = nncf.Dataset(dummy_dl, transform_fn)
    advanced_parameters = AdvancedQuantizationParameters(quantize_outputs=True)
    return nncf.quantize(
        model,
        calibration_dataset,
        subset_size=1,
        preset=nncf.QuantizationPreset.MIXED,
        advanced_parameters=advanced_parameters,
    )


def main():
    quantized_model = quantize_model(
        DummyModel(), [(1, 3, 256, 256), (1, 3, 256, 256)], ["x", "y"]
    )
    print()
    print(quantized_model._nncf.external_quantizers)


if __name__ == "__main__":
    main()
Greetings, @basioli-k! Thanks for spotting this, and for the detailed reproducer that makes debugging this a breeze.
The unexpected behaviour seems to be due to some logic introduced in https://github.com/openvinotoolkit/nncf/pull/1778. If I comment out the following lines: https://github.com/openvinotoolkit/nncf/blob/4d47869225868cea31707e90c4490dbcb2a388fd/nncf/common/quantization/quantizer_propagation/graph.py#L831-L832 the input quantizers get unified in both of your cases. We added that logic (only unifying concat scales if the concat is followed by a weighted op) in response to low PTQ accuracy on DenseNet and Inception, but IMO the concat input quantizers in the per-tensor case should be unified regardless of the ops that follow the concat. I will investigate how best to fix this on the develop branch.
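To make the motivation for unification concrete, here is a minimal NumPy sketch, not NNCF internals: the fake_quant helper and the scale choices are illustrative assumptions. It shows that when two per-tensor-quantized inputs feed a concat, the concatenated tensor only lies on a single quantization grid if both inputs share one scale.

import numpy as np

def fake_quant(x, scale, bits=8):
    """Symmetric per-tensor fake quantization: snap to the grid, then dequantize."""
    qmax = 2 ** (bits - 1) - 1
    return np.clip(np.round(x / scale), -qmax - 1, qmax) * scale

x = np.random.randn(1, 3, 4, 4).astype(np.float32)
y = 10.0 * np.random.randn(1, 3, 4, 4).astype(np.float32)  # much wider range than x

# Independent per-tensor scales: each concat input lands on its own grid, so no
# single per-tensor quantizer can represent the concatenated result exactly.
separate = np.concatenate(
    [fake_quant(x, np.abs(x).max() / 127), fake_quant(y, np.abs(y).max() / 127)],
    axis=1,
)

# Unified scale (the behaviour requested in this issue): one grid covers both
# inputs, so the concat output is itself exactly representable on that grid.
shared = max(np.abs(x).max(), np.abs(y).max()) / 127
unified = np.concatenate([fake_quant(x, shared), fake_quant(y, shared)], axis=1)

grid_levels = np.unique(np.round(unified / shared))  # all integers on one grid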
Thank you for the response. Looking forward to the fix :smiley:
Ref. 138683