Keeps Searching node when using nncf.IgnoredScope

Open qiuzhewei opened this issue 1 month ago • 3 comments

🐛 Describe the bug

Hi, I am trying to quantize my model using PTQ and apply nncf.IgnoredScope to gain some performance, following yolov11-object-detection.ipynb. The problem is that when I specify relatively "adjacent" inputs and outputs, say /backbone/blocks.5/blocks.5.8/Add and /backbone/blocks.5/blocks.5.10/Add, the search procedure is fast, whereas for inputs and outputs that are far apart, for example /backbone/blocks.5/blocks.5.8/Add and /backbone/blocks.5/blocks.5.17/Add, it keeps searching for days. My conversion script:

import onnx
import torch
from torchvision import datasets
from torchvision import transforms

import nncf


# Calibration images, resized and normalized to [-1, 1].
val_dataset = datasets.ImageFolder(
    root="./model_compression/calib_data/template_feature",
    transform=transforms.Compose([
        transforms.Resize([512, 512]),
        transforms.ToTensor(),
        transforms.Normalize(mean=(0.5, 0.5, 0.5), std=(0.5, 0.5, 0.5)),
    ]),
)
val_loader = torch.utils.data.DataLoader(val_dataset, batch_size=1, shuffle=False)

onnx_path = "./model_compression/models/template_feature/onnx/isc_ft_v107.onnx"
model = onnx_model = onnx.load(onnx_path)
input_name = onnx_model.graph.input[0].name

def transform_fn(data_item):
    images, _ = data_item
    return {input_name: images.numpy()}

# Exclude two subgraphs from quantization; each is bounded by input and output node names.
ignored_scope = nncf.IgnoredScope(
    subgraphs=[
        nncf.Subgraph(
            inputs=["/backbone/blocks.5/blocks.5.8/Add"],
            outputs=["/backbone/blocks.5/blocks.5.15/Add"],
        ),
        nncf.Subgraph(
            inputs=["/GlobalAveragePool", "/Shape"],  # /backbone/blocks.5/blocks.5.10/Add
            outputs=["/Div"],
        ),
    ]
)

calibration_dataset = nncf.Dataset(val_loader, transform_fn)
onnx_quantized_model = nncf.quantize(
    model,
    calibration_dataset,
    subset_size=200,
    preset=nncf.QuantizationPreset.MIXED,
    ignored_scope=ignored_scope,
)


int8_model_path = "models/template_feature/onnx/model_int8_200_ignore_508_515.onnx"
onnx.save(onnx_quantized_model, int8_model_path)

The model is vit-base from https://github.com/lucidrains/vit-pytorch. If you need other information, please let me know. Thanks in advance!
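One possible explanation for the slowdown, stated here as an assumption and not confirmed anywhere in this thread: if the nodes between a subgraph's inputs and outputs are collected by enumerating simple paths, every residual block between the two nodes at least doubles the number of paths. The toy graph below is hypothetical (the node names `Add_i` and `Body_i` are invented) and is not NNCF's implementation; it only shows how quickly the path count grows with the distance between the two Add nodes.

```python
from typing import Dict, List


def build_residual_chain(num_blocks: int) -> Dict[str, List[str]]:
    """Build a toy DAG of residual blocks as an adjacency list.

    Each block offers two routes from one Add node to the next:
    through the block body, or directly through the skip connection.
    """
    graph: Dict[str, List[str]] = {}
    for i in range(num_blocks):
        src, body, dst = f"Add_{i}", f"Body_{i + 1}", f"Add_{i + 1}"
        graph[src] = [body, dst]  # main branch and skip connection
        graph[body] = [dst]
    graph[f"Add_{num_blocks}"] = []
    return graph


def count_simple_paths(graph: Dict[str, List[str]], start: str, end: str) -> int:
    """Count paths from start to end; in a DAG every path is simple."""
    if start == end:
        return 1
    return sum(count_simple_paths(graph, nxt, end) for nxt in graph[start])


if __name__ == "__main__":
    for distance in (2, 7, 9):
        chain = build_residual_chain(distance)
        paths = count_simple_paths(chain, "Add_0", f"Add_{distance}")
        print(f"{distance} blocks apart: {paths} paths")
```

In this toy model, two blocks give 4 paths and nine blocks give 512. A real exported graph has many more branches per block, so the growth would be far steeper if this assumption holds.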

Environment

requirements.txt

PRETTY_NAME="Ubuntu 24.04.1 LTS"
NAME="Ubuntu"
VERSION_ID="24.04"
VERSION="24.04.1 LTS (Noble Numbat)"
VERSION_CODENAME=noble
ID=ubuntu
ID_LIKE=debian
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
UBUNTU_CODENAME=noble
LOGO=ubuntu-logo

Minimal Reproducible Example

No response

Are you going to submit a PR?

  • [x] Yes I'd like to help by submitting a PR!

qiuzhewei avatar Nov 27 '25 06:11 qiuzhewei

@andrey-churkin , please take a look

MaximProshin avatar Nov 27 '25 08:11 MaximProshin

@qiuzhewei Hi, thanks for reporting the issue. May I ask if you know which part of your model should be ignored (i.e., excluded from quantization)?

Could you please also provide the script showing how the isc_ft_v107.onnx model was obtained?

andrey-churkin avatar Nov 27 '25 12:11 andrey-churkin

@andrey-churkin There is no specific part of the model that I want to quantize. I am running experiments to minimize the accuracy drop that PTQ introduces. I manually specify different parts of the model until the accuracy drop is acceptable, so this is basically a trade-off between accuracy and how much of the model is quantized. I also tried quantization_with_accuracy_control, but again it was super slow: it ran for two days and still did not converge. So my plan is to start from a "heavily" quantized model and move gradually toward a "lightly" quantized one until I reach the boundary that introduces a minimal accuracy drop.
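The heavy-to-light procedure described above can be sketched as a loop over candidate ignored scopes. This is only an illustration under assumptions: `quantize_and_score` is a hypothetical callback (it would wrap the quantization call and an evaluation run), and the node names in the usage example are placeholders, not names taken from the real model.

```python
from typing import Callable, Optional, Sequence


def smallest_acceptable_scope(
    candidates: Sequence[Sequence[str]],
    quantize_and_score: Callable[[Sequence[str]], float],
    baseline_score: float,
    max_drop: float,
) -> Optional[Sequence[str]]:
    """Return the first candidate whose accuracy drop stays within max_drop.

    candidates must be ordered from "heavy" quantization (few ignored nodes)
    to "light" quantization (many ignored nodes), so the first hit is the
    most aggressively quantized model that is still acceptable.
    """
    for ignored_names in candidates:
        score = quantize_and_score(ignored_names)  # hypothetical callback
        if baseline_score - score <= max_drop:
            return ignored_names
    return None  # no candidate met the accuracy budget


if __name__ == "__main__":
    # Placeholder scores standing in for real quantize-and-evaluate runs.
    fake_scores = {0: 0.80, 1: 0.86, 2: 0.895}
    candidates = [[], ["block_a"], ["block_a", "block_b"]]
    chosen = smallest_acceptable_scope(
        candidates,
        quantize_and_score=lambda names: fake_scores[len(names)],
        baseline_score=0.90,
        max_drop=0.01,
    )
    print(chosen)
```

Ordering the candidates from heavy to light means the loop stops as soon as the accuracy budget is met, which keeps the number of quantization runs small when each run is expensive.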

The model is obtained from https://github.com/lyakaap/ISC21-Descriptor-Track-1st/blob/master/isc_feature_extractor/model.py; isc_ft_v107.onnx is the ONNX version of the pretrained model on line 11.

qiuzhewei avatar Nov 28 '25 03:11 qiuzhewei