Keeps Searching node when using nncf.IgnoredScope
🐛 Describe the bug
Hi,
I am trying to quantize my model with PTQ and apply nncf.IgnoredScope to gain some performance, following the yolov11-object-detection.ipynb notebook. The problem is that the subgraph search time depends heavily on how far apart the specified inputs and outputs are. If I specify relatively "adjacent" inputs and outputs, say /backbone/blocks.5/blocks.5.8/Add and /backbone/blocks.5/blocks.5.10/Add, the search procedure is fast. For inputs and outputs that are far apart, for example /backbone/blocks.5/blocks.5.8/Add and /backbone/blocks.5/blocks.5.17/Add, it keeps searching for days.
My conversion script:
import onnx
import torch
from torchvision import datasets
from torchvision import transforms
import nncf

# Calibration data: images resized to 512x512 and normalized to [-1, 1].
val_dataset = datasets.ImageFolder(
    root="./model_compression/calib_data/template_feature",
    transform=transforms.Compose([
        transforms.Resize([512, 512]),
        transforms.ToTensor(),
        transforms.Normalize(mean=(0.5, 0.5, 0.5), std=(0.5, 0.5, 0.5)),
    ]),
)
val_loader = torch.utils.data.DataLoader(val_dataset, batch_size=1, shuffle=False)

onnx_path = "./model_compression/models/template_feature/onnx/isc_ft_v107.onnx"
model = onnx_model = onnx.load(onnx_path)
input_name = onnx_model.graph.input[0].name


def transform_fn(data_item):
    # Drop the label and feed the image batch to the model's first input.
    images, _ = data_item
    return {input_name: images.numpy()}


# Subgraphs to exclude from quantization, given by their input and output nodes.
ignored_scope = nncf.IgnoredScope(
    subgraphs=[
        nncf.Subgraph(
            inputs=["/backbone/blocks.5/blocks.5.8/Add"],
            outputs=["/backbone/blocks.5/blocks.5.15/Add"],
        ),
        nncf.Subgraph(
            inputs=["/GlobalAveragePool", "/Shape"],  # /backbone/blocks.5/blocks.5.10/Add
            outputs=["/Div"],
        ),
    ]
)

calibration_dataset = nncf.Dataset(val_loader, transform_fn)
onnx_quantized_model = nncf.quantize(
    model,
    calibration_dataset,
    subset_size=200,
    preset=nncf.QuantizationPreset.MIXED,
    ignored_scope=ignored_scope,
)

int8_model_path = "models/template_feature/onnx/model_int8_200_ignore_508_515.onnx"
onnx.save(onnx_quantized_model, int8_model_path)
The model is vit-base from https://github.com/lucidrains/vit-pytorch. If you need any other information, please let me know. Thanks in advance!
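To narrow down where the slowdown starts, the output node could be moved away from the input node one block at a time. Below is a minimal sketch. It assumes that every block index in between has an Add node following the `/backbone/blocks.5/blocks.5.<i>/Add` naming seen above, which has not been checked against the actual graph. `add_node_name` and `candidate_spans` are hypothetical helpers written for this illustration, not part of NNCF.

```python
from typing import Iterator, Tuple


def add_node_name(stage: int, index: int) -> str:
    """Build an Add node name following the pattern used in this report."""
    return f"/backbone/blocks.{stage}/blocks.{stage}.{index}/Add"


def candidate_spans(stage: int, start: int, last: int) -> Iterator[Tuple[str, str]]:
    """Yield (input_name, output_name) pairs with a growing distance between them."""
    for end in range(start + 1, last + 1):
        yield add_node_name(stage, start), add_node_name(stage, end)


if __name__ == "__main__":
    for inp, out in candidate_spans(stage=5, start=8, last=11):
        print(inp, "->", out)
```

Each pair could then be passed as `nncf.Subgraph(inputs=[inp], outputs=[out])` and the `nncf.quantize` call timed, to see at which distance the search stops finishing in a reasonable time.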
Environment
PRETTY_NAME="Ubuntu 24.04.1 LTS"
NAME="Ubuntu"
VERSION_ID="24.04"
VERSION="24.04.1 LTS (Noble Numbat)"
VERSION_CODENAME=noble
ID=ubuntu
ID_LIKE=debian
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
UBUNTU_CODENAME=noble
LOGO=ubuntu-logo
Minimal Reproducible Example
No response
Are you going to submit a PR?
- [x] Yes I'd like to help by submitting a PR!
@andrey-churkin, please take a look.
@qiuzhewei Hi, thanks for reporting the issue. May I ask if you know which part of your model should be ignored (i.e., excluded from quantization)?
Could you please also provide the script showing how the isc_ft_v107.onnx model was obtained?
@andrey-churkin There is no specific part of the model that I want to quantize. I am running experiments to minimize the accuracy drop that PTQ introduces. I manually specify different parts of the model until the accuracy drop is acceptable, so this is basically a trade-off between accuracy and how much of the model is quantized.
I also tried quantize_with_accuracy_control, but it was very slow as well: it ran for two days and still did not converge. So my plan is to start from a "heavily" quantized model and gradually move to a "lightly" quantized one until I reach the boundary where the accuracy drop is minimal.
The model is obtained from https://github.com/lyakaap/ISC21-Descriptor-Track-1st/blob/master/isc_feature_extractor/model.py; isc_ft_v107.onnx is the ONNX version of the pretrained model referenced on line 11 of that file.
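The heavy-to-light search described above could be sketched roughly as follows. This is an illustration under stated assumptions, not NNCF functionality: `first_acceptable_scope` is a hypothetical helper, the candidate scopes are assumed to be ordered from most quantized to least quantized, and `quantize_and_score` stands in for a user-supplied function that quantizes with a given scope and returns a validation metric where higher is better.

```python
from typing import Callable, Optional, Sequence, Tuple, TypeVar

Scope = TypeVar("Scope")


def first_acceptable_scope(
    scopes: Sequence[Scope],
    quantize_and_score: Callable[[Scope], float],
    baseline_score: float,
    max_drop: float,
) -> Optional[Tuple[Scope, float]]:
    """Return the first (scope, score) whose drop from the baseline is within max_drop.

    `scopes` is assumed to be ordered from "heavy" (few ignored nodes) to
    "light" (many ignored nodes), so the first hit keeps the most quantization.
    """
    for scope in scopes:
        score = quantize_and_score(scope)
        if baseline_score - score <= max_drop:
            return scope, score
    return None  # No candidate met the accuracy target.


if __name__ == "__main__":
    # Made-up scores, for demonstration only; not measured results.
    fake_scores = {"heavy": 0.80, "medium": 0.88, "light": 0.895}
    result = first_acceptable_scope(
        ["heavy", "medium", "light"],
        quantize_and_score=fake_scores.__getitem__,
        baseline_score=0.90,
        max_drop=0.03,
    )
    print(result)
```

In real use, `quantize_and_score` would wrap the `nncf.quantize(..., ignored_scope=scope)` call from the script above and then evaluate the resulting model on a validation set. Each candidate requires a full quantization run, so this only helps when the subgraph search itself finishes in reasonable time.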