neural-compressor
'q_config' is needed when export an INT8 model
Hi,
I want to convert and quantize a PyTorch model to an ONNX model, following this example: https://github.com/intel/neural-compressor/blob/master/examples/pytorch/image_recognition/torchvision_models/export/fx/main.py. When I call the export function, I get the error "'q_config' is needed when export an INT8 model", but I don't see anything about q_config in the example code. How can I solve this issue?
Here is my code:

```python
import timm
import torch

from neural_compressor import PostTrainingQuantConfig, quantization
from neural_compressor.config import Torch2ONNXConfig
from neural_compressor.model import Model

if __name__ == "__main__":
    model = timm.create_model('resnet50.a1_in1k', pretrained=True)
    model = model.eval()

    val_dataset = SampleDataset('golden_image')  # custom calibration dataset
    val_loader = torch.utils.data.DataLoader(
        val_dataset,
        batch_size=1, shuffle=False,
        num_workers=1, pin_memory=True)

    conf = PostTrainingQuantConfig(approach='static')
    q_model = quantization.fit(model,
                               conf,
                               calib_dataloader=val_loader)  # Don't need tuning.

    int8_onnx_config = Torch2ONNXConfig(
        dtype="int8",
        opset_version=14,
        quant_format="QDQ",
        example_inputs=torch.randn(1, 3, 224, 224),
        input_names=['input'],
        output_names=['output'],
        dynamic_axes={"input": {0: "batch_size"},
                      "output": {0: "batch_size"}},
    )

    inc_model = Model(q_model)
    inc_model.export("resnet50_int8.onnx", int8_onnx_config)
```
Error message:

```
File "neural_compressor\experimental\export\torch2onnx.py", line 389, in torch_to_int8_onnx
    assert q_config is not None, "'q_config' is needed when export an INT8 model."
AssertionError: 'q_config' is needed when export an INT8 model.
```

Versions: `torch.__version__` is '2.2.2+cpu', `neural_compressor.__version__` is '2.6'.
@ZhangShuoAlreadyExists We will check it and give feedback as soon as possible!
@ZhangShuoAlreadyExists I created an example based on your code, and it passed. Could you refer to it?
Install packages:

```
pip install neural_compressor
pip install torch torchvision onnxruntime onnx
```
Code:

```python
import torch
import torchvision
import torchvision.transforms as transforms
import torchvision.datasets as datasets

from neural_compressor import PostTrainingQuantConfig
from neural_compressor import quantization
from neural_compressor.config import Torch2ONNXConfig

model = torchvision.models.resnet50(pretrained=True)
model = model.eval()

Transform = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize(mean=(0.5, 0.5, 0.5),
                         std=(0.5, 0.5, 0.5))])

# CIFAR-10 is used here only as a readily available calibration set.
test_data = datasets.CIFAR10(
    root="data",
    train=False,
    download=True,
    transform=Transform,
)
val_loader = torch.utils.data.DataLoader(
    test_data,
    batch_size=1, shuffle=False,
    num_workers=1, pin_memory=True)

conf = PostTrainingQuantConfig(approach='static')
q_model = quantization.fit(model,
                           conf,
                           calib_dataloader=val_loader)  # Don't need tuning.

int8_onnx_config = Torch2ONNXConfig(
    dtype="int8",
    opset_version=14,
    quant_format="QDQ",
    example_inputs=torch.randn(1, 3, 224, 224),
    input_names=['input'],
    output_names=['output'],
    dynamic_axes={"input": {0: "batch_size"},
                  "output": {0: "batch_size"}},
)

# Note: export directly from q_model, the object returned by quantization.fit,
# which carries the q_config required for INT8 export. Re-wrapping it first,
# as in `inc_model = Model(q_model)`, produces a new model object without
# q_config and raises the AssertionError above.
q_model.export("resnet50_int8.onnx", int8_onnx_config)
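```

To sanity-check the exported file, you can load it with onnxruntime and run a dummy batch through it. A minimal sketch, assuming the input/output names set in the Torch2ONNXConfig above:

```python
import numpy as np
import onnxruntime as ort

# Load the exported INT8 model and run one random batch.
sess = ort.InferenceSession("resnet50_int8.onnx",
                            providers=["CPUExecutionProvider"])
dummy = np.random.rand(1, 3, 224, 224).astype(np.float32)
(logits,) = sess.run(["output"], {"input": dummy})
print(logits.shape)  # expected: (1, 1000) for an ImageNet-pretrained ResNet-50
```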
@ZhangShuoAlreadyExists Could you confirm whether our answer resolves your issue?
Thank you!