Quantization hangs, even for 100 images
Hardware:
- GPU: Intel(R) UHD Graphics 620
- Processor: Intel(R) Core(TM) i5-8250U CPU @ 1.60GHz, 1800 MHz, 4 Core(s), 8 Logical Processor(s)
- OS: Windows 10

Software:
- Python 3.10.0
- nncf 2.17.0
- openvino 2025.2.0
I am trying to quantize codeformer, a float32, opset 13, ONNX model to INT8 format for a performance gain.

Model inputs:
- Name: `x`, Shape: [0, 3, 512, 512], Data Type: FLOAT
- Name: `w`, Shape: [], Data Type: DOUBLE
A collection of 200 images (512x512) has been created. Here is the script:
```python
import os

import numpy as np
from PIL import Image
import onnx

import nncf
from nncf import Dataset

# Load the ONNX model
onnx_model = onnx.load("codeformer_opset13.onnx")

def preload_calibration_dataset(images_dir):
    data = []
    for filename in os.listdir(images_dir):
        if filename.lower().endswith((".png", ".jpg", ".jpeg")):
            img_path = os.path.join(images_dir, filename)
            img = Image.open(img_path).convert("RGB").resize((512, 512))
            img_np = np.array(img).astype(np.float32) / 255.0
            img_np = img_np.transpose(2, 0, 1)       # HWC -> CHW
            img_np = np.expand_dims(img_np, axis=0)  # [1, 3, 512, 512]
            data.append({
                "x": img_np,
                "w": np.array(1.0, dtype=np.float64),  # scalar double input
            })
    print(f"Loaded {len(data)} images for calibration.")
    return data

# Wrap with nncf.Dataset
dataset = Dataset(preload_calibration_dataset("faces/class1"))

# Run quantization
quantized_model = nncf.quantize(onnx_model, dataset)

# Save the quantized ONNX model
onnx.save(quantized_model, "codeformer_int8.onnx")
```
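As a sanity check on the preprocessing alone (independent of NNCF and the model), the same transform can be verified on a synthetic array; the random input here is just a stand-in for a decoded image:

```python
import numpy as np

# Synthetic stand-in for a decoded 512x512 RGB image (uint8, HWC layout).
img = np.random.randint(0, 256, size=(512, 512, 3), dtype=np.uint8)

x = img.astype(np.float32) / 255.0   # scale to [0, 1]
x = x.transpose(2, 0, 1)             # HWC -> CHW
x = np.expand_dims(x, axis=0)        # add batch dim -> [1, 3, 512, 512]

assert x.shape == (1, 3, 512, 512)
assert x.dtype == np.float32
assert 0.0 <= x.min() and x.max() <= 1.0
```

This confirms the calibration inputs match the model's declared `x` input layout, so a shape or dtype mismatch can be ruled out as the cause of the hang.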
The process starts and immediately hangs my laptop as soon as statistics collection starts processing the first image. Here is the captured image:
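For reference, preloading the whole calibration set keeps every input tensor in RAM at once. A rough back-of-the-envelope estimate, assuming the 200 float32 tensors of shape [1, 3, 512, 512] built above:

```python
# Memory held by the preloaded calibration inputs alone
# (NNCF's statistics collection adds intermediate activations on top).
num_images = 200
bytes_per_float32 = 4
tensor_elems = 1 * 3 * 512 * 512          # elements per input tensor

bytes_per_tensor = tensor_elems * bytes_per_float32   # 3 MiB per image
total_bytes = num_images * bytes_per_tensor

print(f"{total_bytes / 2**20:.0f} MiB for inputs alone")  # prints "600 MiB for inputs alone"
```

So the inputs themselves are only ~600 MiB; the hang is more likely dominated by the activation statistics NNCF gathers while running the model.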
This is my very first time using NNCF. Am I missing something here, or is it just a hardware limitation causing the issue?
@andrey-churkin , please analyze it.
Hi @prashant-saxena,
Thanks for reporting the issue. Could you please provide more information about the following:
- How was the `codeformer_opset13.onnx` model obtained? Were any scripts or guidelines used?
- How much RAM does your system have?
I have been using codeformer.onnx for quite a long time, I think since its release. Back then I did a conversion from .pth to .onnx, and later to OpenVINO, from this version. I teach the basics of Machine Learning and AI to school and university students. Unfortunately, most of them don't have access to a professional graphics card; they have a decent laptop with a multi-core Intel CPU, an integrated GPU, and 8-16 GB of RAM. Based on personal experience, OpenVINO performs best on the above-mentioned hardware. float16 models perform faster than float32, which is why I decided to push the limit by testing INT8 conversion. I tried the above Python code on two systems with:
- Processor: Intel(R) Core(TM) i5-8250U CPU @ 1.60GHz, 1800 MHz, 4 Core(s), 8 Logical Processor(s)
- 8 & 16 GB of RAM

On both of them, the script crashes the system. NNCF was complaining about the ONNX model we were using because of its lower opset version, so I increased the opset version to 13:
```python
import onnx
from onnx import version_converter

onnx_model = onnx.load("codeformer.onnx")
converted = version_converter.convert_version(onnx_model, 13)
onnx.save(converted, "codeformer_opset13.onnx")
```
I also have another issue, 26391, and I believe these two might be related. Please have a look.
I have also tried an iterator to feed the data to NNCF one item at a time, but that also failed:
```python
# Create a simple custom dataset generator
def calibration_dataset(images_dir, input_shape=(3, 512, 512)):
    for filename in os.listdir(images_dir):
        if filename.lower().endswith((".png", ".jpg", ".jpeg")):
            img_path = os.path.join(images_dir, filename)
            img = Image.open(img_path).convert("RGB").resize((512, 512))
            img_np = np.array(img).astype(np.float32) / 255.0
            img_np = img_np.transpose(2, 0, 1)       # HWC -> CHW
            img_np = np.expand_dims(img_np, axis=0)  # [1, 3, 512, 512]
            yield {
                "x": img_np,
                "w": np.array(1.0, dtype=np.float64),  # scalar double input
            }
```