coremltools 8.0b1 linear_quantize_weights is not working for models converted with minimum_deployment_target=ct.target.iOS18
🐞Describing the bug
I'm testing 8.0 beta 1 for linear quantization. If the model is converted with minimum_deployment_target=ct.target.iOS18, linear_quantize_weights does not appear to perform weight quantization (checked by loading the model in Xcode: storage remains Float16, and Compute is missing Int8).
If I change minimum_deployment_target to iOS17, the quantization is correct and Xcode shows the model with Int8.
Stack Trace
To Reproduce
# iOS18 TEST: linear quant int8, int4
import torch
import torch.nn as nn
import torch.nn.functional as F
import coremltools as ct
import coremltools.optimize as cto
import numpy as np

SIZE = 224

# Set the seed for reproducibility
seed = 42
torch.manual_seed(seed)

# Define a simple layer module we'll reuse in our network.
class Layer(nn.Module):
    def __init__(self, in_channels: int, out_channels: int):
        super(Layer, self).__init__()
        self.linear = nn.Linear(in_channels, out_channels)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        batch_size, channels, height, width = x.shape
        x = x.view(batch_size, channels, -1)  # Flatten the height and width
        x = x.permute(0, 2, 1)  # Rearrange for the linear layer
        x = self.linear(x)
        x = x.permute(0, 2, 1).reshape(batch_size, -1, height, width)  # Reshape back to the original dimensions
        x = F.relu(x)
        x = F.max_pool2d(x, (2, 2))
        return x

# A simple network consisting of several base layers.
class SimpleNet(nn.Module):
    def __init__(self):
        super(SimpleNet, self).__init__()
        self.layer1 = Layer(3, 6)
        self.layer2 = Layer(6, 16)
        self.classifier = nn.Linear(16 * 56 * 56, 8)  # Initialize the classifier with the correct input size

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = self.layer1(x)
        x = self.layer2(x)
        x = 5.0 * x
        x = 6.3 * x
        x = x.reshape(x.size(0), -1)  # Flatten the tensor before the classifier
        x = self.classifier(x)
        x = F.log_softmax(x, dim=1)  # Use log_softmax for classification
        return x

# Create the model instance
model = SimpleNet()
model.eval()  # Set the model to evaluation mode

# Prepare an example input
example_input = torch.randn(1, 3, SIZE, SIZE)

# Trace the model
print("Trace the model")
try:
    traced_model = torch.jit.trace(model, example_input)
except Exception as e:
    print(f"Tracing failed: {e}")
    print("Attempting to script the model instead")
    traced_model = torch.jit.script(model)

print("Convert to CoreML (iOS17)")
coreml_model = ct.convert(
    traced_model,
    minimum_deployment_target=ct.target.iOS17,
    inputs=[ct.ImageType(name="input_1", shape=example_input.shape)],
)

print("Quantize model - int8")
config = cto.coreml.OptimizationConfig()
global_config = cto.coreml.OpLinearQuantizerConfig(
    mode="linear_symmetric", dtype=np.int8, weight_threshold=127
)
config.set_global(global_config)
Xmodel = cto.coreml.linear_quantize_weights(coreml_model, config)

print("Saving models")
Xmodel.save("newmodel-A81-8.mlpackage")
coreml_model.save("newmodel-A81-16.mlpackage")

print("Convert to CoreML 8.0 (iOS18)")
coreml_model = ct.convert(
    traced_model,
    minimum_deployment_target=ct.target.iOS18,
    inputs=[ct.ImageType(name="input_1", shape=example_input.shape)],
)

print("Quantize model - int4")
config = cto.coreml.OptimizationConfig()
global_config = cto.coreml.OpLinearQuantizerConfig(
    mode="linear_symmetric",
    dtype=ct.converters.mil.mil.types.int4,  # int4 requires the iOS18 deployment target
    weight_threshold=64,
)
config.set_global(global_config)
Xmodel = cto.coreml.linear_quantize_weights(coreml_model, config)
Xmodel.save("newmodel-A81-4.mlpackage")
print("CoreML model saved")
System environment (please complete the following information):
- coremltools version: 8.0b1
- OS (e.g. MacOS version or Linux type): macOS 15 beta
- Any other relevant version information (e.g. PyTorch or TensorFlow version):
Additional context
@dessatel, thank you so much for reporting this issue with the detailed steps.
I confirmed that I can reproduce this. I just want to mention that if you check the quantized model's disk size, it matches the iOS17 version. It's more likely an Xcode display issue, and we are working on it. Thanks!
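For anyone else who wants to run the same check, a minimal sketch of the disk-size comparison (the helper name is illustrative; the package names are the ones saved by the repro script above):

import os

def mlpackage_size(path: str) -> int:
    # An .mlpackage is a directory; sum the sizes of every file inside it.
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            total += os.path.getsize(os.path.join(root, name))
    return total

# Package names as saved by the repro script above.
for pkg in ("newmodel-A81-16.mlpackage", "newmodel-A81-8.mlpackage"):
    print(pkg, mlpackage_size(pkg) / 1024, "KiB")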
Thank you. Could you confirm that int4 works as well? The iOS18 target is definitely required in that case, which was the main motivation for the test, and it was producing the same issue. Is there a way to check a model's supported data types via some metadata call from coremltools?
You could use the model size to check whether int4 is actually used (for example, it should be around 1/2 of the int8 model's size).
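With a helper like the mlpackage_size sketch above, the ratio check could be:

int8_size = mlpackage_size("newmodel-A81-8.mlpackage")
int4_size = mlpackage_size("newmodel-A81-4.mlpackage")
# For a weight-dominated model, a ratio of roughly 0.5 indicates int4 storage is in use.
print(f"int4/int8 size ratio: {int4_size / int8_size:.2f}")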
Indeed, the size is 1/2 for 4-bit linear quantization, and the MD5 of weight.bin is the same for the iOS17 and iOS18 exports at 8 bits. Exporting the model's spec, I can see an INT4 reference for the 4-bit quantization, so an Xcode issue is likely: value { arguments { value { type { tensorType { dataType: INT4 rank: 2 dimensions { constant { size: 16 } } dimensions { constant { size: 6 } } } } blobFileValue { fileName: "@model_path/weights/weight.bin" offset: 192 } }
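On the metadata question: I'm not aware of a single coremltools call that reports the stored weight dtypes, but a rough sketch along these lines walks the MIL program proto behind get_spec() and prints the declared type of every op attribute (the helper name is illustrative, and the proto field layout is inferred from the spec dump above plus the MIL.proto schema):

import coremltools as ct
from coremltools.proto import MIL_pb2

def print_const_dtypes(path: str) -> None:
    # Load the package without compiling it, then walk the MIL program proto.
    mlmodel = ct.models.MLModel(path, skip_model_load=True)
    prog = mlmodel.get_spec().mlProgram
    for func_name, func in prog.functions.items():
        for block in func.block_specializations.values():
            for op in block.operations:
                for attr_name, value in op.attributes.items():
                    if not value.type.HasField("tensorType"):
                        continue
                    dtype = MIL_pb2.DataType.Name(value.type.tensorType.dataType)
                    where = "blob" if value.HasField("blobFileValue") else "immediate"
                    print(f"{func_name}: {op.type}.{attr_name} -> {dtype} ({where})")

print_const_dtypes("newmodel-A81-4.mlpackage")  # should list INT4 tensors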
FB14257215 Xcode beta 3 has the same issue
This issue is resolved in Xcode beta 4; both Int4 and Int8 are showing up correctly. Version 16.0 beta 4 (16A5211f). FB14257215