
coremltools 8.0b1 linear_quantize_weights is not working for models converted with minimum_deployment_target=ct.target.iOS18

Open dessatel opened this issue 1 year ago • 4 comments

🐞Describing the bug

  • Make sure to only create an issue here for bugs in the coremltools Python package. If this is a bug with the Core ML Framework or Xcode, please submit your bug here: https://developer.apple.com/bug-reporting/
  • Provide a clear and concise description of the bug.

I'm testing 8.0 beta 1 for linear quantization. If the model is converted with minimum_deployment_target=ct.target.iOS18, linear_quantize_weights does not appear to perform weight quantization (checked by loading the model in Xcode: storage remains Float16, and Int8 is missing from the compute info).

If I change minimum_deployment_target to iOS17, quantization is correct and Xcode shows the model with Int8.

Stack Trace

  • If applicable, please paste the complete stack trace.

To Reproduce

  • Please add a minimal code example that can reproduce the error when running it.
# iOS18 test: linear quantization to int8 and int4
import torch
import torch.nn as nn
import torch.nn.functional as F
import coremltools as ct
import coremltools.optimize as cto
import numpy as np

SIZE = 224
# Set the seed for reproducibility
seed = 42

# Define a simple layer module we'll reuse in our network.
class Layer(nn.Module):
    def __init__(self, in_channels: int, out_channels: int):
        super(Layer, self).__init__()
        self.linear = nn.Linear(in_channels, out_channels)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        batch_size, channels, height, width = x.shape
        x = x.view(batch_size, channels, -1)  # Flatten the height and width
        x = x.permute(0, 2, 1)  # Rearrange for linear layer
        x = self.linear(x)
        x = x.permute(0, 2, 1).reshape(batch_size, -1, height, width)  # Reshape back to the original dimensions
        x = F.relu(x)
        x = F.max_pool2d(x, (2, 2))
        return x

# A simple network consisting of several base layers.
class SimpleNet(nn.Module):
    def __init__(self):
        super(SimpleNet, self).__init__()
        self.layer1 = Layer(3, 6)
        self.layer2 = Layer(6, 16)
        self.classifier = nn.Linear(16 * 56 * 56, 8)  # Initialize the classifier with correct input size

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = self.layer1(x)
        x = self.layer2(x)
        x = 5.0 * x
        x = 6.3 * x
        x = x.reshape(x.size(0), -1)  # Flatten the tensor before the classifier
        x = self.classifier(x)
        x = F.log_softmax(x, dim=1)  # Use log_softmax for classification
        return x

# Create the model instance
model = SimpleNet()
model.eval()  # Set the model to evaluation mode

# Prepare example input
example_input = torch.randn(1, 3, SIZE, SIZE)

# Trace the model
print("Trace the model")
try:
    traced_model = torch.jit.trace(model, example_input)
except Exception as e:
    print(f"Tracing failed: {e}")
    print("Attempting to script the model instead")
    traced_model = torch.jit.script(model)

print("Convert to CoreML")
coreml_model = ct.convert(
    traced_model,
    minimum_deployment_target=ct.target.iOS17,
    inputs=[ct.ImageType(name="input_1", shape=example_input.shape)],
)


print("Quantize model -8")
config = cto.coreml.OptimizationConfig()
global_config = cto.coreml.OpLinearQuantizerConfig(
    mode="linear_symmetric", dtype=np.int8, weight_threshold=127
)
config.set_global(global_config)
Xmodel = cto.coreml.linear_quantize_weights(coreml_model, config)

print("Saving models")
Xmodel.save("newmodel-A81-8.mlpackage")
coreml_model.save("newmodel-A81-16.mlpackage")


print("Convert to CoreML 8.0/iOS18")
coreml_model = ct.convert(
    traced_model,
    minimum_deployment_target=ct.target.iOS18,
    inputs=[ct.ImageType(name="input_1", shape=example_input.shape)],
)

print("Quantize model-4")
config = cto.coreml.OptimizationConfig()
global_config = cto.coreml.OpLinearQuantizerConfig(
    mode="linear_symmetric", dtype=ct.converters.mil.mil.types.int8, weight_threshold=64
)

config.set_global(global_config)
Xmodel = cto.coreml.linear_quantize_weights(coreml_model, config)
Xmodel.save("newmodel-A81-4.mlpackage")


print("CoreML model saved")

  • If the model conversion succeeds, but there is a numerical mismatch in predictions, please include the code used for comparisons.

System environment (please complete the following information):

  • coremltools version: 8.0b1
  • OS (e.g. MacOS version or Linux type): macOS 15 beta
  • Any other relevant version information (e.g. PyTorch or TensorFlow version):

Additional context

  • Add anything else about the problem here that you want to share.

dessatel · Jun 21 '24 01:06

@dessatel, Thank you so much for reporting this issue with the detailed steps.

I confirmed that I can reproduce this. Just want to mention that if you check the quantized model's disk size, it matches the iOS17 version. It's more of an Xcode display issue, and we are working on it. Thanks!
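
For example, a quick way to check the weight size on disk (a sketch that assumes the default .mlpackage layout, with the weight blob stored at Data/com.apple.CoreML/weights/weight.bin):

from pathlib import Path

def weights_size(mlpackage: str) -> int:
    # Assumed default weight-blob location inside an .mlpackage.
    blob = Path(mlpackage) / "Data" / "com.apple.CoreML" / "weights" / "weight.bin"
    return blob.stat().st_size

# If quantization ran, the int8 package should be roughly half the Float16 one.
print(weights_size("newmodel-A81-8.mlpackage"))
print(weights_size("newmodel-A81-16.mlpackage"))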

junpeiz · Jun 21 '24 17:06

Thank you! Could you confirm that int4 works as well? The iOS18 target is definitely required in that case; it was the main motivation for the test, and it was producing the same issue. Is there a way to check a model's supported data types via some metadata call from coremltools?

dessatel · Jun 21 '24 17:06

You could use the model size to check whether int4 is actually used (for example, it should be around half the int8 model size).
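
For instance, something along these lines, reusing the packages saved by the repro script above (the hard-coded blob path is an assumption about the default .mlpackage layout):

from pathlib import Path

def weights_size(mlpackage: str) -> int:
    # Assumed default weight-blob location inside an .mlpackage.
    return (Path(mlpackage) / "Data" / "com.apple.CoreML" / "weights" / "weight.bin").stat().st_size

ratio = weights_size("newmodel-A81-4.mlpackage") / weights_size("newmodel-A81-8.mlpackage")
print(f"int4/int8 weight size ratio: {ratio:.2f}")  # expect roughly 0.5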

junpeiz · Jun 21 '24 19:06

Indeed, the size is 1/2 for the 4-bit linear quant, and the MD5 of weight.bin is the same for the iOS17 and iOS18 exports at 8 bits. Exporting the model's spec, I can see an INT4 reference for the 4-bit quant, so this is likely an Xcode issue:

value { arguments { value { type { tensorType { dataType: INT4 rank: 2 dimensions { constant { size: 16 } } dimensions { constant { size: 6 } } } } blobFileValue { fileName: "@model_path/weights/weight.bin" offset: 192 } }
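
For reference, a minimal way to dump the spec and tally the stored tensor data types (a rough sketch: it just scans the textual proto rather than walking the MIL program, and skip_model_load avoids compiling the model just to read it):

from collections import Counter
import coremltools as ct

mlmodel = ct.models.MLModel("newmodel-A81-4.mlpackage", skip_model_load=True)
spec_text = str(mlmodel.get_spec())

# Count dataType entries, e.g. 'dataType: INT4' vs 'dataType: FLOAT16'.
print(Counter(line.strip() for line in spec_text.splitlines() if "dataType:" in line))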

dessatel · Jun 22 '24 07:06

FB14257215: Xcode 16 beta 3 has the same issue.

dessatel · Jul 10 '24 00:07

This issue is resolved in Xcode 16 beta 4 (Version 16.0 beta 4, 16A5211f): both Int4 and Int8 are showing up correctly. FB14257215

dessatel · Jul 24 '24 02:07