Core ML conversion hangs with a minimal pipeline when 16-bit cast passes are specified
🐞Describing the bug
When converting a model with a minimal pass pipeline, adding the 16-bit cast passes causes the conversion to hang indefinitely; the process can only be stopped with an external kill. The same hang occurs with the default pipeline when compute_precision=ct.precision.FLOAT16 is specified.
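For completeness, a sketch of the default-pipeline variant (no pass_pipeline argument), which hangs the same way; it assumes the same model and output names as the repro below:

# Sketch of the default-pipeline variant described above (no custom
# pass_pipeline); assumes the same `model` and output names as the repro below.
import coremltools as ct

ml_model = ct.convert(
    model,
    outputs=[
        ct.TensorType(name="depth_b1hw"),
        ct.TensorType(name="mask_logits_b1hw"),
    ],
    compute_units=ct.ComputeUnit.ALL,
    compute_precision=ct.precision.FLOAT16,  # FLOAT16 with the default pipeline also hangs
    minimum_deployment_target=ct.target.iOS17,
)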
To Reproduce
import coremltools as ct

def convert_to_coreml_with_consistent_results(model):
    pipeline = ct.PassPipeline(
        pass_names=[
            "common::const_elimination",
            "common::const_deduplication",
            "common::remove_symbolic_reshape",
            "common::noop_elimination",
            "common::merge_consecutive_relus",
            "common::merge_consecutive_reshapes",
            "common::merge_consecutive_transposes",
            "common::dedup_op_and_var_names",
            "common::dead_code_elimination",  # always end with dce
        ],
        pipeline_name="minimal_pipeline",
    )
    ml_model = ct.convert(
        model,
        outputs=[
            ct.TensorType(name="depth_b1hw"),
            ct.TensorType(name="mask_logits_b1hw"),
        ],
        debug=False,
        compute_units=ct.ComputeUnit.ALL,
        compute_precision=ct.precision.FLOAT32,
        minimum_deployment_target=ct.target.iOS17,
        pass_pipeline=pipeline,
    )
    # iOS17 targets produce an mlprogram, which must be saved as .mlpackage
    ml_model.save("ConsistentResults.mlpackage")
def convert_to_coreml_hangs(model):
    pipeline = ct.PassPipeline(
        pass_names=[
            "common::const_elimination",
            "common::const_deduplication",
            "common::remove_symbolic_reshape",
            "common::noop_elimination",
            "common::merge_consecutive_relus",
            "common::merge_consecutive_reshapes",
            "common::merge_consecutive_transposes",
            "common::dedup_op_and_var_names",
            "common::add_fp16_cast",
            "common::add_int16_cast",
            "common::update_output_dtypes",
            "common::dead_code_elimination",
        ],
        pipeline_name="minimal_pipeline",
    )
    ml_model = ct.convert(
        model,
        outputs=[
            ct.TensorType(name="depth_b1hw"),
            ct.TensorType(name="mask_logits_b1hw"),
        ],
        compute_units=ct.ComputeUnit.ALL,
        compute_precision=ct.precision.FLOAT16,
        minimum_deployment_target=ct.target.iOS17,
        pass_pipeline=pipeline,
    )
    # we never get here, but just in case
    ml_model.save("Hangs.mlpackage")
System environment (please complete the following information):
- coremltools version: 8.3.0
- OS (e.g. MacOS version or Linux type): macOS 15.7.1 (24G231)
- Any other relevant version information (e.g. PyTorch or TensorFlow version): torch==2.5.1 torchvision==0.20.1
I'm attaching a sample of the process trace (Sample of python3.12 model conversion.txt). It looks like the model verification in ANECompiler is taking too long.
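For reference, a sketch of how such a sample can be captured while the conversion is hung (assumes macOS's /usr/bin/sample tool; the PID is hypothetical and must be looked up, e.g. with pgrep):

# Sketch: capture a 10-second sample of the hung conversion process.
# Assumptions: macOS's /usr/bin/sample is available, and HUNG_PID is the
# (hypothetical) PID of the hung Python process, e.g. from `pgrep -f python`.
import subprocess

HUNG_PID = 12345  # hypothetical; replace with the real PID
subprocess.run(
    ["sample", str(HUNG_PID), "10", "-file", "model_conversion_sample.txt"],
    check=True,
)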
Coremltools 8.3.0 is several months old. We recently released coremltools 9.0. Do you still get this issue with coremltools 9.0?
With the coremltools 9.0 official release I have the same issue. I took another sample and it's hanging in the same spot.
Can you reproduce this issue with a toy model (i.e. a model defined via a small amount of code which can be copy and pasted)? Without your model there is no way for others to reproduce the issue.
I'd also recommend narrowing it down to a single pass, if you can.
Is this still an issue on macOS 26?
I'll work on a minimal example. As for a single pass: adding common::add_fp16_cast causes it to hang.
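For anyone repeating this narrowing-down step, a rough sketch of the bisection: start from the known-good minimal pipeline and append one candidate pass at a time. convert_once is a hypothetical stub standing in for the ct.convert call from the repro above.

# Sketch: append one candidate pass at a time to the known-good minimal
# pipeline to find which pass triggers the hang.
import coremltools as ct

BASE_PASSES = [
    "common::const_elimination",
    "common::const_deduplication",
    "common::remove_symbolic_reshape",
    "common::noop_elimination",
    "common::merge_consecutive_relus",
    "common::merge_consecutive_reshapes",
    "common::merge_consecutive_transposes",
    "common::dedup_op_and_var_names",
]
CANDIDATES = ["common::add_fp16_cast", "common::add_int16_cast", "common::update_output_dtypes"]

def convert_once(pipeline):
    # hypothetical: run the ct.convert call from the repro above with `pipeline`
    ...

for candidate in CANDIDATES:
    pipeline = ct.PassPipeline(
        pass_names=BASE_PASSES + [candidate, "common::dead_code_elimination"],
        pipeline_name="bisect_pipeline",
    )
    print(f"trying {candidate} ...")
    convert_once(pipeline)  # hangs when candidate is common::add_fp16_cast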
I just upgraded to macOS 26 and the same problem is still present.
@TobyRoseman here is a standalone example that reproduces the problem on my machine:
import torch
import numpy as np
import coremltools as ct

class Model(torch.nn.Module):
    def __init__(self):
        super().__init__()

    def forward(self, a, b):
        return torch.einsum("bchw,bkchw->bhw", a, b)
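# Note: this einsum contracts over both k and c, i.e. it is equivalent
# (a sketch; an assumption, not tested as a workaround for the hang) to:
#     (a.unsqueeze(1) * b).sum(dim=(1, 2))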
def convert_to_coreml_with_consistent_results(model):
    pipeline = ct.PassPipeline(
        pass_names=[
            "common::const_elimination",
            "common::const_deduplication",
            "common::remove_symbolic_reshape",
            "common::noop_elimination",
            "common::merge_consecutive_relus",
            "common::merge_consecutive_reshapes",
            "common::merge_consecutive_transposes",
            "common::dedup_op_and_var_names",
            "common::dead_code_elimination",  # always end with dce
        ],
        pipeline_name="minimal_pipeline",
    )
    ml_model = ct.convert(
        model,
        inputs=[
            ct.TensorType(name="a", shape=(1, 16, 240, 320), dtype=np.float32),
            ct.TensorType(name="b", shape=(1, 4, 16, 240, 320), dtype=np.float32),
        ],
        outputs=[
            ct.TensorType(name="result_bhw"),
        ],
        debug=False,
        compute_units=ct.ComputeUnit.ALL,
        compute_precision=ct.precision.FLOAT32,
        minimum_deployment_target=ct.target.iOS17,
        pass_pipeline=pipeline,
    )
    ml_model.save("ConsistentResults.mlpackage")

def convert_to_coreml_hangs(model):
    pipeline = ct.PassPipeline(
        pass_names=[
            "common::const_elimination",
            "common::const_deduplication",
            "common::remove_symbolic_reshape",
            "common::noop_elimination",
            "common::merge_consecutive_relus",
            "common::merge_consecutive_reshapes",
            "common::merge_consecutive_transposes",
            "common::dedup_op_and_var_names",
            "common::add_fp16_cast",
            "common::add_int16_cast",
            "common::update_output_dtypes",
            "common::dead_code_elimination",
        ],
        pipeline_name="minimal_pipeline",
    )
    ml_model = ct.convert(
        model,
        inputs=[
            ct.TensorType(name="a", shape=(1, 16, 240, 320), dtype=np.float32),
            ct.TensorType(name="b", shape=(1, 4, 16, 240, 320), dtype=np.float32),
        ],
        outputs=[
            ct.TensorType(name="result_bhw"),
        ],
        compute_units=ct.ComputeUnit.ALL,
        compute_precision=ct.precision.FLOAT16,
        minimum_deployment_target=ct.target.iOS17,
        pass_pipeline=pipeline,
    )
    # we never get here, but just in case
    ml_model.save("Hangs.mlpackage")

if __name__ == "__main__":
    model = Model()
    example_inputs = (torch.randn(1, 16, 240, 320), torch.randn(1, 4, 16, 240, 320))
    traced_model = torch.export.export(model, example_inputs, dynamic_shapes=None, strict=True)
    convert_to_coreml_with_consistent_results(traced_model)
    convert_to_coreml_hangs(traced_model)
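One more data point that may help triage (a sketch, assuming the traced_model from the script above): converting with skip_model_load=True, an existing ct.convert option, runs the same pass pipeline but skips the Core ML compile/load step, so a quick return here would point at ANECompiler rather than the MIL passes.

# Diagnostic sketch (assumption: the hang is in ANECompiler at model load
# time, per the process sample). skip_model_load=True runs the same pass
# pipeline but skips compiling/loading the model.
import numpy as np
import coremltools as ct

ml_model = ct.convert(
    traced_model,  # the torch.export program from the script above
    inputs=[
        ct.TensorType(name="a", shape=(1, 16, 240, 320), dtype=np.float32),
        ct.TensorType(name="b", shape=(1, 4, 16, 240, 320), dtype=np.float32),
    ],
    outputs=[ct.TensorType(name="result_bhw")],
    compute_units=ct.ComputeUnit.ALL,
    compute_precision=ct.precision.FLOAT16,
    minimum_deployment_target=ct.target.iOS17,
    skip_model_load=True,  # skip Core ML compilation/load
)
ml_model.save("Hangs_skip_load.mlpackage")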
This issue may also be linked to the Apple Feedback report I submitted; the reason we are using this particular mix of passes is that the default pipeline produces inconsistent results across iPhone models.
https://feedbackassistant.apple.com/feedback/21000989
@TobyRoseman is this an issue with coremltools or with Core ML itself?