Core ML conversion hangs with a minimal pipeline when 16-bit cast passes are specified
🐞Describing the bug
When converting a model with a minimal pass pipeline, adding the 16-bit cast passes causes the conversion to hang indefinitely; the process can only be stopped with an external kill. The same hang occurs with the default pipeline when compute_precision=ct.precision.FLOAT16 is specified.
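For completeness, a sketch of the default-pipeline variant (no pass_pipeline argument), which hangs the same way; it assumes the same model and output names as the repro below:

# Sketch of the default-pipeline variant described above (no custom
# pass_pipeline); assumes the same `model` and output names as the repro below.
import coremltools as ct

ml_model = ct.convert(
    model,
    outputs=[
        ct.TensorType(name="depth_b1hw"),
        ct.TensorType(name="mask_logits_b1hw"),
    ],
    compute_units=ct.ComputeUnit.ALL,
    compute_precision=ct.precision.FLOAT16,  # FLOAT16 with the default pipeline also hangs
    minimum_deployment_target=ct.target.iOS17,
)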
To Reproduce
import coremltools as ct

def convert_to_coreml_with_consistent_results(model):
    pipeline = ct.PassPipeline(
        pass_names=[
            "common::const_elimination",
            "common::const_deduplication",
            "common::remove_symbolic_reshape",
            "common::noop_elimination",
            "common::merge_consecutive_relus",
            "common::merge_consecutive_reshapes",
            "common::merge_consecutive_transposes",
            "common::dedup_op_and_var_names",
            "common::dead_code_elimination",  # always end with dce
        ],
        pipeline_name="minimal_pipeline",
    )
    ml_model = ct.convert(
        model,
        outputs=[
            ct.TensorType(name="depth_b1hw"),
            ct.TensorType(name="mask_logits_b1hw"),
        ],
        debug=False,
        compute_units=ct.ComputeUnit.ALL,
        compute_precision=ct.precision.FLOAT32,
        minimum_deployment_target=ct.target.iOS17,
        pass_pipeline=pipeline,
    )
    # iOS17 targets produce an mlprogram, which must be saved as .mlpackage
    ml_model.save("ConsistentResults.mlpackage")
def convert_to_coreml_hangs(model):
    pipeline = ct.PassPipeline(
        pass_names=[
            "common::const_elimination",
            "common::const_deduplication",
            "common::remove_symbolic_reshape",
            "common::noop_elimination",
            "common::merge_consecutive_relus",
            "common::merge_consecutive_reshapes",
            "common::merge_consecutive_transposes",
            "common::dedup_op_and_var_names",
            "common::add_fp16_cast",
            "common::add_int16_cast",
            "common::update_output_dtypes",
            "common::dead_code_elimination",
        ],
        pipeline_name="minimal_pipeline",
    )
    ml_model = ct.convert(
        model,
        outputs=[
            ct.TensorType(name="depth_b1hw"),
            ct.TensorType(name="mask_logits_b1hw"),
        ],
        compute_units=ct.ComputeUnit.ALL,
        compute_precision=ct.precision.FLOAT16,
        minimum_deployment_target=ct.target.iOS17,
        pass_pipeline=pipeline,
    )
    # we never get here, but just in case
    ml_model.save("Hangs.mlpackage")
System environment (please complete the following information):
- coremltools version: 8.3.0
- OS (e.g. MacOS version or Linux type): macOS 15.7.1 (24G231)
- Any other relevant version information (e.g. PyTorch or TensorFlow version): torch==2.5.1 torchvision==0.20.1
I'm attaching a sample of the process trace (Sample of python3.12 model conversion.txt). It looks like the model verification in ANECompiler is taking too long.
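For reference, a sketch of how such a sample can be captured while the conversion is hung (assumes macOS's /usr/bin/sample tool; the PID is hypothetical and must be looked up, e.g. with pgrep):

# Sketch: capture a 10-second sample of the hung conversion process.
# Assumptions: macOS's /usr/bin/sample is available, and HUNG_PID is the
# (hypothetical) PID of the hung Python process, e.g. from `pgrep -f python`.
import subprocess

HUNG_PID = 12345  # hypothetical; replace with the real PID
subprocess.run(
    ["sample", str(HUNG_PID), "10", "-file", "model_conversion_sample.txt"],
    check=True,
)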
Coremltools 8.3.0 is several months old. We recently released coremltools 9.0. Do you still get this issue with coremltools 9.0?
With the coremltools 9.0 official release I have the same issue. I took another sample and it's hanging in the same spot.
Can you reproduce this issue with a toy model (i.e. a model defined via a small amount of code which can be copy and pasted)? Without your model there is no way for others to reproduce the issue.
I'd also recommend narrowing it down to a single pass, if you can.
Is this still an issue on macOS 26?
I'll work on a minimal example. As for a single pass: adding common::add_fp16_cast causes it to hang.
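For anyone repeating this narrowing-down step, a rough sketch of the bisection: start from the known-good minimal pipeline and append one candidate pass at a time. convert_once is a hypothetical stub standing in for the ct.convert call from the repro above.

# Sketch: append one candidate pass at a time to the known-good minimal
# pipeline to find which pass triggers the hang.
import coremltools as ct

BASE_PASSES = [
    "common::const_elimination",
    "common::const_deduplication",
    "common::remove_symbolic_reshape",
    "common::noop_elimination",
    "common::merge_consecutive_relus",
    "common::merge_consecutive_reshapes",
    "common::merge_consecutive_transposes",
    "common::dedup_op_and_var_names",
]
CANDIDATES = ["common::add_fp16_cast", "common::add_int16_cast", "common::update_output_dtypes"]

def convert_once(pipeline):
    # hypothetical: run the ct.convert call from the repro above with `pipeline`
    ...

for candidate in CANDIDATES:
    pipeline = ct.PassPipeline(
        pass_names=BASE_PASSES + [candidate, "common::dead_code_elimination"],
        pipeline_name="bisect_pipeline",
    )
    print(f"trying {candidate} ...")
    convert_once(pipeline)  # hangs when candidate is common::add_fp16_cast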
I just upgraded to macOS 26 and the same problem is still present.
@TobyRoseman here is a standalone example that reproduces the problem on my machine:
import torch
import numpy as np
import coremltools as ct

class Model(torch.nn.Module):
    def __init__(self):
        super().__init__()

    def forward(self, a, b):
        return torch.einsum("bchw,bkchw->bhw", a, b)
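# Note: this einsum contracts over both k and c, i.e. it is equivalent
# (a sketch; an assumption, not tested as a workaround for the hang) to:
#     (a.unsqueeze(1) * b).sum(dim=(1, 2))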
def convert_to_coreml_with_consistent_results(model):
    pipeline = ct.PassPipeline(
        pass_names=[
            "common::const_elimination",
            "common::const_deduplication",
            "common::remove_symbolic_reshape",
            "common::noop_elimination",
            "common::merge_consecutive_relus",
            "common::merge_consecutive_reshapes",
            "common::merge_consecutive_transposes",
            "common::dedup_op_and_var_names",
            "common::dead_code_elimination",  # always end with dce
        ],
        pipeline_name="minimal_pipeline",
    )
    ml_model = ct.convert(
        model,
        inputs=[
            ct.TensorType(name="a", shape=(1, 16, 240, 320), dtype=np.float32),
            ct.TensorType(name="b", shape=(1, 4, 16, 240, 320), dtype=np.float32),
        ],
        outputs=[
            ct.TensorType(name="result_bhw"),
        ],
        debug=False,
        compute_units=ct.ComputeUnit.ALL,
        compute_precision=ct.precision.FLOAT32,
        minimum_deployment_target=ct.target.iOS17,
        pass_pipeline=pipeline,
    )
    ml_model.save("ConsistentResults.mlpackage")

def convert_to_coreml_hangs(model):
    pipeline = ct.PassPipeline(
        pass_names=[
            "common::const_elimination",
            "common::const_deduplication",
            "common::remove_symbolic_reshape",
            "common::noop_elimination",
            "common::merge_consecutive_relus",
            "common::merge_consecutive_reshapes",
            "common::merge_consecutive_transposes",
            "common::dedup_op_and_var_names",
            "common::add_fp16_cast",
            "common::add_int16_cast",
            "common::update_output_dtypes",
            "common::dead_code_elimination",
        ],
        pipeline_name="minimal_pipeline",
    )
    ml_model = ct.convert(
        model,
        inputs=[
            ct.TensorType(name="a", shape=(1, 16, 240, 320), dtype=np.float32),
            ct.TensorType(name="b", shape=(1, 4, 16, 240, 320), dtype=np.float32),
        ],
        outputs=[
            ct.TensorType(name="result_bhw"),
        ],
        compute_units=ct.ComputeUnit.ALL,
        compute_precision=ct.precision.FLOAT16,
        minimum_deployment_target=ct.target.iOS17,
        pass_pipeline=pipeline,
    )
    # we never get here, but just in case
    ml_model.save("Hangs.mlpackage")

if __name__ == "__main__":
    model = Model()
    example_inputs = (torch.randn(1, 16, 240, 320), torch.randn(1, 4, 16, 240, 320))
    traced_model = torch.export.export(model, example_inputs, dynamic_shapes=None, strict=True)
    convert_to_coreml_with_consistent_results(traced_model)
    convert_to_coreml_hangs(traced_model)
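One more data point that may help triage (a sketch, assuming the traced_model from the script above): converting with skip_model_load=True, an existing ct.convert option, runs the same pass pipeline but skips the Core ML compile/load step, so a quick return here would point at ANECompiler rather than the MIL passes.

# Diagnostic sketch (assumption: the hang is in ANECompiler at model load
# time, per the process sample). skip_model_load=True runs the same pass
# pipeline but skips compiling/loading the model.
import numpy as np
import coremltools as ct

ml_model = ct.convert(
    traced_model,  # the torch.export program from the script above
    inputs=[
        ct.TensorType(name="a", shape=(1, 16, 240, 320), dtype=np.float32),
        ct.TensorType(name="b", shape=(1, 4, 16, 240, 320), dtype=np.float32),
    ],
    outputs=[ct.TensorType(name="result_bhw")],
    compute_units=ct.ComputeUnit.ALL,
    compute_precision=ct.precision.FLOAT16,
    minimum_deployment_target=ct.target.iOS17,
    skip_model_load=True,  # skip Core ML compilation/load
)
ml_model.save("Hangs_skip_load.mlpackage")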
This issue may also be linked to the Apple Feedback report I submitted; the reason we are using this particular mix of passes is that the default pipeline produces inconsistent results across iPhone models.
https://feedbackassistant.apple.com/feedback/21000989
@TobyRoseman is this an issue with coremltools or with Core ML itself?