
CoreML Partitioner is not able to lower mv3

Open mergennachin opened this issue 7 months ago • 4 comments

🐛 Describe the bug

    import torch
    from torchvision import models

    from executorch.backends.apple.coreml.partition import CoreMLPartitioner
    from executorch.exir import to_edge_transform_and_lower

    model = models.mobilenet_v3_small(weights="DEFAULT").eval()
    sample_inputs = (torch.randn(1, 3, 224, 224),)

    et_program_coreml = to_edge_transform_and_lower(
        torch.export.export(model, sample_inputs),
        partitioner=[CoreMLPartitioner()],
    ).to_executorch()

    with open("mv3_coreml_all.pte", "wb") as file:
        et_program_coreml.write_to_file(file)

Even though this generates a file, it spews many errors during export, and it crashes at runtime.

https://gist.github.com/mergennachin/74ca8ef593bc6c962d8d1baacaede2ed

On the other hand,

python3 -m executorch.examples.apple.coreml.scripts.export --model_name=mv3

works fine, because it applies many layers of patches to make CoreML work.

Versions

executorch==0.6.0

cc @kimishpatel @YifanShenSZ @cymbalrush @metascroy @byjlw

mergennachin avatar Apr 24 '25 21:04 mergennachin

Do we have something close to this in CI? Like a quantizer variant perhaps?

digantdesai avatar Apr 24 '25 22:04 digantdesai

@digantdesai @Gasoonjia

@shoumikhin had to disable dim order https://github.com/pytorch-labs/executorch-examples/pull/23/files when exporting

mergennachin avatar Apr 25 '25 15:04 mergennachin

I took a closer look.

When dim order is enabled (now the default), this model has “executorch.exir.dialects.edge._ops.dim_order_ops._to_dim_order_copy.default” ops that return floats, and this op is not recognized by CoreML (https://github.com/apple/coremltools/blob/main/coremltools/converters/mil/frontend/torch/ops.py), so the partitioner skips them. But this results in a delegate call that passes in a bunch of floats as inputs.
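The skip decision described above boils down to an op-name lookup against the converter's registry. Here is a minimal, self-contained sketch of that idea; the op set and the `partition` function are illustrative only, not the actual ExecuTorch/CoreML partitioner API:

```python
# Hypothetical set of ops the CoreML converter recognizes. Note that the
# dim-order variant of _to_copy is deliberately absent, mirroring the bug.
SUPPORTED_OPS = {
    "aten._to_copy.default",
    "aten.convolution.default",
    "aten.hardswish.default",
}

def partition(graph_ops):
    """Split a flat list of op names into delegated and skipped ops."""
    delegated, skipped = [], []
    for op in graph_ops:
        (delegated if op in SUPPORTED_OPS else skipped).append(op)
    return delegated, skipped

delegated, skipped = partition([
    "aten.convolution.default",
    "dim_order_ops._to_dim_order_copy.default",  # not in the registry -> skipped
    "aten.hardswish.default",
])
```

Because the skipped op stays outside the delegate, its float outputs become extra inputs at the delegate boundary, which is where the runtime mismatch below comes from.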

coremltools wraps these floats in Rank1 tensors at compile time, but to ExecuTorch they are still floats. At runtime, ExecuTorch forwards these floats to the CoreML delegate, but the model complains it hasn't received the wrapped tensor inputs (which results in an error).

When dim order is disabled, we see “executorch.exir.dialects.edge._ops.aten._to_copy.default” instead in the graph and this is supported by CoreML and grabbed by the partitioner. So the delegate only has a tensor input and things work fine.

Disabling dim order for this model is a short-term fix. Longer term, we should 1) add support for _to_dim_order_copy to coremltools, and 2) handle scalars in ET CoreML’s runtime the same way coremltools handles them at compile time (i.e., wrap them in rank-1 tensors). Either fix alone would solve the problem for this model, but we should probably do both.
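Fix (2) amounts to promoting bare scalars to rank-1 containers at the delegate boundary, matching what coremltools does at compile time. A minimal sketch of that idea (plain Python lists stand in for tensors; this is not ExecuTorch's real runtime code):

```python
def wrap_scalar_inputs(inputs):
    """Wrap bare Python ints/floats as single-element lists (rank-1
    'tensors'); pass through anything already tensor-like (here: a list).
    Illustrative only -- the real runtime would build rank-1 MLMultiArrays."""
    return [[x] if isinstance(x, (int, float)) else x for x in inputs]

# A delegate call with one tensor input and two loose scalar inputs:
raw_inputs = [[1.0, 2.0, 3.0], 0.5, 2]
prepared = wrap_scalar_inputs(raw_inputs)
```

After wrapping, every input presented to the delegate is rank-1 or higher, which is what the compiled CoreML model expects.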

cc @YifanShenSZ @cymbalrush

metascroy avatar Apr 25 '25 19:04 metascroy

> @digantdesai @Gasoonjia
>
> @shoumikhin had to disable dim order https://github.com/pytorch-labs/executorch-examples/pull/23/files when exporting

In terms of why CI did not catch this when dim order was enabled by default: it does not look like we use the partitioner in our test_models.sh script; we use the older to_backend API instead. To use the partitioner, this arg needs to be set: https://github.com/pytorch/executorch/blob/main/examples/apple/coreml/scripts/export.py#L76

metascroy avatar Apr 25 '25 20:04 metascroy

> 1. add support for _to_dim_order_copy to coremltools

Yep

digantdesai avatar Apr 29 '25 19:04 digantdesai

> > 1. add support for _to_dim_order_copy to coremltools
>
> Yep

But that doesn't support the dim order op, so the partitioner will still skip it.

metascroy avatar Apr 29 '25 21:04 metascroy