PyTorch __ior__ op is not implemented for conversion

kamisori-daijin opened this issue 4 months ago • 5 comments

Description

Hello,

I encountered an issue while trying to convert a PyTorch model (gemma-3-1b-it) to Core ML. The conversion failed with the following error: PyTorch convert function for op '__ior__' not implemented.

I understand there was a recent fix related to an __ior__ issue in RangeDim. I have confirmed that I am using the latest version of coremltools by installing it directly from the main branch of this repository.
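
For context, __ior__ is the method Python's in-place OR operator (|=) lowers to. Below is a minimal, purely illustrative sketch of where such a node can come from during tracing (not taken from the gemma code; in the actual model the mask update presumably happens inside transformers' masking utilities):

import torch

class IorDemo(torch.nn.Module):
    def forward(self, a, b):
        mask = a > 0
        mask |= b > 0  # |= calls mask.__ior__(...), an in-place OR
        return mask

traced = torch.jit.trace(IorDemo(), (torch.zeros(4), torch.ones(4)))
# Depending on the torch version, the in-place OR may be recorded as
# aten::__ior__ or as aten::bitwise_or_ in the traced graph.
print(traced.graph)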

Steps to Reproduce

Install coremltools from the main branch: pip install git+https://github.com/apple/coremltools.git

Run the provided Python script with the gemma-3-1b-it model.

The conversion fails with the traceback shown below.

Environment

CoreMLTools Version: coremltools @ git+https://github.com/apple/coremltools.git@0f4244215c1f293f9b822b194fede05ad0e93851

PyTorch Version: 2.2.2

Python Version: 3.12.10

Additional Information

Here is the traceback I received:

Failed to load _MLModelProxy: No module named 'coremltools.libcoremlpython'
...
(Omitted - 'coremltools.libcoremlpython' related errors)
...
Fail to import BlobReader from libmilstoragepython. No module named 'coremltools.libmilstoragepython'
...
(Omitted - 'coremltools.libmilstoragepython' related errors)
...
Failed to load '_MLCPUComputeDeviceRemoteProxy'. Remote device functionality for retrieving the compute plan is unavailable.
...
(Omitted - 'RemoteProxy' related errors)
...
CoreMLTools Version: 9.0b1
PyTorch Version: 2.2.2
Numpy Version: 1.26.4
Loading Hugging Face model from '.../gemma-3-1b-it' into memory...
Model loaded and configured.
Wrapper model prepared.
Tracing wrapper model to TorchScript...
`loss_type=None` was set in the config but it is unrecognised.Using the default loss: `ForCausalLMLoss`.
/Users/.../transformers/masking_utils.py:190: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect.
...
(Omitted - TracerWarning)
...
TorchScript tracing complete.
Converting TorchScript model to Core ML...
Model is not in eval mode. Consider calling '.eval()' on your model prior to conversion
Converting PyTorch Frontend ==> MIL Ops:   0%| | 0/4901 [00:00<
Core ML embedding (gather) layer does not support any inputs besides the weights and indices. Those given will be ignored.
Converting PyTorch Frontend ==> MIL Ops:   0%| | 23/4901 [00:00

ERROR - converting '__ior__' op (located at: 'model/model/attention_mask.19'):

Converting PyTorch Frontend ==> MIL Ops:   1%| | 57/4901 [00:00
Conversion to CoreML failed: PyTorch convert function for op '__ior__' not implemented.

Here is my convert.py script:

import coremltools as ct
import numpy as np
import torch
import torch.nn as nn
from transformers import AutoTokenizer, AutoModelForCausalLM
import argparse
import os


def main():
    # Configure the command-line argument parser
    parser = argparse.ArgumentParser(description="Convert a Hugging Face model to Core ML.")
    parser.add_argument(
        "--model",
        type=str,
        required=True,
        help="Path to the downloaded Hugging Face model directory (e.g., 'gemma-3-1b-it')."
    )
    args = parser.parse_args()

    # Use the path received as a command line argument
    downloaded_hf_model_dir = args.model

    print(f"CoreMLTools Version: {ct.__version__}")
    print(f"PyTorch Version: {torch.__version__}")
    print(f"Numpy Version: {np.__version__}")

    try:
        # 1. Hugging Face model loading
        print(f"Loading Hugging Face model from '{downloaded_hf_model_dir}' into memory...")
        model = AutoModelForCausalLM.from_pretrained(downloaded_hf_model_dir, torch_dtype=torch.float16)
        model.eval()
        model.config.use_cache = False
        print("Model loaded and configured.")

        # 2. Create a wrapper model for Core ML conversion
        class GemmaCoreMLWrapper(nn.Module):
            def __init__(self, model):
                super().__init__()
                self.model = model
                self.model.config.use_cache = False

            def forward(self, input_ids, attention_mask):
                outputs = self.model(
                    input_ids=input_ids,
                    attention_mask=attention_mask,
                    use_cache=False,
                    return_dict=False,
                    output_attentions=False,
                    output_hidden_states=False
                )
                logits = outputs[0]
                return logits

        wrapped_model = GemmaCoreMLWrapper(model)
        print("Wrapper model prepared.")

        # 3. Prepare dummy inputs for TorchScript tracing
        max_seq_length = 1024
        tokenizer = AutoTokenizer.from_pretrained(downloaded_hf_model_dir)

        dummy_input_ids = torch.randint(0, tokenizer.vocab_size, (1, 10), dtype=torch.long)
        dummy_attention_mask = torch.ones(1, 10, dtype=torch.long)

        # 4. Trace the wrapper model to TorchScript
        print("Tracing wrapper model to TorchScript...")
        traced_model = torch.jit.trace(wrapped_model, (dummy_input_ids, dummy_attention_mask))
        print("TorchScript tracing complete.")

        # 5. Convert TorchScript model to Core ML
        print("Converting TorchScript model to Core ML...")
        coreml_model = ct.convert(
            traced_model,
            inputs=[
                ct.TensorType(name="input_ids", shape=(1, ct.RangeDim(upper_bound=max_seq_length)), dtype=np.int32),
                ct.TensorType(name="attention_mask", shape=(1, ct.RangeDim(upper_bound=max_seq_length)), dtype=np.int32)
            ],
            source="pytorch",
            convert_to="mlprogram",
            minimum_deployment_target=ct.target.iOS16
        )

        model_name = os.path.basename(downloaded_hf_model_dir)
        output_filename = f"{model_name}-coreml.mlpackage"
        coreml_model.save(output_filename)
        print(f"CoreML model saved successfully to {output_filename}.")

    except Exception as e:
        print(f"Conversion to CoreML failed: {e}")


if __name__ == "__main__":
    main()


How to Use convert.py

This script is designed to be run from the command line. Provide the path to the model you want to convert via the --model argument.

Basic command:

python convert.py --model "[path_to_your_model]"

Example

If your model is located at /path/your/directory/gemma-3-1b-it, the command would be:

python convert.py --model "/path/your/directory/gemma-3-1b-it"

Important Notes

Current Directory: Make sure you are in the project's root directory when you run the command.

kamisori-daijin commented Aug 10 '25

We would need an aten op to target here. What does torch tracing generate?

abhishek-patel6 commented Aug 11 '25

Thank you for the reply. I have tried to get the traced graph, but it does not contain the __ior__ op.

torch.jit.trace initially fails with a RuntimeError because the model's output is a dict (which I was able to work around by setting strict=False). However, even with a successful trace, the resulting graph does not show any aten::__ior__ operator. It seems that torch.jit.trace optimizes away the in-place operation.

The original conversion failure was explicitly due to __ior__. It appears that torch.jit.trace is not a suitable tool to capture this specific op.
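
Roughly, a graph like the one attached below can be reproduced with something along these lines (a sketch; shapes and paths differ from the script above, and here the raw Hugging Face model is traced directly with a single input_ids tensor):

with torch.no_grad():
    # strict=False tolerates the dict output of the Hugging Face model
    traced = torch.jit.trace(model, (dummy_input_ids,), strict=False)
print(traced.graph)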

I've attached the full graph output from the successful trace below.

--- Traced Model Graph ---
`loss_type=None` was set in the config but it is unrecognised.Using the default loss: `ForCausalLMLoss`.
.../lib/python3.12/site-packages/transformers/masking_utils.py:720: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  if batch_size != position_ids.shape[0]:
.../lib/python3.12/site-packages/transformers/integrations/sdpa_attention.py:82: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  is_causal = query.shape[2] > 1 and attention_mask is None and getattr(module, "is_causal", True)
graph(%self.1 : __torch__.transformers.models.gemma3.modeling_gemma3.Gemma3ForCausalLM,
      %input_ids : Long(1, 8, strides=[8, 1], requires_grad=0, device=cpu)):
  %lm_head : __torch__.torch.nn.modules.linear.___torch_mangle_438.Linear = prim::GetAttr[name="lm_head"](%self.1)
  %model : __torch__.transformers.models.gemma3.modeling_gemma3.Gemma3TextModel = prim::GetAttr[name="model"](%self.1)
  %model.1 : __torch__.transformers.models.gemma3.modeling_gemma3.Gemma3TextModel = prim::GetAttr[name="model"](%self.1)
  %embed_tokens.1 : __torch__.transformers.models.gemma3.modeling_gemma3.Gemma3TextScaledWordEmbedding = prim::GetAttr[name="embed_tokens"](%model.1)
  %weight.3 : Tensor = prim::GetAttr[name="weight"](%embed_tokens.1)
  %16733 : Tensor = prim::CallMethod[name="forward"](%model, %input_ids)
  %14727 : int = prim::Constant[value=0]() # .../lib/python3.12/site-packages/transformers/models/gemma3/modeling_gemma3.py:665:0
  %14728 : int = prim::Constant[value=0]() # .../lib/python3.12/site-packages/transformers/models/gemma3/modeling_gemma3.py:665:0
  %14729 : int = prim::Constant[value=9223372036854775807]() # .../lib/python3.12/site-packages/transformers/models/gemma3/modeling_gemma3.py:665:0
  %14730 : int = prim::Constant[value=1]() # .../lib/python3.12/site-packages/transformers/models/gemma3/modeling_gemma3.py:665:0
  %14731 : Float(1, 8, 1152, strides=[9216, 1152, 1], requires_grad=1, device=cpu) = aten::slice(%16733, %14727, %14728, %14729, %14730) # .../lib/python3.12/site-packages/transformers/models/gemma3/modeling_gemma3.py:665:0
  %14732 : int = prim::Constant[value=1]() # .../lib/python3.12/site-packages/transformers/models/gemma3/modeling_gemma3.py:665:0
  %14733 : int = prim::Constant[value=0]() # .../lib/python3.12/site-packages/transformers/models/gemma3/modeling_gemma3.py:665:0
  %14734 : int = prim::Constant[value=9223372036854775807]() # .../lib/python3.12/site-packages/transformers/models/gemma3/modeling_gemma3.py:665:0
  %14735 : int = prim::Constant[value=1]() # .../lib/python3.12/site-packages/transformers/models/gemma3/modeling_gemma3.py:665:0
  %14736 : Float(1, 8, 1152, strides=[9216, 1152, 1], requires_grad=1, device=cpu) = aten::slice(%14731, %14732, %14733, %14734, %14735) # .../lib/python3.12/site-packages/transformers/models/gemma3/modeling_gemma3.py:665:0
  %14737 : int = prim::Constant[value=2]() # .../lib/python3.12/site-packages/transformers/models/gemma3/modeling_gemma3.py:665:0
  %14738 : int = prim::Constant[value=0]() # .../lib/python3.12/site-packages/transformers/models/gemma3/modeling_gemma3.py:665:0
  %14739 : int = prim::Constant[value=9223372036854775807]() # .../lib/python3.12/site-packages/transformers/models/gemma3/modeling_gemma3.py:665:0
  %14740 : int = prim::Constant[value=1]() # .../lib/python3.12/site-packages/transformers/models/gemma3/modeling_gemma3.py:665:0
  %input : Float(1, 8, 1152, strides=[9216, 1152, 1], requires_grad=1, device=cpu) = aten::slice(%14736, %14737, %14738, %14739, %14740) # .../lib/python3.12/site-packages/transformers/models/gemma3/modeling_gemma3.py:665:0
  %16734 : Tensor = prim::CallMethod[name="forward"](%lm_head, %weight.3, %input)
  %14744 : str = prim::Constant[value="logits"]() # .../lib/python3.12/site-packages/torch/jit/_trace.py:1074:0
  %14745 : Dict(str, Tensor) = prim::DictConstruct(%14744, %16734)
  return (%14745)
--- End of Traced Model Graph ---

I am not sure how to provide the aten op for __ior__ since it is not visible in the traced graph.

kamisori-daijin commented Aug 12 '25

We don't convert the PyTorch traced graph directly. We convert a lowered and optimized version of it.

See: https://github.com/apple/coremltools/blob/be33582e2fd62885f15d22338cac9f777ad8119f/coremltools/converters/mil/frontend/torch/internal_graph.py#L503
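
Roughly speaking, ops hidden behind the prim::CallMethod nodes in the trace above become visible once the graph is inlined. Here is a sketch (untested against this particular model) of how to check for the op from Python, using the traced_model from the script above:

# inlined_graph expands prim::CallMethod calls into their bodies, which is
# closer to what the converter actually walks than the top-level trace.
for line in str(traced_model.inlined_graph).splitlines():
    if "ior" in line:
        print(line)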

TobyRoseman commented Aug 12 '25

Any suggestions on how to work around the issue with __ior__?

carcharodon-maximus commented Oct 03 '25

You could consider splitting the model into smaller parts, but I don't think there's a reliable workaround I can recommend at this time.
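
If you want to experiment anyway, one untested avenue is to register a custom conversion for the op before calling ct.convert, treating the in-place OR as a plain logical OR. This is only a sketch: the import paths and registry behaviour can differ across coremltools versions, and whether an out-of-place OR is semantically safe for this model's attention-mask update has not been verified.

from coremltools.converters.mil import Builder as mb
from coremltools.converters.mil.frontend.torch.ops import _get_inputs
from coremltools.converters.mil.frontend.torch.torch_op_registry import register_torch_op

@register_torch_op(torch_alias=["__ior__"])
def ior_as_logical_or(context, node):
    # Map the in-place OR onto an out-of-place logical OR.
    # Inputs may need casting to bool depending on the mask dtype.
    x, y = _get_inputs(context, node, expected=2)
    context.add(mb.logical_or(x=x, y=y, name=node.name))

Alternatively, patching the masking code to use an out-of-place mask = mask | other would avoid the op entirely, at the cost of modifying the transformers source.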

kamisori-daijin commented Oct 04 '25