coremltools M1 PyTorch converter issue

🐞Describe the bug

The PyTorch converter fails when the shape of the input tensor exceeds some threshold. For example, if the shape is [1, 20_000, 2], the converter doesn't work, but if you change the shape to [1, 10_000, 2], everything works.

Trace

RuntimeError: {
    NSLocalizedDescription = "Error in declaring network.";
}

To Reproduce

import torch
from torch import nn
import coremltools

class Model(nn.Module):
    def forward(self, x):
        # x.shape == [B, T, 2]
        
        x_1, x_2 = x[..., 0], x[..., 1]
        new_x = torch.stack((x_1 * torch.cos(x_1), x_2 * torch.sin(x_2)), dim=-1)
        return new_x

model = Model()
x = torch.randn(1, 20000, 2)
ts_model = torch.jit.trace(model, (x, ))

coreml_input = [
    coremltools.TensorType(
        name='x',
        dtype=x.numpy().dtype,
        shape=x.numpy().shape
    ),
]

coreml_model = coremltools.convert(
    ts_model,
    inputs=coreml_input
)
output = coreml_model.predict({'x': x.numpy()})

System environment (please complete the following information):

coremltools version: 4.0
OS: MacOS
macOS version: 10.15.7
How you install python: anaconda
python version: 3.7

Feb 12 '21 12:02 markovka17

This works for me on macOS 11.3.

Feb 20 '21 00:02 TobyRoseman

Since this is now working on the current macOS, I'm going to close this issue.

Oct 22 '21 19:10 TobyRoseman

> Since this is now working on the current macOS, I'm going to close this issue.

@TobyRoseman Not sure why this issue is closed. It looks like a bug in coremltools, and it happens a lot on some networks, especially transformer-based networks.

Dec 08 '21 21:12 hezhangsprinter

@hezhangsprinter - As I said, the original code now works on macOS 11.3. What OS are you using? Can you give me steps to reproduce the issue you are having?

Dec 08 '21 21:12 TobyRoseman

I can still reproduce this issue for tensor slicing, using either torch.jit.script or torch.jit.trace. Could you reopen it?

import torch
import coremltools


class Model(torch.nn.Module):
    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x[:512]


model = Model()
x0 = torch.randn(512, dtype=torch.float32)
# ts_script = torch.jit.script(model)
ts_model = torch.jit.trace(model, [x0])

coreml_input = [
    coremltools.TensorType(
        name='inp_0',
        dtype=x0.numpy().dtype,
        shape=coremltools.Shape([
            coremltools.RangeDim(1, 512)])),
]

coreml_model = coremltools.convert(
    ts_model,
    inputs=coreml_input
)

Mar 07 '22 10:03 kpoeppel

@kpoeppel - The code you shared works fine for me. Are you on macOS 12? Are you using the most recent coremltools?

Mar 07 '22 22:03 TobyRoseman

I got the error on:

coremltools version: 5.2.0
torch version: 1.9.1, 1.10.2
arch: arm64, x86_64 (Rosetta)
python version: 3.9.6, 3.8.9
Machine: ARM64 (M1)
macOS version: 12.2.1

The error occurs with all of the above version combinations.
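Since the outcome in this thread depends on the exact OS, architecture, and package versions, a small script can gather them in one go. This is only a sketch: the helper name collect_environment is hypothetical, and it reads installed package versions through the standard-library importlib.metadata (Python 3.8+) rather than importing the packages. A Python interpreter running under Rosetta is expected to report x86_64 as its architecture.

```python
import platform
from importlib import metadata


def collect_environment(packages=("coremltools", "torch")):
    """Return a dict describing the interpreter, OS, and package versions.

    Hypothetical helper for bug reports; not part of coremltools.
    """
    info = {
        "python": platform.python_version(),
        # Architecture of the running interpreter (e.g. arm64 or x86_64).
        "arch": platform.machine(),
        # mac_ver() returns an empty release string on non-macOS systems.
        "macOS": platform.mac_ver()[0] or "n/a (not macOS)",
    }
    for name in packages:
        try:
            info[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            info[name] = "not installed"
    return info


for key, value in collect_environment().items():
    print(f"{key}: {value}")
```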

Mar 08 '22 08:03 kpoeppel

I can reproduce this on an M1 machine.

Mar 08 '22 22:03 TobyRoseman

This looks like an issue with the CoreML Framework. I've created an internal issue.

Mar 08 '22 23:03 TobyRoseman

Work on the internal issue is ongoing.

The original code to reproduce the issue (posted by @markovka17) now works on an M1 machine with macOS 12.3.

The second code to reproduce the issue (posted by @kpoeppel) is still failing. However, there is a workaround: use default=512 when creating the coremltools.RangeDim.
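A minimal sketch of that workaround, applied to the input spec from @kpoeppel's snippet. The helper names build_inputs and convert_with_default are hypothetical and only for illustration; the coremltools module is passed in (or imported lazily) so the helpers can be defined without coremltools installed. The dtype is left at the library default here, whereas the original snippet passed x0.numpy().dtype.

```python
def build_inputs(ct, upper_bound=512):
    """Build the input spec with an explicit default for the flexible dimension.

    `ct` is the already-imported coremltools module. Hypothetical helper.
    """
    # The workaround: give the RangeDim an explicit default size.
    flexible_dim = ct.RangeDim(
        lower_bound=1, upper_bound=upper_bound, default=upper_bound
    )
    return [ct.TensorType(name="inp_0", shape=ct.Shape(shape=[flexible_dim]))]


def convert_with_default(ts_model, upper_bound=512):
    """Convert a traced/scripted model using the input spec above."""
    import coremltools as ct  # imported lazily; requires coremltools

    return ct.convert(ts_model, inputs=build_inputs(ct, upper_bound))
```

With the traced model from the snippet above, this would be called as convert_with_default(ts_model).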

Jul 13 '22 18:07 TobyRoseman

FWIW, the second code posted by @kpoeppel works for me. I can get a result by calling:

coreml_model.predict({'inp_0': x0.numpy()})

I am on an M1 machine with macOS 13.3.1, Python 3.10, coremltools==6.3.0 and torch==2.0.0.

May 12 '23 18:05 anentropic

@anentropic - Thanks for the information. You are correct; this has now been fixed.

May 12 '23 22:05 TobyRoseman