M1 PyTorch converter issue

Open markovka17 opened this issue 4 years ago • 10 comments

🐞Describe the bug

The PyTorch converter fails when the shape of the input tensor exceeds some threshold. For example, with an input of shape [1, 20_000, 2] the conversion does not work, but with shape [1, 10_000, 2] everything works.

Trace

RuntimeError: {
    NSLocalizedDescription = "Error in declaring network.";
}

To Reproduce

import torch
from torch import nn
import coremltools

class Model(nn.Module):
    def forward(self, x):
        # x.shape == [B, T, 2]
        
        x_1, x_2 = x[..., 0], x[..., 1]
        new_x = torch.stack((x_1 * torch.cos(x_1), x_2 * torch.sin(x_2)), dim=-1)
        return new_x

model = Model()
x = torch.randn(1, 20000, 2)
ts_model = torch.jit.trace(model, (x, ))

coreml_input = [
    coremltools.TensorType(
        name='x',
        dtype=x.numpy().dtype,
        shape=x.numpy().shape
    ),
]

coreml_model = coremltools.convert(
    ts_model,
    inputs=coreml_input
)
output = coreml_model.predict({'x': x.numpy()})
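Since the report only brackets the failure between T = 10_000 and T = 20_000, one way to narrow it down is a bisection over T. The sketch below is not part of the original report: `first_failing_length` and `converts_and_predicts` are hypothetical helper names, and the search assumes that failures are monotonic in T, which the issue does not confirm.

```python
from typing import Callable


def first_failing_length(works: Callable[[int], bool], lo: int, hi: int) -> int:
    """Return the smallest T in (lo, hi] for which works(T) is False.

    Assumes works(lo) is True, works(hi) is False, and that every length
    above the first failing one also fails (an unverified assumption).
    """
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if works(mid):
            lo = mid  # still succeeds, so the threshold is above mid
        else:
            hi = mid  # fails, so the threshold is at or below mid
    return hi


def converts_and_predicts(t: int) -> bool:
    """Hypothetical predicate: run the reproduction above with sequence length t."""
    # Imported lazily so the bisection helper can be used without these packages.
    import torch
    import coremltools

    class Model(torch.nn.Module):
        def forward(self, x):
            x_1, x_2 = x[..., 0], x[..., 1]
            return torch.stack((x_1 * torch.cos(x_1), x_2 * torch.sin(x_2)), dim=-1)

    x = torch.randn(1, t, 2)
    ts_model = torch.jit.trace(Model(), (x,))
    try:
        mlmodel = coremltools.convert(
            ts_model,
            inputs=[coremltools.TensorType(name="x", shape=x.numpy().shape)],
        )
        mlmodel.predict({"x": x.numpy()})
    except RuntimeError:
        # The trace in this report surfaces as a RuntimeError.
        return False
    return True
```

On an affected machine, `first_failing_length(converts_and_predicts, 10_000, 20_000)` would report the first length at which the reproduction breaks, which may help when filing or triaging a similar report.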

System environment (please complete the following information):

  • coremltools version: 4.0
  • OS: MacOS
  • macOS version: 10.15.7
  • How you install python: anaconda
  • python version: 3.7
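Because the behaviour in this thread depends on the macOS version and CPU architecture, it can help to collect these details programmatically. The following is a small sketch using only the Python standard library (Python 3.8 or later); `environment_report` is a hypothetical helper name, not part of coremltools.

```python
import platform
import sys
from importlib import metadata


def environment_report(packages=("coremltools", "torch")) -> dict:
    """Collect interpreter, architecture, OS, and package versions."""
    report = {
        "python": sys.version.split()[0],
        # Architecture the interpreter is running as (e.g. arm64 or x86_64).
        "arch": platform.machine(),
        # mac_ver() returns an empty release string on non-macOS systems.
        "macos": platform.mac_ver()[0] or "n/a",
    }
    for name in packages:
        try:
            report[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "not installed"
    return report


if __name__ == "__main__":
    for key, value in environment_report().items():
        print(f"{key}: {value}")
```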

markovka17 avatar Feb 12 '21 12:02 markovka17

This works for me on macOS 11.3.

TobyRoseman avatar Feb 20 '21 00:02 TobyRoseman

Since this is now working on the current macOS, I'm going to close this issue.

TobyRoseman avatar Oct 22 '21 19:10 TobyRoseman

Quoting @TobyRoseman: "Since this is now working on the current macOS, I'm going to close this issue."

@TobyRoseman Not sure why this issue was closed. This appears to be a bug in coremltools, and it happens a lot with some networks, especially transformer-based ones.

hezhangsprinter avatar Dec 08 '21 21:12 hezhangsprinter

@hezhangsprinter - As I said, the original code now works on macOS 11.3. What OS are you using? Can you give me steps to reproduce the issue you are having?

TobyRoseman avatar Dec 08 '21 21:12 TobyRoseman

I can still reproduce this issue for tensor slicing, with both torch.jit.script and torch.jit.trace. Could you reopen it?

import torch
import coremltools


class Model(torch.nn.Module):
    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x[:512]


model = Model()
x0 = torch.randn(512, dtype=torch.float32)
# ts_script = torch.jit.script(model)
ts_model = torch.jit.trace(model, [x0])

coreml_input = [
    coremltools.TensorType(
        name='inp_0',
        dtype=x0.numpy().dtype,
        shape=coremltools.Shape([
            coremltools.RangeDim(1, 512)])),
]

coreml_model = coremltools.convert(
    ts_model,
    inputs=coreml_input
)

kpoeppel avatar Mar 07 '22 10:03 kpoeppel

@kpoeppel - The code you shared works fine for me. Are you on macOS 12? Are you using the most recent coremltools?

TobyRoseman avatar Mar 07 '22 22:03 TobyRoseman

I got the error with all combinations of the following versions:

  • coremltools: 5.2.0
  • torch: 1.9.1, 1.10.2
  • arch: arm64, x86_64 (Rosetta)
  • Python: 3.9.6, 3.8.9
  • Machine: ARM64 (M1), macOS 12.2.1

kpoeppel avatar Mar 08 '22 08:03 kpoeppel

I can reproduce this on an M1 machine.

TobyRoseman avatar Mar 08 '22 22:03 TobyRoseman

This looks like an issue with the CoreML Framework. I've created an internal issue.

TobyRoseman avatar Mar 08 '22 23:03 TobyRoseman

Work on the internal issue is ongoing.

The original code to reproduce the issue (posted by @markovka17) now works on an M1 machine with macOS 12.3.

The second code to reproduce the issue (posted by @kpoeppel) is still failing. However, there is a workaround: use default=512 when creating the coremltools.RangeDim.
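A minimal sketch of that workaround, applied to the input specification from @kpoeppel's example. It is an illustration rather than code posted in this thread: `make_range_input` is a hypothetical helper name, and the dtype is fixed to float32 to match the example's `x0`.

```python
def make_range_input(name: str = "inp_0", upper: int = 512):
    """Build a flexible-length input spec whose default length equals the upper bound."""
    # Imported lazily so that defining this helper does not require the packages.
    import numpy as np
    import coremltools

    return [
        coremltools.TensorType(
            name=name,
            dtype=np.float32,
            shape=coremltools.Shape(
                [
                    # The workaround: set `default` explicitly instead of
                    # leaving it unset on the RangeDim.
                    coremltools.RangeDim(lower_bound=1, upper_bound=upper, default=upper)
                ]
            ),
        )
    ]
```

The result can be passed as `inputs=make_range_input()` to `coremltools.convert` in place of `coreml_input` in the second reproduction.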

TobyRoseman avatar Jul 13 '22 18:07 TobyRoseman

FWIW, the second code sample posted by @kpoeppel works for me; I can get a result by calling:

coreml_model.predict({'inp_0': x0.numpy()})

I am on M1 macOS 13.3.1, Python 3.10, coremltools==6.3.0 and torch==2.0.0

anentropic avatar May 12 '23 18:05 anentropic

@anentropic - Thanks for the information. You are correct; this has now been fixed.

TobyRoseman avatar May 12 '23 22:05 TobyRoseman