coremltools
Can't figure out NSLocalizedDescription = "Error in declaring network.";
I have two PyTorch models, one text Transformer and one image Transformer. For both models I used ct.TensorType as input.
Both models were converted successfully on the following platform:
- Ubuntu 22.04 LTS
- Python 3.9.13
- PyTorch 1.10.1
- coremltools 6.0b2
Now I am trying to load the Core ML models on a MacBook Pro 2021 M1:
- macOS 12.6
- Python 3.9.13
- PyTorch 1.12.1
- coremltools 6.0b2
The Text Transformer works fine and produces identical results to its PyTorch counterpart. The Image Transformer gives me the NSLocalizedDescription = "Error in declaring network."; error on load, and I can't figure out what the issue is.
I know that PyTorch 1.12.1 is not supported, but I am only running inference with the already converted model, so PyTorch is not being used in this case.
All tensors are less than rank 5.
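The error appears at load time, before any prediction is made. A minimal sketch of how I load the model (the file name, input name, and shape are placeholders, not my exact values):

import coremltools as ct
import numpy as np

# Loading the converted image model is the step that fails with
# "Error in declaring network."; names and shape below are placeholders.
model = ct.models.MLModel("ImageEncoder.mlpackage")
example_image = np.random.rand(1, 3, 224, 224).astype(np.float32)  # assumed shape
output = model.predict({"image": example_image})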
This is the Image Transformer model:
def forward(self, x: torch.Tensor):
    x = self.conv1(x)  # shape = [*, width, grid, grid]
    x = x.reshape(x.shape[0], x.shape[1], -1)  # shape = [*, width, grid ** 2]
    x = x.permute(0, 2, 1)  # shape = [*, grid ** 2, width]
    x = torch.cat(
        [self.class_embedding.to(x.dtype)
         + torch.zeros(x.shape[0], 1, x.shape[-1], dtype=x.dtype, device=x.device),
         x], dim=1)  # shape = [*, grid ** 2 + 1, width]
    x = x + self.positional_embedding.to(x.dtype)
    x = self.ln_pre(x)
    x = x.permute(1, 0, 2)  # NLD -> LND
    x = self.transformer(x)
    x = x.permute(1, 0, 2)  # LND -> NLD
    x = self.ln_post(x[:, 0, :])
    if self.proj is not None:
        x = x @ self.proj
    return x
where self.transformer is this:
class Transformer(nn.Module):
    def __init__(self, width: int, layers: int, heads: int, mlp_ratio: float = 4.0, act_layer: Callable = nn.GELU):
        super().__init__()
        self.width = width
        self.layers = layers
        self.grad_checkpointing = False
        self.resblocks = nn.ModuleList([
            ResidualAttentionBlock(width, heads, mlp_ratio, act_layer=act_layer)
            for _ in range(layers)
        ])

    def forward(self, x: torch.Tensor, attn_mask: Optional[torch.Tensor] = None):
        for r in self.resblocks:
            if self.grad_checkpointing and not torch.jit.is_scripting():
                x = checkpoint(r, x, attn_mask)
            else:
                x = r(x, attn_mask=attn_mask)
        return x
There is also a big difference in execution time between the PyTorch and Core ML text models:
PyTorch: 0.27843 seconds, Core ML: 7.80541 seconds
We just released coremltools 6.0. This version does support PyTorch 1.12.1.
Are you converting to neuralnetwork or mlprogram? If you're converting to neuralnetwork or not specifying that value, try using mlprogram, i.e. add convert_to='mlprogram' to your coremltools.convert call.
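For example, a minimal conversion call looks like this (traced_model and example_input are placeholders for your own traced module and example tensor):

import coremltools as ct

# Convert to the ML Program backend rather than the older neuralnetwork one.
mlmodel = ct.convert(
    traced_model,
    convert_to="mlprogram",
    inputs=[ct.TensorType(shape=example_input.shape, name="input")],
)
mlmodel.save("Model.mlpackage")  # ML Programs are saved as .mlpackage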
If you're still getting this error when using both coremltools 6.0 and mlprogram, please share complete details to reproduce the issue.
There is also a big difference in execution time between the PyTorch and Core ML text models:
PyTorch: 0.27843 seconds, Core ML: 7.80541 seconds
Are you measuring performance on the same machine with the same device (i.e. CPU vs GPU)? If you think there is a problem here, please open a new issue. Please include complete steps to reproduce the conversion, as well as complete steps to get predictions from both models.
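For an apples-to-apples comparison, you could pin the Core ML model to the CPU and time both models on the same input; a rough sketch (the file name, input name, shape, and dtype are placeholders, not your actual values):

import time
import numpy as np
import torch
import coremltools as ct

# Load the converted model restricted to the CPU so it matches a
# CPU-only PyTorch run ("TextEncoder.mlpackage" is a placeholder path).
coreml_model = ct.models.MLModel("TextEncoder.mlpackage",
                                 compute_units=ct.ComputeUnit.CPU_ONLY)

tokens = np.random.randint(0, 1000, size=(1, 77))  # assumed token shape
start = time.perf_counter()
coreml_model.predict({"text": tokens.astype(np.int32)})  # dtype must match the converted input
print(f"Core ML: {time.perf_counter() - start:.5f} s")

with torch.no_grad():
    start = time.perf_counter()
    text_encoder(torch.from_numpy(tokens))  # `text_encoder` is your PyTorch module
    print(f"PyTorch: {time.perf_counter() - start:.5f} s")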
We just released coremltools 6.0. This version does support PyTorch 1.12.1. Are you converting to neuralnetwork or mlprogram? If you're converting to neuralnetwork or not specifying that value, try using mlprogram, i.e. add convert_to='mlprogram' to your coremltools.convert call. If you're still getting this error when using both coremltools 6.0 and mlprogram, please share complete details to reproduce the issue.
I was converting to mlprogram. Let me try coremltools 6.0 and get back to you.
Are you measuring performance on the same machine with the same device (i.e. CPU vs GPU)? If you think there is a problem here, please open a new issue. Please include complete steps to reproduce the conversion, as well as complete steps to get predictions from both models.
Yes, I am measuring the performance on the same machine, a MacBook Pro M1. I'll check if anything changes with the new coremltools.
This is the result after using coremltools 6.0:
Machine running the conversion: MacBook Pro 2021 M1
- macOS 12.6
- Python 3.9.13
- PyTorch 1.12.1
- coremltools 6.0
Conversion commands:

# Image Transformer
image_encoder_model = ct.convert(
    traced_image_encoder,
    convert_to="mlprogram",
    compute_precision=ct.precision.FLOAT32,
    inputs=[ct.TensorType(shape=example_image.shape, name="image")]
)

# Text Transformer
text_encoder_model = ct.convert(
    traced_text_encoder,
    convert_to="mlprogram",
    compute_precision=ct.precision.FLOAT32,
    inputs=[ct.TensorType(shape=example_text.shape, name="text")]
)
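For reference, the traced modules above were produced with torch.jit.trace, roughly like this (the shapes shown are illustrative, not necessarily the exact values I used):

import torch

# Illustrative tracing step; `image_encoder` and `text_encoder` are the
# PyTorch modules described earlier, and the shapes are placeholders.
example_image = torch.rand(1, 3, 224, 224)
example_text = torch.randint(0, 1000, (1, 77))
traced_image_encoder = torch.jit.trace(image_encoder.eval(), example_image)
traced_text_encoder = torch.jit.trace(text_encoder.eval(), example_text)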
Output of the conversion process for the Image Transformer:
Converting PyTorch Frontend ==> MIL Ops: 100%|██████████████████████████████████████████████████████▉| 2035/2036 [00:00<00:00, 4953.12 ops/s]
Running MIL Common passes: 5%|███▊ | 2/38 [00:00<00:04, 7.33 passes/s]/Users/marko/miniconda3/envs/pytorch_arm/lib/python3.9/site-packages/coremltools/converters/mil/mil/passes/name_sanitization_utils.py:129: UserWarning: Output, '2818', of the source model, has been renamed to 'var_2818' in the Core ML model.
warnings.warn(msg.format(var.name, new_name))
Running MIL Common passes: 100%|████████████████████████████████████████████████████████████████████████| 38/38 [00:01<00:00, 25.72 passes/s]
Running MIL Clean up passes: 100%|██████████████████████████████████████████████████████████████████████| 11/11 [00:01<00:00, 10.25 passes/s]
/Users/marko/miniconda3/envs/pytorch_arm/lib/python3.9/site-packages/coremltools/models/model.py:156: RuntimeWarning: You will not be able to run predict() on this Core ML model. Underlying exception message was: {
NSLocalizedDescription = "Error in declaring network.";
}
_warnings.warn(
Output of the conversion process for the Text Transformer:
Converting PyTorch Frontend ==> MIL Ops: 4%|██▍ | 66/1565 [00:00<00:00, 7362.73 ops/s]
Traceback (most recent call last):
File "/Users/marko/code-snippets/python/convert-openclip-coreml.py", line 70, in <module>
convert_openclip_coreml()
... #truncated error stack for brevity
File "/Users/marko/miniconda3/envs/pytorch_arm/lib/python3.9/site-packages/coremltools/converters/mil/frontend/torch/ops.py", line 92, in convert_nodes
raise RuntimeError(
RuntimeError: PyTorch convert function for op 'baddbmm' not implemented.
Summary:
- Image Transformer: the model was converted, but I got the same error message as before, just this time raised during/after the conversion process.
- Text Transformer: the model was not converted due to an unsupported PyTorch operation. However, coremltools 6.0b2 was able to convert this model before, so I suspect support for the baddbmm operation was removed in the stable 6.0 release.
We did not remove support for torch.baddbmm. We never supported it. We're already tracking that issue in #1555. It seems the new version of PyTorch uses this op more often than before.
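Until baddbmm is supported natively, one possible workaround is to register a composite lowering yourself. This is an untested sketch based on the definition of torch.baddbmm (out = beta * input + alpha * (batch1 @ batch2)); the imports reflect coremltools' internal module layout, which may change:

from coremltools.converters.mil import Builder as mb
from coremltools.converters.mil.frontend.torch.ops import _get_inputs
from coremltools.converters.mil.frontend.torch.torch_op_registry import register_torch_op

@register_torch_op
def baddbmm(context, node):
    # torch.baddbmm(input, batch1, batch2, *, beta=1, alpha=1)
    #   = beta * input + alpha * (batch1 @ batch2)
    bias, batch1, batch2, beta, alpha = _get_inputs(context, node, expected=5)
    bmm = mb.matmul(x=batch1, y=batch2)
    scaled_bmm = mb.mul(x=alpha, y=bmm)
    scaled_bias = mb.mul(x=beta, y=bias)
    context.add(mb.add(x=scaled_bmm, y=scaled_bias, name=node.name))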
In order to make progress with the image transformer, we need to be able to reproduce the problem. Ideally we'd have a minimal example to reproduce the issue. Can you share standalone code to reproduce your Image Transformer conversion issue?
We did not remove support for torch.baddbmm. We never supported it. We're already tracking that issue in #1555. It seems the new version of PyTorch uses this op more often than before.
That is strange. As you can see from my first message, with coremltools 6.0b2 I was able to convert that Text Transformer, use it for inference, and validate its correctness against the PyTorch model. After upgrading to coremltools 6.0 that conversion now fails, but I haven't changed the models at all.
In order to make progress with the image transformer, we need to be able to reproduce the problem. Ideally we'd have a minimal example to reproduce the issue. Can you share standalone code to reproduce your Image Transformer conversion issue?
I'll post the code a bit later.
Since we have not heard back here, I'm going to close this issue. If we get the code to reproduce this issue, I'll reopen it.