CoreML segfaults on torch.nn.Conv1d
🐞Describing the bug
CoreML segfaults when running torch.ops.aten.conv1d.default.
To Reproduce
```python
import numpy as np
import torch
import coremltools as ct


class Model(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.conv = torch.nn.Conv1d(
            16, 4, 6, stride=8, padding=0, dilation=2, groups=2, bias=False
        )

    def forward(self, x):
        return self.conv(x)


model = Model()
inputs = (torch.randn(2, 16, 11),)
eager_outputs = model(*inputs)

# Export and convert to Core ML.
ep = torch.export.export(model, inputs)
ep = ep.run_decompositions({})
mlmodel = ct.convert(ep)

# Build the predict feed from the converted model's input spec;
# cast to float32 to match the model's expected input dtype.
coreml_inputs = mlmodel.get_spec().description.input
coreml_outputs = mlmodel.get_spec().description.output
predict_inputs = {
    str(ct_in.name): pt_in.detach().cpu().numpy().astype(np.float32)
    for ct_in, pt_in in zip(coreml_inputs, inputs)
}
out = mlmodel.predict(predict_inputs)

print("Eager", eager_outputs)
print("CoreML", out)
```
The code above results in a segfault.
System environment (please complete the following information):
- coremltools version: 8.3
- OS (e.g. MacOS version or Linux type): macOS 15
Reproduced on coremltools 9.0b1, macOS 15.6, torch 2.7.1.
Also reproduced with a simplified conv: `self.conv = torch.nn.Conv1d(16, 2, 3, dilation=2, groups=2)` (changed to a more reasonable odd kernel size, with no stride or padding involved).
It can also be reproduced using torch.jit.trace instead of torch.export.export.
Both dilation and groups are required to trigger the segfault; removing either one makes the crash go away. A sketch of the tested variants follows.
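A hedged sketch of those variants (it assumes the `model` and `inputs` definitions from the repro script above; the annotations restate this report's observations rather than new measurements):

```python
import torch
import coremltools as ct

# Conv1d variants from this report (16 in-channels, kernel size 3):
torch.nn.Conv1d(16, 2, 3, dilation=2, groups=2)  # dilation + groups -> segfaults
torch.nn.Conv1d(16, 2, 3, dilation=2)            # groups removed    -> no segfault
torch.nn.Conv1d(16, 2, 3, groups=2)              # dilation removed  -> no segfault

# The torch.jit.trace path also reproduces the crash:
traced = torch.jit.trace(model, inputs)
mlmodel = ct.convert(traced, inputs=[ct.TensorType(shape=inputs[0].shape)])
```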
@metascroy This is a Core ML framework crash and we are tracking it internally; the model should not crash with the CPU_ONLY and CPU_AND_GPU compute units.
Hey @cymbalrush,
I have been debugging this issue and found something that might help:
Crash Details:
- Error: `# of batch has to be 1 (FILE: ZinNEConvLayer.cpp, LINE: 1155)`
- Crash location: ``libsystem_platform.dylib`_platform_memmove + 144``
- The segfault occurs specifically in the ANE compiler when batch size > 1
Here's the backtrace.txt of the run.
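Given the "# of batch has to be 1" message, a quick sanity check is to rerun the repro with a batch-1 input. This is a sketch based on my assumption that batch size 1 sidesteps the failing check (it reuses the `Model` class from the repro script above; I have not confirmed this against the internal tracker):

```python
import numpy as np
import torch
import coremltools as ct

# Same repro as above, but with batch size 1 instead of 2 (assumption:
# the ANE batch-size check that precedes the memmove crash is not hit).
inputs_b1 = (torch.randn(1, 16, 11),)
ep_b1 = torch.export.export(Model(), inputs_b1).run_decompositions({})
mlmodel_b1 = ct.convert(ep_b1)

spec_inputs = mlmodel_b1.get_spec().description.input
out_b1 = mlmodel_b1.predict({
    str(ct_in.name): pt_in.detach().cpu().numpy().astype(np.float32)
    for ct_in, pt_in in zip(spec_inputs, inputs_b1)
})
```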
Minimal Reproduction:

```shell
# With the previously provided Python script and versions
# coremltools==9.0b1, torch==2.6.0 (macOS 15.5).
# Start lldb against the Python interpreter (with the
# relevant environment activated):
lldb python
(lldb) run main.py   # or any .py file containing the repro code
(lldb) bt            # after the crash, print the backtrace
```
It seems there's a bug in the ANE compiler's batch-size validation code: it detects the unsupported configuration but then crashes during memory operations instead of falling back gracefully. This also explains why setting compute_units=CPU_ONLY works: it avoids the ANE compiler path entirely.
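For anyone blocked by this in the meantime, a minimal workaround sketch using the `compute_units` argument of `ct.convert` (per the comment above, CPU_AND_GPU was also reported to avoid the crash; `ep` is the exported program from the repro script):

```python
import coremltools as ct

# Pin execution to CPU so the ANE compiler path is never taken.
mlmodel = ct.convert(ep, compute_units=ct.ComputeUnit.CPU_ONLY)
```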
Also, I am interested in helping resolve this issue. I understand it is being handled internally, but perhaps I can assist by testing patches or providing more debugging information, or help with a fix if any part of this is open source?