coremltools
Facing issues while converting a pre-trained PyTorch model to CoreML
Hello, I'm trying to convert a pre-trained model from PyTorch to CoreML and have written a script to do so. I'm able to load the model and convert it to TorchScript with both methods (tracing and scripting). However, calling coremltools.convert() on either the traced or the scripted model throws an error. Below are the scripts for both methods, along with the errors they produce.
System Information
MacOS = 12.4
Python = 3.9
protobuf = 3.19.0
coremltools = 6.0b1
torch = 1.10.2
torchvision = 0.11.3
Note - I have tried multiple versions of the libraries listed above, but none of them helped.
Method 1 -> Tracing
Code -
import coremltools as coremltools
import numpy as np
import torch
import torchvision as torchvision


def do_trace(in_model, in_input):
    model_trace = torch.jit.trace(in_model, in_input)
    model_trace.eval()
    return model_trace


def dict_to_tuple(out_dict):
    if "masks" in out_dict.keys():
        return out_dict["boxes"], out_dict["scores"], out_dict["labels"], out_dict["masks"]
    return out_dict["boxes"], out_dict["scores"], out_dict["labels"]


class PredictionModel(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.model = torchvision.models.detection.keypointrcnn_resnet50_fpn(pretrained=True)

    def forward(self, in_input):
        output = self.model(in_input)
        return dict_to_tuple(output[0])


inp = torch.Tensor(np.random.uniform(0.0, 250.0, size=(1, 3, 300, 300)))
model = PredictionModel().eval()
with torch.no_grad():
    output = model(inp)
    trace_model = do_trace(model, inp)
ml_model = coremltools.convert(trace_model, inputs=[coremltools.TensorType(shape=(1, 3, 300, 300))])
print(ml_model)
Error -
Converting PyTorch Frontend ==> MIL Ops: 3%|▎ | 74/2627 [00:00<00:05, 436.01 ops/s]
Traceback (most recent call last):
  File "/Users/techlead/PycharmProjects/conversion_demo/venv/lib/python3.9/site-packages/coremltools/converters/mil/frontend/torch/load.py", line 91, in _perform_torch_convert
    prog = converter.convert()
  File "/Users/techlead/PycharmProjects/conversion_demo/venv/lib/python3.9/site-packages/coremltools/converters/mil/frontend/torch/converter.py", line 263, in convert
    convert_nodes(self.context, self.graph)
  File "/Users/techlead/PycharmProjects/conversion_demo/venv/lib/python3.9/site-packages/coremltools/converters/mil/frontend/torch/ops.py", line 89, in convert_nodes
    add_op(context, node)
  File "/Users/techlead/PycharmProjects/conversion_demo/venv/lib/python3.9/site-packages/coremltools/converters/mil/frontend/torch/ops.py", line 3973, in reciprocal
    context.add(mb.inverse(x=inputs[0], name=node.name))
  File "/Users/techlead/PycharmProjects/conversion_demo/venv/lib/python3.9/site-packages/coremltools/converters/mil/mil/ops/registry.py", line 63, in add_op
    return cls._add_op(op_cls, **kwargs)
  File "/Users/techlead/PycharmProjects/conversion_demo/venv/lib/python3.9/site-packages/coremltools/converters/mil/mil/builder.py", line 191, in _add_op
    new_op.type_value_inference()
  File "/Users/techlead/PycharmProjects/conversion_demo/venv/lib/python3.9/site-packages/coremltools/converters/mil/mil/operation.py", line 244, in type_value_inference
    output_vals = self._auto_val(output_types)
  File "/Users/techlead/PycharmProjects/conversion_demo/venv/lib/python3.9/site-packages/coremltools/converters/mil/mil/operation.py", line 354, in _auto_val
    builtin_val.val = v
  File "/Users/techlead/PycharmProjects/conversion_demo/venv/lib/python3.9/site-packages/coremltools/converters/mil/mil/types/type_tensor.py", line 93, in val
    raise ValueError(
ValueError: tensor should have value of type ndarray, got <class 'numpy.float32'> instead
Method 2 -> Scripting
Code -
import coremltools as coremltools
import torch
import torchvision as torchvision
model = torchvision.models.detection.keypointrcnn_resnet50_fpn(pretrained=True)
script_model = torch.jit.script(model)
ml_model = coremltools.convert(script_model, inputs=[coremltools.TensorType(shape=(1, 3, 300, 300))])
print(ml_model)
Error -
WARNING:root:Support for converting Torch Script Models is experimental. If possible you should use a traced model for conversion.
Traceback (most recent call last):
  File "/Applications/PyCharm CE.app/Contents/plugins/python-ce/helpers/pydev/pydevd.py", line 1491, in _exec
    pydev_imports.execfile(file, globals, locals) # execute the script
  File "/Applications/PyCharm CE.app/Contents/plugins/python-ce/helpers/pydev/_pydev_imps/_pydev_execfile.py", line 18, in execfile
    exec(compile(contents+"\n", file, 'exec'), glob, loc)
  File "/Users/techlead/PycharmProjects/conversion_demo/main.py", line 8, in <module>
If you run either of the snippets above, you'll see that the model is successfully converted to TorchScript (by both tracing and scripting), but the last step, converting the TorchScript model to CoreML, fails. Please have a look at this issue and let me know how I can move forward. Also, if I'm doing something wrong (e.g. passing the inputs incorrectly), please let me know. This is my first time doing this, so I'm kind of a noob. Any help is appreciated. Thank you!
Your code to convert the traced model looks good to me. Please note that our support for PyTorch Scripting is experimental. So I would definitely recommend using traced models if you can.
I suspect this is a bug in coremltools. Can you get predictions from the traced PyTorch model? If so, it's almost certainly a bug with coremltools.
torchvision.models.detection.keypointrcnn_resnet50_fpn looks like a complicated model. Can you work to isolate the problem? Ideally we would have a simple toy network that reproduces the issue.
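For anyone following along, here is a rough sketch of the check suggested above: compare the eager model's output with the traced model's output on the same input. ToyNet is a hypothetical stand-in, not the detection model from the report, and the tolerance is an arbitrary choice. If the two outputs agree, tracing itself is probably fine and the failure is more likely on the converter side.

```python
import torch


class ToyNet(torch.nn.Module):
    """Hypothetical stand-in for the real model; replace with your own module."""

    def __init__(self):
        super().__init__()
        self.conv = torch.nn.Conv2d(3, 4, kernel_size=3, padding=1)

    def forward(self, x):
        # Convolution, ReLU, then average over the spatial dimensions.
        return torch.relu(self.conv(x)).mean(dim=(2, 3))


torch.manual_seed(0)
model = ToyNet().eval()
example_input = torch.rand(1, 3, 32, 32)

with torch.no_grad():
    eager_out = model(example_input)                       # prediction from the original module
    traced_model = torch.jit.trace(model, example_input)   # TorchScript via tracing
    traced_out = traced_model(example_input)               # prediction from the traced module

# Agreement here means the TorchScript graph reproduces the eager model
# for this input; it says nothing yet about the Core ML conversion.
outputs_match = torch.allclose(eager_out, traced_out, atol=1e-6)
print("traced output matches eager output:", outputs_match)
```

The same comparison can be run on progressively smaller submodules to narrow down which part of a large model triggers the conversion error.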
Hi, first of all, thank you for your prompt reply. To answer your first question: yes, I'm able to get predictions from the traced model, so it might be an issue with coremltools itself.
One more data point: even if you replace torchvision.models.detection.keypointrcnn_resnet50_fpn with torchvision.models.detection.fasterrcnn_resnet50_fpn, the result is the same.
Finally, yes, I'll try to come up with a small toy network to reproduce the issue. In the meantime, is there any workaround to convert the model?
Thank you!
Since we have not received steps to reproduce this problem, I'm going to close it. If we get steps to reproduce it, I'm happy to reopen.
@TobyRoseman Here's a small self-contained script to reproduce it. torch 1.13, coremltools 6.2.
import torch
import coremltools as ct


@torch.jit.script
def _padded_size(x):
    """Pytorch cannot trace x.shape[] code. We therefore create a little TorchScript function."""
    padded_size = 2 ** torch.ceil(torch.log2(torch.tensor(x.size(-1) - 1).to(torch.float)))
    return padded_size.int()


class MyModel(torch.nn.Module):
    def __init__(self):
        super().__init__()

    def forward(self, x):
        return torch.zeros((1, _padded_size(x)))


model = MyModel().eval()
with torch.inference_mode():
    example_input = torch.rand(10, 1000)
    traced_model = torch.jit.trace(model, example_input)

coreml_model = ct.convert(
    traced_model,
    convert_to='neuralnetwork',  # or 'mlprogram', same result
    inputs=[ct.TensorType(shape=example_input.shape, name='x')],
    debug=True
)
This line is the crux:
padded_size = 2 ** torch.ceil(torch.log2(torch.tensor(x.size(-1) - 1).to(torch.float)))
Without the .to(...) call, torch tries to take the log2 of an integer and fails. I've also tried torch.float32, but that fails with the same error as torch.float.
Thank you for the nice code snippet to reproduce the issue. I can reproduce it as well.
I see that there are constants in the torch graph which are of size 0 or empty shape, that are causing this error in the coremltools code. I'm reopening this issue, as it needs further investigation.
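To illustrate how an empty-shape constant could end up failing the isinstance(v, np.ndarray) check that raises this ValueError, here is a small NumPy-only sketch. It assumes the value in question comes from applying an elementwise NumPy function to a rank-0 array; that matches the ops named in the two stack traces (reciprocal and log2), but it is a guess about what coremltools does internally, not a confirmed diagnosis.

```python
import numpy as np


def is_ndarray(v):
    # Same kind of check as the one that raises in type_tensor.py.
    return isinstance(v, np.ndarray)


# A rank-0 (empty-shape) constant is still an ndarray.
rank0 = np.array(999.0, dtype=np.float32)

# Applying a ufunc to a rank-0 array returns a NumPy scalar (np.float32),
# which is not an ndarray and so would be rejected by the check above.
logged = np.log(rank0)

# Wrapping the scalar again gives back a rank-0 ndarray.
rewrapped = np.asarray(logged)

print(type(rank0).__name__, is_ndarray(rank0))
print(type(logged).__name__, is_ndarray(logged))
print(type(rewrapped).__name__, is_ndarray(rewrapped))
```

If that assumption holds, the values are numerically fine and only their Python type changes, which would fit the reported message "got <class 'numpy.float32'> instead".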
I actually managed to find a workaround: if I replace .to(torch.float) with .type_as(x), it works fine :).
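For completeness, this is roughly what the reproduction script looks like with that one change applied. Everything except .type_as(x) is taken from the script above; convert_to_coreml is a hypothetical helper name added here so that the torch part can be run without coremltools installed. The workaround was reported with torch 1.13 and coremltools 6.2, and it is not verified here for other versions.

```python
import torch


@torch.jit.script
def _padded_size(x):
    # Same helper as in the reproduction script, except that .type_as(x)
    # replaces .to(torch.float), as described in the workaround.
    padded_size = 2 ** torch.ceil(torch.log2(torch.tensor(x.size(-1) - 1).type_as(x)))
    return padded_size.int()


class MyModel(torch.nn.Module):
    def __init__(self):
        super().__init__()

    def forward(self, x):
        return torch.zeros((1, _padded_size(x)))


def convert_to_coreml(traced, example):
    """Hypothetical helper: same ct.convert call as the reproduction script."""
    import coremltools as ct  # imported lazily; only needed for the conversion step

    return ct.convert(
        traced,
        convert_to="neuralnetwork",
        inputs=[ct.TensorType(shape=example.shape, name="x")],
    )


model = MyModel().eval()
with torch.inference_mode():
    example_input = torch.rand(10, 1000)
    traced_model = torch.jit.trace(model, example_input)
    traced_output = traced_model(example_input)

print(tuple(traced_output.shape))
```

Calling convert_to_coreml(traced_model, example_input) then runs the actual conversion.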
I'm having the same issue that @Prasad-Techlead describes. I'm able to script and trace Torchvision models and get correct predictions, but when converting to CoreML, I get this error:
images.7 defined in (%images.7 : __torch__.torchvision.models.detection.image_list.ImageList, %targets.31 : Dict(str, Tensor)[]? = prim::TupleUnpack(%396)\n)",
I've tried multiple networks: retinanet_resnet50_fpn_v2, fasterrcnn_resnet50_fpn_v2 and ssdlite320_mobilenet_v3_large.
Has anyone managed to convert any of Torchvision's object detection models to CoreML?
Using tip of main, the code still does not work, but the error message and stack trace have changed:
ValueError: tensor should have value of type ndarray, got <class 'numpy.float32'> instead
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
Cell In[1], line 24
21 example_input = torch.rand(10, 1000)
22 traced_model = torch.jit.trace(model, example_input)
---> 24 coreml_model = ct.convert(
25 traced_model,
26 convert_to='neuralnetwork', # or 'mlprogram', same result
27 inputs=[ct.TensorType(shape=example_input.shape, name='x')],
28 debug=True
29 )
File ~/miniconda3/envs/prod/lib/python3.10/site-packages/coremltools/converters/_converters_entry.py:530, in convert(model, source, inputs, outputs, classifier_config, minimum_deployment_target, convert_to, compute_precision, skip_model_load, compute_units, package_dir, debug, pass_pipeline)
527 if specification_version is None:
528 specification_version = _set_default_specification_version(exact_target)
--> 530 mlmodel = mil_convert(
531 model,
532 convert_from=exact_source,
533 convert_to=exact_target,
534 inputs=inputs,
535 outputs=outputs_as_tensor_or_image_types, # None or list[ct.ImageType/ct.TensorType]
536 classifier_config=classifier_config,
537 skip_model_load=skip_model_load,
538 compute_units=compute_units,
539 package_dir=package_dir,
540 debug=debug,
541 specification_version=specification_version,
542 main_pipeline=pass_pipeline,
543 )
545 if exact_target == "mlprogram" and mlmodel._input_has_infinite_upper_bound():
546 raise ValueError(
547 "For mlprogram, inputs with infinite upper_bound is not allowed. Please set upper_bound"
548 ' to a positive value in "RangeDim()" for the "inputs" param in ct.convert().'
549 )
File ~/miniconda3/envs/prod/lib/python3.10/site-packages/coremltools/converters/mil/converter.py:188, in mil_convert(model, convert_from, convert_to, compute_units, **kwargs)
149 @_profile
150 def mil_convert(
151 model,
(...)
155 **kwargs
156 ):
157 """
158 Convert model from a specified frontend `convert_from` to a specified
159 converter backend `convert_to`.
(...)
186 See `coremltools.converters.convert`
187 """
--> 188 return _mil_convert(model, convert_from, convert_to, ConverterRegistry, MLModel, compute_units, **kwargs)
File ~/miniconda3/envs/prod/lib/python3.10/site-packages/coremltools/converters/mil/converter.py:212, in _mil_convert(model, convert_from, convert_to, registry, modelClass, compute_units, **kwargs)
209 weights_dir = _tempfile.TemporaryDirectory()
210 kwargs["weights_dir"] = weights_dir.name
--> 212 proto, mil_program = mil_convert_to_proto(
213 model,
214 convert_from,
215 convert_to,
216 registry,
217 **kwargs
218 )
220 _reset_conversion_state()
222 if convert_to == 'milinternal':
File ~/miniconda3/envs/prod/lib/python3.10/site-packages/coremltools/converters/mil/converter.py:286, in mil_convert_to_proto(model, convert_from, convert_to, converter_registry, main_pipeline, **kwargs)
281 frontend_pipeline, backend_pipeline = _construct_other_pipelines(
282 main_pipeline, convert_from, convert_to
283 )
285 frontend_converter = frontend_converter_type()
--> 286 prog = frontend_converter(model, **kwargs)
287 PassPipelineManager.apply_pipeline(prog, frontend_pipeline)
289 PassPipelineManager.apply_pipeline(prog, main_pipeline)
File ~/miniconda3/envs/prod/lib/python3.10/site-packages/coremltools/converters/mil/converter.py:108, in TorchFrontend.__call__(self, *args, **kwargs)
105 def __call__(self, *args, **kwargs):
106 from .frontend.torch.load import load
--> 108 return load(*args, **kwargs)
File ~/miniconda3/envs/prod/lib/python3.10/site-packages/coremltools/converters/mil/frontend/torch/load.py:63, in load(model_spec, inputs, specification_version, debug, outputs, cut_at_symbols, **kwargs)
55 inputs = _convert_to_torch_inputtype(inputs)
56 converter = TorchConverter(
57 torchscript,
58 inputs,
(...)
61 specification_version,
62 )
---> 63 return _perform_torch_convert(converter, debug)
File ~/miniconda3/envs/prod/lib/python3.10/site-packages/coremltools/converters/mil/frontend/torch/load.py:102, in _perform_torch_convert(converter, debug)
100 def _perform_torch_convert(converter, debug):
101 try:
--> 102 prog = converter.convert()
103 except RuntimeError as e:
104 if debug and "convert function" in str(e):
File ~/miniconda3/envs/prod/lib/python3.10/site-packages/coremltools/converters/mil/frontend/torch/converter.py:439, in TorchConverter.convert(self)
436 self.convert_const()
438 # Add the rest of the operations
--> 439 convert_nodes(self.context, self.graph)
441 graph_outputs = [self.context[name] for name in self.graph.outputs]
443 # An output can be None when it's a None constant, which happens
444 # in Fairseq MT.
File ~/miniconda3/envs/prod/lib/python3.10/site-packages/coremltools/converters/mil/frontend/torch/ops.py:92, in convert_nodes(context, graph)
87 raise RuntimeError(
88 f"PyTorch convert function for op '{node.kind}' not implemented."
89 )
91 context.prepare_for_conversion(node)
---> 92 add_op(context, node)
94 # We've generated all the outputs the graph needs, terminate conversion.
95 if _all_outputs_present(context, graph):
File ~/miniconda3/envs/prod/lib/python3.10/site-packages/coremltools/converters/mil/frontend/torch/ops.py:5283, in log2(context, node)
5281 inputs = _get_inputs(context, node)
5282 x = inputs[0]
-> 5283 log_x = mb.log(x=x)
5284 context.add(mb.mul(x=log_x, y=1 / _np.log(2.0)), node.name)
File ~/miniconda3/envs/prod/lib/python3.10/site-packages/coremltools/converters/mil/mil/ops/registry.py:183, in SSAOpRegistry.register_op.<locals>.class_wrapper.<locals>.add_op(cls, **kwargs)
180 else:
181 op_cls_to_add = op_reg[op_type]
--> 183 return cls._add_op(op_cls_to_add, **kwargs)
File ~/miniconda3/envs/prod/lib/python3.10/site-packages/coremltools/converters/mil/mil/builder.py:182, in Builder._add_op(cls, op_cls, **kwargs)
180 curr_block()._insert_op_before(new_op, before_op=before_op)
181 new_op.build_nested_blocks()
--> 182 new_op.type_value_inference()
183 if len(new_op.outputs) == 1:
184 return new_op.outputs[0]
File ~/miniconda3/envs/prod/lib/python3.10/site-packages/coremltools/converters/mil/mil/operation.py:256, in Operation.type_value_inference(self, overwrite_output)
254 if not isinstance(output_types, tuple):
255 output_types = (output_types,)
--> 256 output_vals = self._auto_val(output_types)
257 try:
258 output_names = self.output_names()
File ~/miniconda3/envs/prod/lib/python3.10/site-packages/coremltools/converters/mil/mil/operation.py:396, in Operation._auto_val(self, output_types)
394 builtin_val.val = v.ls
395 else:
--> 396 builtin_val.val = v
397 auto_val.append(builtin_val)
398 return auto_val
File ~/miniconda3/envs/prod/lib/python3.10/site-packages/coremltools/converters/mil/mil/types/type_tensor.py:88, in tensor.<locals>.tensor.val(self, v)
85 @val.setter
86 def val(self, v):
87 if not isinstance(v, np.ndarray):
---> 88 raise ValueError(
89 "tensor should have value of type ndarray, got {} instead".format(
90 type(v)
91 )
92 )
94 v_type = numpy_type_to_builtin_type(v.dtype)
95 promoted_type = promote_types(v_type, primitive)
ValueError: tensor should have value of type ndarray, got <class 'numpy.float32'> instead