Crash converting the text encoder component of CLIP-H
🐞Describing the bug
When converting the text encoder component of LAION's CLIP-H model to Core ML with a flexible (RangeDim) input shape, ct.convert crashes the Python process: the MIL conversion pipelines complete, but the process is killed (SIGKILL) while the converted mlprogram is being loaded by the Core ML framework.
Stack Trace
...
>>> # Convert traced model to CoreML
>>> text_input_shape = ct.Shape(shape=(1,
... ct.RangeDim(lower_bound=2, upper_bound=77, default=77)))
>>>
>>> model_coreml = ct.convert(
... model_traced,
... inputs=[ct.TensorType(name="input_text_token_ids", shape=text_input_shape, dtype=np.int64)],
... outputs=[ct.TensorType(name="output_embedding", dtype=np.float16)],
... minimum_deployment_target=ct.target.macOS13,
... convert_to='mlprogram'
... )
Converting PyTorch Frontend ==> MIL Ops: 0% 0/1510 [00:00<?, ? ops/s]huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
To disable this warning, you can either:
- Avoid using `tokenizers` before the fork if possible
- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
Converting PyTorch Frontend ==> MIL Ops: 96% 1448/1510 [00:00<00:00, 1616.88 ops/s]Saving value type of int64 into a builtin type of int32, might lose precision!
Converting PyTorch Frontend ==> MIL Ops: 100% 1509/1510 [00:00<00:00, 1730.12 ops/s]
Running MIL frontend_pytorch pipeline: 100% 5/5 [00:00<00:00, 120.13 passes/s]
Running MIL default pipeline: 100% 66/66 [00:19<00:00, 3.44 passes/s]
Running MIL backend_mlprogram pipeline: 100% 11/11 [00:00<00:00, 228.89 passes/s]
Process 42442 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = signal SIGKILL
frame #0: 0x000000019a2663b0 libsystem_platform.dylib`__bzero + 64
libsystem_platform.dylib`:
-> 0x19a2663b0 <+64>: dc zva, x3
0x19a2663b4 <+68>: add x3, x3, #0x40
0x19a2663b8 <+72>: subs x2, x2, #0x40
0x19a2663bc <+76>: b.hi 0x19a2663b0 ; <+64>
Target 0: (python) stopped.
(lldb) bt
* thread #1, queue = 'com.apple.main-thread', stop reason = signal SIGKILL
* frame #0: 0x000000019a2663b0 libsystem_platform.dylib`__bzero + 64
frame #1: 0x00000001b0e2d588 Espresso`std::__1::__shared_ptr_emplace<Espresso::blob<float, 4>, std::__1::allocator<Espresso::blob<float, 4> > >::__shared_ptr_emplace[abi:v15006]<int&, int&, int&, int&>(std::__1::allocator<Espresso::blob<float, 4> >, int&, int&, int&, int&) + 156
frame #2: 0x00000001b0e2d4bc Espresso`std::__1::shared_ptr<Espresso::blob<float, 4> > std::__1::allocate_shared[abi:v15006]<Espresso::blob<float, 4>, std::__1::allocator<Espresso::blob<float, 4> >, int&, int&, int&, int&, void>(std::__1::allocator<Espresso::blob<float, 4> > const&, int&, int&, int&, int&) + 76
frame #3: 0x00000001b1268eb4 Espresso`Espresso::blob_cpu::resize(Espresso::layer_shape const&, std::__1::shared_ptr<Espresso::abstract_blob_container_options>) + 1108
frame #4: 0x00000001b0ecde48 Espresso`Espresso::allocate_blobs(std::__1::unordered_map<std::__1::shared_ptr<Espresso::abstract_blob_container>, int, std::__1::hash<std::__1::shared_ptr<Espresso::abstract_blob_container> >, std::__1::equal_to<std::__1::shared_ptr<Espresso::abstract_blob_container> >, std::__1::allocator<std::__1::pair<std::__1::shared_ptr<Espresso::abstract_blob_container> const, int> > > const&, std::__1::unordered_map<std::__1::shared_ptr<Espresso::abstract_blob_container>, unsigned long, std::__1::hash<std::__1::shared_ptr<Espresso::abstract_blob_container> >, std::__1::equal_to<std::__1::shared_ptr<Espresso::abstract_blob_container> >, std::__1::allocator<std::__1::pair<std::__1::shared_ptr<Espresso::abstract_blob_container> const, unsigned long> > >&, std::__1::unordered_map<std::__1::shared_ptr<Espresso::abstract_blob_container>, Espresso::layer_shape, std::__1::hash<std::__1::shared_ptr<Espresso::abstract_blob_container> >, std::__1::equal_to<std::__1::shared_ptr<Espresso::abstract_blob_container> >, std::__1::allocator<std::__1::pair<std::__1::shared_ptr<Espresso::abstract_blob_container> const, Espresso::layer_shape> > >&, int) + 732
frame #5: 0x00000001b0ec8c60 Espresso`Espresso::reshape_networks_graph_coloring_raw_ptr_only_in_context(std::__1::shared_ptr<Espresso::abstract_context> const&, std::__1::vector<Espresso::net*, std::__1::allocator<Espresso::net*> > const&, int) + 2180
frame #6: 0x00000001b0ec8340 Espresso`Espresso::reshape_networks_graph_coloring_raw_ptr(std::__1::vector<Espresso::net*, std::__1::allocator<Espresso::net*> >, int) + 640
frame #7: 0x00000001b0ec6750 Espresso`Espresso::pass_graph_coloring::run_on_network(Espresso::net&) + 272
frame #8: 0x00000001b0ffcbc4 Espresso`Espresso::shape_network_recursive(Espresso::net*, Espresso::network_shape const&, int, bool) + 6952
frame #9: 0x00000001b107bd48 Espresso`Espresso::load_and_shape_network(std::__1::shared_ptr<Espresso::SerDes::generic_serdes_object> const&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, std::__1::shared_ptr<Espresso::abstract_context> const&, Espresso::network_shape const&, Espresso::compute_path, std::__1::shared_ptr<Espresso::blob_storage_abstract> const&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&) + 644
frame #10: 0x00000001b128af7c Espresso`Espresso::reload_network_on_context(std::__1::shared_ptr<Espresso::net> const&, std::__1::shared_ptr<Espresso::abstract_context> const&, Espresso::compute_path) + 452
frame #11: 0x00000001b107d750 Espresso`Espresso::load_network(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, std::__1::shared_ptr<Espresso::abstract_context> const&, Espresso::compute_path, bool) + 1504
frame #12: 0x00000001b0f1e7bc Espresso`EspressoLight::espresso_plan::add_network(char const*, espresso_storage_type_t, std::__1::shared_ptr<Espresso::net>) + 3108
frame #13: 0x00000001b0f31918 Espresso`EspressoLight::espresso_plan::add_network(char const*, espresso_storage_type_t) + 64
frame #14: 0x00000001b0f34c18 Espresso`espresso_plan_add_network + 416
frame #15: 0x00000001a26cdf68 CoreML`-[MLNeuralNetworkEngine _addNetworkToPlan:error:] + 232
frame #16: 0x00000001a26ccee4 CoreML`-[MLNeuralNetworkEngine _setupContextAndPlanWithConfiguration:usingCPU:reshapeWithContainer:error:] + 896
frame #17: 0x00000001a26ce808 CoreML`-[MLNeuralNetworkEngine initWithContainer:configuration:error:] + 200
frame #18: 0x00000001a273704c CoreML`-[MLMultiFunctionProgramEngine initWithProgramContainer:configuration:error:] + 312
frame #19: 0x00000001a2737248 CoreML`+[MLMultiFunctionProgramEngine loadModelFromCompiledArchive:modelVersionInfo:compilerVersionInfo:configuration:error:] + 180
frame #20: 0x00000001a272e360 CoreML`+[MLLoader loadModelWithClass:fromArchive:modelVersionInfo:compilerVersionInfo:configuration:error:] + 140
frame #21: 0x00000001a272c9ac CoreML`+[MLLoader loadModelFromArchive:configuration:loaderEvent:useUpdatableModelLoaders:error:] + 1952
frame #22: 0x00000001a272da14 CoreML`+[MLLoader loadModelFromArchive:configuration:loaderEvent:error:] + 24
frame #23: 0x00000001a272ed60 CoreML`+[MLLoader loadModelFromAssetAtURL:configuration:loaderEvent:error:] + 252
frame #24: 0x00000001a272efa0 CoreML`+[MLLoader loadModelFromAssetAtURL:configuration:error:] + 112
frame #25: 0x00000001a2711c44 CoreML`-[MLModelAsset load:] + 496
frame #26: 0x00000001a2711950 CoreML`-[MLModelAsset modelWithError:] + 60
frame #27: 0x00000001a276c7e8 CoreML`+[MLModel modelWithContentsOfURL:configuration:error:] + 188
frame #28: 0x00000002aaf96070 libcoremlpython.so`___lldb_unnamed_symbol353 + 692
frame #29: 0x00000002aafaa5a8 libcoremlpython.so`___lldb_unnamed_symbol605 + 148
frame #30: 0x00000002aafaa508 libcoremlpython.so`___lldb_unnamed_symbol604 + 24
frame #31: 0x00000002aafa00e8 libcoremlpython.so`___lldb_unnamed_symbol490 + 4724
frame #32: 0x00000001000acc88 python`cfunction_call + 80
frame #33: 0x000000010005a294 python`_PyObject_MakeTpCall + 612
frame #34: 0x000000010005db14 python`method_vectorcall + 620
frame #35: 0x00000001000d157c python`slot_tp_init + 140
frame #36: 0x00000001000c9e98 python`type_call + 340
frame #37: 0x000000012f3c98cc _pywrap_cpu_feature_guard.so`pybind11_meta_call + 40
frame #38: 0x000000010005a294 python`_PyObject_MakeTpCall + 612
frame #39: 0x00000001001490f0 python`call_function + 676
frame #40: 0x0000000100144e58 python`_PyEval_EvalFrameDefault + 26500
frame #41: 0x000000010013ddc8 python`_PyEval_Vector + 2056
frame #42: 0x0000000100149058 python`call_function + 524
frame #43: 0x0000000100144ec8 python`_PyEval_EvalFrameDefault + 26612
frame #44: 0x000000010013ddc8 python`_PyEval_Vector + 2056
frame #45: 0x000000010005a4a4 python`_PyObject_FastCallDictTstate + 156
frame #46: 0x000000010005b140 python`_PyObject_Call_Prepend + 164
frame #47: 0x00000001000d1564 python`slot_tp_init + 116
frame #48: 0x00000001000c9e98 python`type_call + 340
frame #49: 0x000000010005a294 python`_PyObject_MakeTpCall + 612
frame #50: 0x00000001001490f0 python`call_function + 676
frame #51: 0x0000000100144ec8 python`_PyEval_EvalFrameDefault + 26612
frame #52: 0x000000010013ddc8 python`_PyEval_Vector + 2056
frame #53: 0x000000010005aad8 python`PyVectorcall_Call + 156
frame #54: 0x0000000100145160 python`_PyEval_EvalFrameDefault + 27276
frame #55: 0x000000010013ddc8 python`_PyEval_Vector + 2056
frame #56: 0x0000000100149058 python`call_function + 524
frame #57: 0x0000000100144ec8 python`_PyEval_EvalFrameDefault + 26612
frame #58: 0x000000010013ddc8 python`_PyEval_Vector + 2056
frame #59: 0x0000000100149058 python`call_function + 524
frame #60: 0x0000000100144ec8 python`_PyEval_EvalFrameDefault + 26612
frame #61: 0x000000010013ddc8 python`_PyEval_Vector + 2056
frame #62: 0x0000000100198f98 python`run_mod + 216
frame #63: 0x0000000100199600 python`PyRun_InteractiveOneObjectEx + 944
frame #64: 0x0000000100198430 python`_PyRun_InteractiveLoopObject + 428
frame #65: 0x0000000100197990 python`_PyRun_AnyFileObject + 112
frame #66: 0x000000010019b754 python`PyRun_AnyFileExFlags + 184
frame #67: 0x00000001001bcc20 python`Py_RunMain + 2736
frame #68: 0x00000001001bdc50 python`pymain_main + 1272
frame #69: 0x000000010000400c python`main + 56
frame #70: 0x0000000199edff28 dyld`start + 2236
To Reproduce
from transformers import CLIPProcessor, CLIPModel
import torch

class WrappedCLIPModel_Text(CLIPModel):
    def forward(self, *args, **kwargs):
        return self.get_text_features(*args, **kwargs)

model_version = "laion/CLIP-ViT-H-14-laion2B-s32B-b79K"
processor = CLIPProcessor.from_pretrained(model_version)
model_pt_text = WrappedCLIPModel_Text.from_pretrained(model_version, return_dict=True)
model_pt_text.eval()

with torch.no_grad():
    processed_text = processor(text="example text", images=None, return_tensors="pt", padding=True)
    model_traced = torch.jit.trace(model_pt_text, processed_text.input_ids, strict=True)

import coremltools as ct
import numpy as np

# Convert traced model to Core ML
text_input_shape = ct.Shape(shape=(1,
                                   ct.RangeDim(lower_bound=2, upper_bound=77, default=77)))
# text_input_shape = ct.Shape(shape=(1, 77))  # ← no crash with this

model_coreml = ct.convert(
    model_traced,
    inputs=[ct.TensorType(name="input_text_token_ids", shape=text_input_shape, dtype=np.int64)],
    outputs=[ct.TensorType(name="output_embedding", dtype=np.float16)],
    minimum_deployment_target=ct.target.macOS13,
    convert_to='mlprogram'
)
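
The stack trace suggests the process dies inside the Core ML framework's model load (the MLModel and Espresso frames), after the MIL pipelines have finished. As a diagnostic sketch (untested against this model, offered only as a suggestion), ct.convert's skip_model_load flag can separate the two phases:

# Diagnostic sketch (untested): run only the conversion pipelines and skip
# the in-process Core ML compile/load that otherwise follows conversion.
model_coreml = ct.convert(
    model_traced,
    inputs=[ct.TensorType(name="input_text_token_ids", shape=text_input_shape, dtype=np.int64)],
    outputs=[ct.TensorType(name="output_embedding", dtype=np.float16)],
    minimum_deployment_target=ct.target.macOS13,
    convert_to='mlprogram',
    skip_model_load=True
)
model_coreml.save("clip_h_text.mlpackage")
# Loading the saved package afterwards, e.g. via
# ct.models.MLModel("clip_h_text.mlpackage"), should reproduce the SIGKILL
# if the bug is confined to the framework's load path.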
System environment (please complete the following information):
- coremltools version: 7.0b2
- OS (e.g. MacOS version or Linux type): macOS 13.3.1 (a)
- Any other relevant version information (e.g. PyTorch or TensorFlow version): torch==2.0.0 (also tried 2.1.0 dev)
Additional context
If the input shape is fixed (text_input_shape = ct.Shape(shape=(1, 77))), the conversion succeeds.
model_traced(torch.Tensor([[49406, 4160]]).long()) works, so an input shape of (1, 2) should be valid.
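
To rule out the RangeDim bounds themselves, the traced model can be exercised at a few lengths across the declared range. A small sketch with random token ids (49408 as the CLIP vocabulary size is an assumption here):

# Sanity-check sketch: the traced model should accept any length in [2, 77],
# consistent with the RangeDim bounds declared above.
for seq_len in (2, 16, 77):
    token_ids = torch.randint(0, 49408, (1, seq_len), dtype=torch.long)
    embedding = model_traced(token_ids)
    print(seq_len, tuple(embedding.shape))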
The following works:

# Convert traced model to Core ML
text_input_shape = ct.Shape(shape=(1,
                                   ct.RangeDim(lower_bound=2, upper_bound=77, default=77)))

model_coreml = ct.convert(
    model_traced,
    inputs=[ct.TensorType(name="input_text_token_ids", shape=text_input_shape, dtype=np.int64)],
    outputs=[ct.TensorType(name="output_embedding")],
    convert_to="neuralnetwork"
)
Since the same model converts successfully with the neuralnetwork backend, this looks like an issue in the Core ML framework rather than in the conversion process, in which case the correct place to report it would be the Feedback Assistant.
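
If a workaround is needed in the meantime, one possibility (a sketch, not verified against this model) is to declare a small set of enumerated shapes instead of a continuous range, since ct.EnumeratedShapes is handled differently from ct.RangeDim; padding every input to the fixed (1, 77) shape is the other obvious fallback:

# Workaround sketch (unverified): enumerate a few concrete sequence lengths
# rather than declaring a continuous [2, 77] range.
enumerated_shape = ct.EnumeratedShapes(
    shapes=[[1, 2], [1, 16], [1, 77]],
    default=[1, 77]
)
model_coreml = ct.convert(
    model_traced,
    inputs=[ct.TensorType(name="input_text_token_ids", shape=enumerated_shape, dtype=np.int64)],
    outputs=[ct.TensorType(name="output_embedding", dtype=np.float16)],
    minimum_deployment_target=ct.target.macOS13,
    convert_to='mlprogram'
)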