coremltools
Convert Hugging Face DistilBERT to Core ML: Op "word_embeddings" (op_type: gather) expects integer tensor or scalar but got tensor[1,512,fp32]
Hello, I'm trying to convert a fine-tuned Hugging Face DistilBertForSequenceClassification model to Core ML, but the conversion fails with the following error:
InputTypeError: Op "word_embeddings" (op_type: gather) input indices="input_ids" expects integer tensor or scalar but got tensor[1,512,fp32]
My code looks like this:
model_inputs = tokenizer(text, return_tensors="pt", truncation=True, padding='max_length')
input_ids = torch.tensor(model_inputs['input_ids'], dtype=torch.long)
attention_mask = torch.tensor(model_inputs['attention_mask'], dtype=torch.long)
labels = torch.tensor([0], dtype=torch.long)
traced_model = torch.jit.trace(bert_finetuner, [input_ids, attention_mask, labels])
import coremltools as ct
input1 = ct.TensorType(name='input_ids', shape=input_ids.size())
input2 = ct.TensorType(name='attention_mask', shape=(1, 512))
input3 = ct.TensorType(name='label', shape=(1,1))
mlmodel = ct.convert(traced_model, inputs=[input1, input2, input3])
PyTorch version 1.7.0, coremltools 4.0.
Edit: I also tested with a vanilla DistilBERT and with other PyTorch and coremltools versions on Google Colab, and I always get the same error message. There seems to be something wrong with op_type: gather not accepting the float tensors of the model.
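As a plain-Python analogy for why the error mentions integers (this is not coremltools code, and `embedding_table` and `gather` are hypothetical names used only for illustration): a gather on an embedding table is a row lookup, so its indices have to be integers. The error message reports that `input_ids` reached the op as fp32.

```python
# Toy embedding table: 3 "tokens", each with a 2-dimensional vector.
embedding_table = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]

def gather(table, indices):
    """Look up one row of `table` per index, like an embedding lookup."""
    return [table[i] for i in indices]

# Integer token IDs select rows as expected.
int_rows = gather(embedding_table, [2, 0])

# Float "token IDs" cannot be used as indices; Python raises TypeError.
try:
    gather(embedding_table, [2.0, 0.0])
    float_lookup_failed = False
except TypeError:
    float_lookup_failed = True
```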
I got the same issue. Did you solve it?
InputTypeError: Op "inputs_embeds" (op_type: gather) input indices="input_ids" expects integer tensor or scalar but got tensor[1024,is23,fp32]
I couldn't solve it directly, but I managed to convert the model by going through ONNX.
torch.onnx.export(bert_finetuner.model, (input_ids, attention_mask), "./bert4.onnx", verbose=True)
modelBert3 = ct.converters.onnx.convert(model='./bert4.onnx', minimum_ios_deployment_target='13')
modelBert3.save("./Distilbert3S.mlmodel")
I haven't tested yet whether the converted model actually performs inference.
In general, Core ML still seems very limited and immature.
I am using a different language model. Unfortunately, your solution doesn't work for me, but thank you so much!
Try specifying 'dtype' for the inputs, e.g.:
from coremltools.converters.mil.mil import types
input1 = ct.TensorType(name='input_ids', shape=input_ids.size(), dtype=types.int64)
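For reference, a minimal sketch of this suggestion applied to both inputs from the original post. It has not been run against the original model. `np.int32` is an assumption here (the comment above uses `types.int64`), the names `INPUT_SPECS`, `build_input_types`, and `convert` are hypothetical helpers, and the `labels` input is left out, as in the ONNX workaround earlier in the thread.

```python
# Names and shapes taken from the original post (batch of 1, padded to 512 tokens).
INPUT_SPECS = [
    ("input_ids", (1, 512)),
    ("attention_mask", (1, 512)),
]

def build_input_types(specs=INPUT_SPECS):
    """Declare every input with an integer dtype instead of the float32 default."""
    # Imported inside the function so this file loads even where these
    # packages are not installed.
    import numpy as np
    import coremltools as ct
    return [
        ct.TensorType(name=name, shape=shape, dtype=np.int32)
        for name, shape in specs
    ]

def convert(traced_model):
    """Convert a traced PyTorch model using the integer input declarations."""
    import coremltools as ct
    return ct.convert(traced_model, inputs=build_input_types())
```

Here `traced_model` would be the result of `torch.jit.trace` on the model, traced with only `input_ids` and `attention_mask` so that the traced inputs match the two declared inputs.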
In order to help here, I need to be able to reproduce the problem. Could someone provide self-contained code to reproduce the issue (complete with any necessary links to the model files)?
I ran into this as well when converting the CLIP text model. I have done it before, so my guess is that this is a regression. I'm going to try older versions of coremltools.
Edit: Latest 6.0b2 works.
Since we have not received steps to reproduce this problem, I'm going to close this issue. If we get steps to reproduce the problem, I will reopen the issue.
ValueError: Op "137" (op_type: fill) Input shape="136" expects tensor or scalar of dtype from type domai
Try specifying 'dtype' for the inputs, e.g.:
from coremltools.converters.mil.mil import types
input1 = ct.TensorType(name='input_ids', shape=input_ids.size(), dtype=types.int64)
This helped me!