Wrong output from converted PyTorch model
🐞Describe the bug
The outputs of the PyTorch and Core ML models show a large mismatch, even with a generous tolerance.
To Reproduce
```python
import numpy as np
import torch
from transformers import AutoTokenizer, AutoModel
import coremltools as ct

sentences = ["This is a test."]

tokenizer = AutoTokenizer.from_pretrained('sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2')
model = AutoModel.from_pretrained('sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2', torchscript=True).eval()

# Trace the model, then script the traced module
encoded_input = tokenizer(sentences, return_tensors='pt')
traced_model = torch.jit.trace(model, tuple(encoded_input.values()))
scripted_model = torch.jit.script(traced_model)

# Convert to Core ML with flexible input shapes
mlmodel = ct.convert(
    scripted_model,
    source="pytorch",
    inputs=[
        ct.TensorType(name="input_ids", shape=(ct.RangeDim(), ct.RangeDim()), dtype=np.int32),
        ct.TensorType(name="token_type_ids", shape=(ct.RangeDim(), ct.RangeDim()), dtype=np.int32),
        ct.TensorType(name="attention_mask", shape=(ct.RangeDim(), ct.RangeDim()), dtype=np.int32),
    ],
    convert_to="neuralnetwork",
    compute_units=ct.ComputeUnit.CPU_ONLY,
)

# Compare PyTorch and Core ML outputs on the same input
with torch.no_grad():
    pt_out = scripted_model(**encoded_input)

cml_inputs = {k: v.to(torch.int32).numpy() for k, v in encoded_input.items()}
pred_coreml = mlmodel.predict(cml_inputs)

np.testing.assert_allclose(pt_out[0].detach().numpy(), pred_coreml["hidden_states"], atol=1e-5, rtol=1e-4)
```
System environment:
- coremltools version: 5.2.0
- OS (e.g. MacOS version or Linux type): macOS 12.3.1 (Apple M1)
- Any other relevant version information:
  - PyTorch or TensorFlow version: PyTorch 1.11.0 and Transformers 4.18.0
- Python version: 3.8.13 (conda)
Any updates on this? I'm also seeing that the outputs from the PyTorch model and the converted Core ML model are not exactly the same.
This is also an issue when converting to `mlprogram`.
Specifying static input shapes does not help either; converting with the following still produces mismatched outputs:
```python
mlmodel = ct.convert(
    scripted_model,
    source="pytorch",
    inputs=[
        ct.TensorType(name="input_ids", shape=encoded_input["input_ids"].shape, dtype=np.int32),
        ct.TensorType(name="token_type_ids", shape=encoded_input["token_type_ids"].shape, dtype=np.int32),
        ct.TensorType(name="attention_mask", shape=encoded_input["attention_mask"].shape, dtype=np.int32),
    ],
    convert_to="mlprogram",
    compute_units=ct.ComputeUnit.CPU_ONLY,
)
```
Hi, I'm also running into a similar issue. Any updates on this?
An absolute tolerance of 1e-5 and a relative tolerance of 1e-4 are too strict for a model of this size. The outputs of the Core ML model and the PyTorch model are actually quite close.
The outputs match even more closely if you convert `traced_model` rather than `scripted_model`.
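For example, something like the following (a sketch only: the tolerances are illustrative and the exact values that pass will depend on hardware and coremltools version; the output name `hidden_states` is taken from the repro above):

```python
# Convert the traced module directly, skipping torch.jit.script
mlmodel = ct.convert(
    traced_model,
    source="pytorch",
    inputs=[
        ct.TensorType(name="input_ids", shape=encoded_input["input_ids"].shape, dtype=np.int32),
        ct.TensorType(name="token_type_ids", shape=encoded_input["token_type_ids"].shape, dtype=np.int32),
        ct.TensorType(name="attention_mask", shape=encoded_input["attention_mask"].shape, dtype=np.int32),
    ],
    convert_to="neuralnetwork",
    compute_units=ct.ComputeUnit.CPU_ONLY,
)

with torch.no_grad():
    pt_out = traced_model(**encoded_input)

cml_inputs = {k: v.to(torch.int32).numpy() for k, v in encoded_input.items()}
pred_coreml = mlmodel.predict(cml_inputs)

# Looser tolerances that are more realistic for a deep float32 transformer
np.testing.assert_allclose(
    pt_out[0].detach().numpy(), pred_coreml["hidden_states"],
    atol=1e-3, rtol=1e-2,
)
```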