coremltools Skip casting model inputs to fp32 if weights and inputs are all fp16

Converting models with fp16 weights and fp16 inputs currently fails with the following exception:

ValueError: In op, of type linear, named linear_0, the named input `bias` must have the same data type as the named input `weight`. However, bias has dtype fp16 whereas weight has dtype fp32.

Minimal repro:

import coremltools as ct
import numpy as np
import torch


class Net(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.proj = torch.nn.Linear(16, 1)

    def forward(self, x):
        return self.proj(x)


x = torch.randn(1, 16, dtype=torch.float16)
mlmodel = ct.convert(
    torch.jit.trace(Net().half().eval(), x),
    inputs=[ct.TensorType(name="x", shape=x.shape, dtype=np.float16)],
    outputs=[ct.TensorType(name="output")],
    convert_to="mlprogram",
    compute_precision=ct.precision.FLOAT16,
    minimum_deployment_target=ct.target.iOS16,
)

Jul 11 '24 17:07 jeethu

#2241 tries to fix the same issue but does so incorrectly.

Jul 12 '24 09:07 jeethu

Hi @jeethu, inputs=[ct.TensorType(name="x", shape=x.shape, dtype=np.float16)] and compute_precision=ct.precision.FLOAT16 are enough to obtain a Core ML model with fp16 inputs and fp16 computation. There is no need to make the PyTorch model itself fp16:

x = torch.randn(1, 16, dtype=torch.float32)
mlmodel = ct.convert(
    torch.jit.trace(Net().eval(), x),
    inputs=[ct.TensorType(name="x", shape=x.shape, dtype=np.float16)],
    outputs=[ct.TensorType(name="output")],
    convert_to="mlprogram",
    compute_precision=ct.precision.FLOAT16,
    minimum_deployment_target=ct.target.iOS16,
)

Please give it a try and see if it works for you.

Jul 26 '24 18:07 YifanShenSZ

Concretely, we translate the torch model in fp32 internally. Then:

If given compute_precision=ct.precision.FLOAT16, we insert fp16 casts so that computation (i.e. weights and activations) runs in fp16.
If given inputs=[ct.TensorType(name="x", shape=x.shape, dtype=np.float16)], we change the input signature of x to fp16.
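
The two options above act independently, which a toy model in plain Python can make concrete. This is only an illustrative sketch, not coremltools code: ToyProgram, translate, and apply_options are hypothetical names, and the dtypes are plain strings rather than real tensors. It assumes that, when no input dtype is given, the input signature stays fp32 even if computation is fp16.

```python
# Toy model of the two independent conversion options described above.
# Not coremltools code: ToyProgram and both helpers are hypothetical.
from dataclasses import dataclass


@dataclass
class ToyProgram:
    input_dtypes: dict   # input name -> dtype seen by the caller
    compute_dtype: str   # dtype used for weights and activations


def translate(input_names):
    """Mimic the first step: everything starts out as fp32."""
    return ToyProgram({name: "fp32" for name in input_names}, "fp32")


def apply_options(prog, compute_precision="fp32", input_dtypes=None):
    """Apply the two options without letting one affect the other."""
    # An explicit input dtype only changes that input's signature.
    signature = {**prog.input_dtypes, **(input_dtypes or {})}
    # compute_precision only changes weights and activations.
    return ToyProgram(signature, compute_precision)


base = translate(["x"])
fp16_compute_only = apply_options(base, compute_precision="fp16")
fp16_everywhere = apply_options(
    base, compute_precision="fp16", input_dtypes={"x": "fp16"}
)

print(fp16_compute_only)
print(fp16_everywhere)
```

In this sketch, neither option requires the source model to be fp16, which mirrors the suggestion to trace the fp32 PyTorch model and request fp16 at conversion time.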

Jul 26 '24 18:07 YifanShenSZ