About Image Input and Output.

Open jhl13 opened this issue 1 year ago • 1 comments

I have a few questions and concerns. I have a denoising model where I preprocess the input by dividing it by 255 and postprocess the output by multiplying it by 255. However, when I use image input and output, I encounter the following issues:

1 - When I use input = ct.ImageType(name='input', shape=(1, 3, 1080, 1920), color_layout=ct.colorlayout.RGB, scale=1/255.) as the model's input conversion, it inserts a mul node, but this node runs in fp32, which is very slow. Is there a way to force the scale node to compute in fp16? In addition, because the subsequent convolutions default to fp16, a cast operator has to be added to convert the mul node's fp32 output to fp16.
2 - When I use output = ct.ImageType(name='output', color_layout=ct.colorlayout.RGB) as the output conversion, the scale must be set to 1.0, which is inconvenient and requires additional post-processing.

Is there a way to solve these issues?
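
For reference, the pre- and post-processing described above amounts to roughly the following. This is a minimal pure-Python sketch, not the actual pipeline: the function names are hypothetical, and the rounding and clamping to the 0..255 range in the post-processing step are assumptions that the post does not state.

```python
def preprocess(pixels):
    """Scale 8-bit pixel values (0..255) into the 0..1 range the model expects."""
    return [p / 255.0 for p in pixels]


def postprocess(values):
    """Scale model outputs back to 0..255.

    Rounding and clamping are assumptions added for this sketch so the
    result is a valid 8-bit pixel value.
    """
    return [min(255, max(0, round(v * 255.0))) for v in values]


# Round trip on a few sample values (stand-ins for real image data).
restored = postprocess(preprocess([0, 128, 255]))
```

The multiply-by-255 step is the part that cannot be folded into the output ImageType when its scale has to stay at 1.0.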

May 06 '24 06:05 jhl13

1 - Try passing compute_precision=coremltools.precision.FLOAT16 to coremltools.convert.

2 - I don't understand the issue here. Do you want your model to accept images of different sizes?
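
A minimal sketch of suggestion 1, reusing the ImageType definitions from the question. The function name convert_denoiser, the traced_model argument (assumed to be a traced PyTorch model), and the iOS16 deployment target are assumptions for illustration and do not come from the thread. Whether this moves the inserted mul node to fp16 is not confirmed here.

```python
def convert_denoiser(traced_model, height=1080, width=1920):
    """Convert a traced model with image input/output and fp16 compute.

    `traced_model` is a hypothetical traced PyTorch model; the name and
    the iOS16 deployment target below are assumptions for this sketch.
    """
    # Imported lazily so this sketch can be loaded without coremltools installed.
    import coremltools as ct

    image_input = ct.ImageType(
        name="input",
        shape=(1, 3, height, width),
        color_layout=ct.colorlayout.RGB,
        scale=1 / 255.0,  # the divide-by-255 preprocessing from the question
    )
    image_output = ct.ImageType(name="output", color_layout=ct.colorlayout.RGB)

    return ct.convert(
        traced_model,
        inputs=[image_input],
        outputs=[image_output],
        compute_precision=ct.precision.FLOAT16,  # suggestion 1
        minimum_deployment_target=ct.target.iOS16,  # assumption, not from the thread
    )
```

Calling convert_denoiser(traced_model) would return the converted model, which can then be inspected to check the precision of the mul node.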

May 06 '24 20:05 TobyRoseman