Any way to pass custom preprocess logic?
❓Question
Hi, I was wondering if there is any way to include custom preprocessing logic during the model conversion process.
Regarding preprocessing, I could not find anything beyond very basic, predefined preprocessing calls like the one below:
scale = 1 / (0.226 * 255.0)
bias = [-0.485 / 0.229, -0.456 / 0.224, -0.406 / 0.225]
image_input = ct.ImageType(name="input_1",
                           shape=example_input.shape,
                           scale=scale, bias=bias)
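(For reference, those constants implement the usual torchvision ImageNet normalization, with a single scale standing in for the three per-channel stds, since Core ML applies y = scale * x + bias per channel. A quick sanity check of that reading, in plain NumPy:)

import numpy as np

mean = np.array([0.485, 0.456, 0.406])
std = np.array([0.229, 0.224, 0.225])

scale = 1 / (0.226 * 255.0)             # single scale approximating all three stds
bias = -mean / std

x = np.array([200.0, 120.0, 64.0])      # one RGB pixel in the 0-255 range
coreml_style = scale * x + bias         # what Core ML applies per channel
torch_style = (x / 255.0 - mean) / std  # the usual torchvision normalization
print(coreml_style)                     # close to torch_style, up to the shared-scale approximation
print(torch_style)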
What I would like to achieve is image padding, as a way to work around issue #1957. Because things break down when flexible shapes are enabled, I was thinking that if I pad the images, which all differ in size, to a very large fixed size (say 2048×2048) at runtime, run inference on that, and unpad the result afterwards, I could live with the fixed-size requirement, even though it is not really optimal. A sketch of the helper I have in mind follows.
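(This is a minimal sketch of such a pad/unpad helper; the autopad name and signature mirror what I use later in this post, and the exact padding policy here, zero-padding at the bottom/right, is just one possible choice.)

import torch
import torch.nn.functional as F

def autopad(x, factor=2048, mode="constant", value=0, margin=(0, 0)):
    """Pad an NCHW tensor at the bottom/right up to a fixed spatial size.

    Returns the padded tensor and the original (height, width) so the
    output can be unpadded after inference.
    """
    h, w = x.shape[-2], x.shape[-1]
    pad_h = factor + margin[0] - h
    pad_w = factor + margin[1] - w
    # F.pad takes (left, right, top, bottom) for the last two dims.
    x = F.pad(x, (0, pad_w, 0, pad_h), mode=mode, value=value)
    return x, (h, w)

def unpad(x, size):
    """Crop a padded output back to its original spatial size."""
    h, w = size
    return x[..., :h, :w]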
Yes, I know it would be entirely possible to do the preprocessing on the app side, but if I could embed the preprocessing into the model itself using plain Python, it would make my life much easier. Unfortunately, adding the pad logic inside the model definition fails:
class CoreMLaMa(torch.nn.Module):
    def __init__(self, lama):
        super(CoreMLaMa, self).__init__()
        self.lama = lama

    def autopad(self, x):
        """Some custom logic to pad images."""
        pass

    def forward(self, image, mask):
        """Things crash before even reaching autopad."""
        image = self.autopad(image)
        mask = self.autopad(mask)
        normalized_mask = ((mask > 0) * 1).byte()
        lama_out = self.lama(image, normalized_mask)
        output = torch.clamp(lama_out * 255, min=0, max=255)
        return output
It fails because model.predict() runs a dimension check on the inputs before the forward pass (and therefore the padding logic) is ever reached, and that check throws a RuntimeError. Is there any way to make the preprocessing logic kick in before the dimension check?
I saw a post that talked about editing the proto files for preprocessing, but that does not seem like a conventional way to do it.
Is there any way to make this work?
I did kind of find a way to bypass the problem.
You can create a padding network:
class PadLayer(torch.nn.Module):
    def __init__(self):
        super().__init__()

    def forward(self, x):
        x, _ = autopad(x, factor=2048, mode="constant", value=0, margin=(0, 0))
        return x
That network acts as a preprocessor. Then save it as a legacy mlmodel, which allows slightly more control when it comes to updating the shapes:
traced_model = torch.jit.trace(pad_model, image_tensor)
model = ct.convert(traced_model,
                   inputs=[ct.TensorType(shape=image_tensor.shape, name="input")],
                   outputs=[ct.TensorType(name="output")],
                   convert_to="neuralnetwork",
                   )
model.save("model/pad.mlmodel")

# 3. Do post-processing to make the input shape flexible
from coremltools.models.neural_network import flexible_shape_utils

spec = ct.utils.load_spec("model/pad.mlmodel")
input_name = spec.description.input[0].name
print("Input name: ", input_name)
flexible_shape_utils.set_multiarray_ndshape_range(spec,
                                                  feature_name=input_name,
                                                  lower_bounds=[1, 3, 256, 256],
                                                  upper_bounds=[1, 3, 2048, 2048])

# Save again
ct.utils.save_spec(spec, "model/updated.mlmodel")
By passing the inputs through this preprocessor model first, you can bypass the problems caused by the absence of custom preprocessor functionality.
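(Roughly, the chaining at prediction time looks like the sketch below. The file names, feature names, and the second model are assumptions based on the snippets above, not tested code:)

import numpy as np
import coremltools as ct

pad_model = ct.models.MLModel("model/updated.mlmodel")
main_model = ct.models.MLModel("model/main.mlmodel")  # hypothetical fixed-size main model

image = np.random.rand(1, 3, 512, 768).astype(np.float32)  # arbitrary input size
orig_h, orig_w = image.shape[-2:]

# Pad up to the fixed 2048x2048 size the main model expects
padded = pad_model.predict({"input": image})["output"]

# Run the fixed-shape main model, then crop back to the original size
result = main_model.predict({"input": padded})["output"]  # feature names assumed
result = result[..., :orig_h, :orig_w]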
However, this is a dirty solution, and I would love to know if there is a better option.
There is no way to add this preprocessing. I recommend adding this functionality to your PyTorch model prior to conversion. As you already mentioned, adding it directly to your app would also work.
Adding more preprocessing functionality would require changes to the Core ML Framework (i.e. that change cannot be made in this GitHub repository). Please submit your Core ML Framework feature request using the Feedback Assistant.
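(For reference, a minimal sketch of the recommended pattern: wrap the original model in a small nn.Module that does the custom preprocessing in forward, then trace and convert the wrapper so the preprocessing becomes part of the converted graph. Names and the example preprocessing step here are illustrative, not a tested recipe:)

import torch
import coremltools as ct

class PreprocessWrapper(torch.nn.Module):
    """Bakes custom preprocessing into the model graph before conversion."""

    def __init__(self, model):
        super().__init__()
        self.model = model

    def forward(self, x):
        # Example preprocessing: scale a 0-255 input down to 0-1.
        x = x / 255.0
        return self.model(x)

wrapped = PreprocessWrapper(original_model)  # original_model: your trained model
example = torch.rand(1, 3, 2048, 2048) * 255
traced = torch.jit.trace(wrapped.eval(), example)

mlmodel = ct.convert(traced,
                     inputs=[ct.TensorType(name="input", shape=example.shape)])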