coremltools coreml using torch export API support maxpool

🌱 Describe your Feature Request

coremltools export API failed to export maxpool, because MaxPool not adapt for "_parse_positional_args" and "_parse_keyword_args" parse for argument as "_convolution"

How can this feature be used?

maxpool is a very common op in convolution network. without this support some popular network cannot export successful

Dec 09 '24 09:12 lgyStoic

I'm not understanding the issue. Can you give us some toy code that demonstrates the issue?

Dec 09 '24 22:12 TobyRoseman

I'm not understanding the issue. Can you give us some toy code that demonstrates the issue?

@TobyRoseman

import torch
import torch.nn as nn

class SimpleCNN(nn.Module):
    def __init__(self):
        super(SimpleCNN, self).__init__()
        self.conv1 = nn.Conv2d(1, 16, kernel_size=3)
        self.pool1 = nn.MaxPool2d(2, stride=2)
        self.flatten = nn.Flatten()
        self.fc1 = nn.Linear(16 * 13 * 13, 10)

    def forward(self, x):
        x = self.conv1(x)
        x = self.pool1(x)
        x = self.flatten(x)
        x = self.fc1(x)
        return x

torch_model = SimpleCNN()

example_inputs = (torch.rand(1, 1, 28, 28),)
exported_program = torch.export.export(torch_model, example_inputs)
print(exported_program)
torch.export.save(exported_program, "test.export.pt")

as the simple network has Maxpool op. using coremltools unified converter, it doesn't support, has error blow

ERROR - converting 'max_pool2d' op (located at: 'max_pool2d'):

Converting PyTorch Frontend ==> MIL Ops:  25%|██████████▌                               | 1/4 [00:00<00:00, 374.96 ops/s]
Traceback (most recent call last):
  File "/Users/garryling/workspace/pytorch_learn/coreml_export.py", line 11, in <module>
    model_from_export = ct.convert(saved_exported_program, minimum_deployment_target=ct.target.iOS13)
                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/garryling/workspace/pytorch_learn/coremltools/converters/_converters_entry.py", line 635, in convert
    mlmodel = mil_convert(
              ^^^^^^^^^^^^
  File "/Users/garryling/workspace/pytorch_learn/coremltools/converters/mil/converter.py", line 188, in mil_convert
    return _mil_convert(model, convert_from, convert_to, ConverterRegistry, MLModel, compute_units, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/garryling/workspace/pytorch_learn/coremltools/converters/mil/converter.py", line 212, in _mil_convert
    proto, mil_program = mil_convert_to_proto(
                         ^^^^^^^^^^^^^^^^^^^^^
  File "/Users/garryling/workspace/pytorch_learn/coremltools/converters/mil/converter.py", line 288, in mil_convert_to_proto
    prog = frontend_converter(model, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/garryling/workspace/pytorch_learn/coremltools/converters/mil/converter.py", line 108, in __call__
    return load(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/Users/garryling/workspace/pytorch_learn/coremltools/converters/mil/frontend/torch/load.py", line 88, in load
    return _perform_torch_convert(converter, debug)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/garryling/workspace/pytorch_learn/coremltools/converters/mil/frontend/torch/load.py", line 151, in _perform_torch_convert
    prog = converter.convert()
           ^^^^^^^^^^^^^^^^^^^
  File "/Users/garryling/workspace/pytorch_learn/coremltools/converters/mil/frontend/torch/converter.py", line 1387, in convert
    convert_nodes(self.context, self.graph, early_exit=not has_states)
  File "/Users/garryling/workspace/pytorch_learn/coremltools/converters/mil/frontend/torch/ops.py", line 120, in convert_nodes
    raise e     # re-raise exception
    ^^^^^^^
  File "/Users/garryling/workspace/pytorch_learn/coremltools/converters/mil/frontend/torch/ops.py", line 115, in convert_nodes
    convert_single_node(context, node)
  File "/Users/garryling/workspace/pytorch_learn/coremltools/converters/mil/frontend/torch/ops.py", line 179, in convert_single_node
    add_op(context, node)
  File "/Users/garryling/workspace/pytorch_learn/coremltools/converters/mil/frontend/torch/ops.py", line 1843, in max_pool1d
    inputs = _get_inputs(context, node, min_expected=3)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/garryling/workspace/pytorch_learn/coremltools/converters/mil/frontend/torch/ops.py", line 305, in _get_inputs
    check_if_number_of_inputs_more_than_min_expected(len(inputs), min_expected)
  File "/Users/garryling/workspace/pytorch_learn/coremltools/converters/mil/frontend/torch/ops.py", line 286, in check_if_number_of_inputs_more_than_min_expected
    raise ValueError(
ValueError: node max_pool2d (max_pool2d) got 2 input(s), expected minimum 3 inputs

I guess this is missing some default argument about maxpool op, so I review the source code about coremltools using pytorch export API ,I found like convolution op, there is work fine ,because this op has adapt for export API, like below

def _parse_positional_args(context, node) -> Tuple[Var]:
        inputs = _get_inputs(
            context,
            node,
            min_expected={
                TorchFrontend.TORCHSCRIPT: 7,
                TorchFrontend.TORCHEXPORT: 2,
                TorchFrontend.EXECUTORCH: 2,
            },
        )
        nargs = len(inputs)

        x = inputs[0]
        # PyTorch and MIL has same weight layout
        # Conv: [Cout, Cin, *D]
        # ConvTranspose: [Cin, Cout, *D]
        weight = inputs[1]
        x, weight = promote_input_dtypes([x, weight])

        bias = inputs[2] if nargs > 2 else None
        stride = inputs[3] if nargs > 3 else 1
        padding = inputs[4] if nargs > 4 else default_torch_padding

        if node.kind in ("_convolution", "convolution"):
            dilation = inputs[5] if nargs > 5 else 1
            transposed = inputs[6].val if nargs > 6 else False
            out_padding = inputs[7] if nargs > 7 else 0
            groups = inputs[8] if nargs > 8 else 1
        elif re.match(r"conv_transpose[123]d.*", node.kind):
            out_padding = inputs[5] if nargs > 5 else 0
            groups = inputs[6] if nargs > 6 else 1
            dilation = inputs[7] if nargs > 7 else 1
            transposed = True
        else:
            dilation = inputs[5] if nargs > 5 else 1
            groups = inputs[6] if nargs > 6 else 1
            transposed = False
            out_padding = 0

        return x, weight, bias, stride, padding, dilation, groups, transposed, out_padding

    def _parse_keyword_args(
        context, node, bias, stride, padding, dilation, groups, out_padding
    ) -> Tuple[Var]:
        # Only torch.export may have kwargs
        if context.frontend not in TORCH_EXPORT_BASED_FRONTENDS:
            return bias, stride, padding, dilation, groups, out_padding

        bias = _get_kwinputs(context, node, "bias", default=[bias])[0]
        stride = _get_kwinputs(context, node, "stride", default=[stride])[0]
        padding = _get_kwinputs(context, node, "padding", default=[padding])[0]
        dilation = _get_kwinputs(context, node, "dilation", default=[dilation])[0]
        groups = _get_kwinputs(context, node, "groups", default=[groups])[0]
        out_padding = _get_kwinputs(context, node, "out_padding", default=[out_padding])[0]

        return bias, stride, padding, dilation, groups, out_padding

or there is another way like avg_pool this code also adapt for TORCHEXPORT, so I gusses coremtools doesn't adapt for maxpool op. My request is ask feature for this op. Thx.

Dec 10 '24 02:12 lgyStoic

Thanks @lgyStoic for finding the failing case! Maxpool and avgpool were passing our existing tests so we didn't realize they have different signature in torch.export from torch.jit.trace

I would recommend to adapt like convolution. I guess someday avgpool would also need to be adapted

Dec 12 '24 01:12 YifanShenSZ

@YifanShenSZ Is there any timeline or official version will coremltools fix this bug? or If I can contribution for is feature?

Dec 12 '24 01:12 lgyStoic

We would greatly appreciate if you can contribute for this feature 🙏

Dec 12 '24 01:12 YifanShenSZ

Is there an update on this issue? Maybe a workaround while it's being fixed?

Dec 30 '24 11:12 aleksey-chugaev

Is there an update on this issue? Maybe a workaround while it's being fixed?

@aleksey-chugaev no time for this issue currently, maybe can export using torch jit trace to workaround..

Jan 02 '25 03:01 lgyStoic

What is your torch version used? I tried to reproduce locally on tot coremltools, with torch 2.4 and 2.5, but both work well

import numpy as np

import torch
import torch.nn as nn

import coremltools as ct


class SimpleCNN(nn.Module):
    def __init__(self):
        super(SimpleCNN, self).__init__()
        self.conv1 = nn.Conv2d(1, 16, kernel_size=3)
        self.pool1 = nn.MaxPool2d(2, stride=2)
        self.flatten = nn.Flatten()
        self.fc1 = nn.Linear(16 * 13 * 13, 10)

    def forward(self, x):
        x = self.conv1(x)
        x = self.pool1(x)
        x = self.flatten(x)
        x = self.fc1(x)
        return x


torch_model = SimpleCNN()

example_inputs = (torch.rand(1, 1, 28, 28),)
exported_program = torch.export.export(torch_model, example_inputs)
print(exported_program)


mlmodel = ct.convert(
    exported_program,
    outputs=[ct.TensorType(name="output")],
    compute_units=ct.ComputeUnit.CPU_ONLY,
)


torch_output = torch_model(example_inputs[0]).detach().numpy()
coreml_output = mlmodel.predict({"x": example_inputs[0].detach().numpy()})["output"]
np.testing.assert_allclose(coreml_output, torch_output, rtol=1e-3, atol=1e-3)

Jan 03 '25 22:01 YifanShenSZ

I'm using torch=2.4.1 coremltools=8.1 python=3.9

Jan 05 '25 21:01 aleksey-chugaev

I'm using torch=2.4.1 coremltools=8.1 python=3.9

Tried the same config

(py39-torch24) local@Yifans-M2-MacBook-Pro Test % pip list |grep coremltools
coremltools       8.1
(py39-torch24) local@Yifans-M2-MacBook-Pro Test % pip list |grep torch      
torch             2.4.1

The reproduce above works well on my local

There are 3 ways we can approach:

@lgyStoic implement the torch.export maxpool adaptation and test on your case
Send me another reproduce and see if that reproduces on my local
I implement the torch.export maxpool adaptation, but without reproduce, so it may still fail on your case even with my change

Jan 06 '25 00:01 YifanShenSZ