
ValueError: # of input channels is44 not divisible by groups 1

sniklaus opened this issue • 8 comments

Description

coremltools.convert will unnecessarily raise an exception if C_in is a Symbol here.

Stack Trace

Traceback (most recent call last):
  File "/home/sniklaus/Downloads/bidir_conversion.py", line 436, in <module>
    objModel = coremltools.convert(
               ^^^^^^^^^^^^^^^^^^^^
  File "coremltools/converters/_converters_entry.py", line 581, in convert
    mlmodel = mil_convert(
              ^^^^^^^^^^^^
  File "coremltools/converters/mil/converter.py", line 188, in mil_convert
    return _mil_convert(model, convert_from, convert_to, ConverterRegistry, MLModel, compute_units, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "coremltools/converters/mil/converter.py", line 212, in _mil_convert
    proto, mil_program = mil_convert_to_proto(
                         ^^^^^^^^^^^^^^^^^^^^^
  File "coremltools/converters/mil/converter.py", line 288, in mil_convert_to_proto
    prog = frontend_converter(model, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "coremltools/converters/mil/converter.py", line 108, in __call__
    return load(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "coremltools/converters/mil/frontend/torch/load.py", line 82, in load
    return _perform_torch_convert(converter, debug)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "coremltools/converters/mil/frontend/torch/load.py", line 116, in _perform_torch_convert
    prog = converter.convert()
           ^^^^^^^^^^^^^^^^^^^
  File "coremltools/converters/mil/frontend/torch/converter.py", line 581, in convert
    convert_nodes(self.context, self.graph)
  File "coremltools/converters/mil/frontend/torch/ops.py", line 86, in convert_nodes
    raise e     # re-raise exception
    ^^^^^^^
  File "coremltools/converters/mil/frontend/torch/ops.py", line 81, in convert_nodes
    convert_single_node(context, node)
  File "coremltools/converters/mil/frontend/torch/ops.py", line 134, in convert_single_node
    add_op(context, node)
  File "coremltools/converters/mil/frontend/torch/ops.py", line 1088, in _convolution
    conv = mb.conv(**kwargs)
           ^^^^^^^^^^^^^^^^^
  File "coremltools/converters/mil/mil/ops/registry.py", line 182, in add_op
    return cls._add_op(op_cls_to_add, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "coremltools/converters/mil/mil/builder.py", line 202, in _add_op
    new_op.type_value_inference()
  File "coremltools/converters/mil/mil/operation.py", line 258, in type_value_inference
    output_types = self.type_inference()
                   ^^^^^^^^^^^^^^^^^^^^^
  File "coremltools/converters/mil/mil/ops/defs/iOS15/conv.py", line 165, in type_inference
    raise ValueError(msg.format(C_in, groups))
ValueError: # of input channels is44 not divisible by groups 1

To Reproduce

I am afraid that I can't share the model details here since it is proprietary; my apologies.

System

  • coremltools version: 7.2
  • OS (e.g. MacOS version or Linux type): Debian 13
  • PyTorch version: 2.1.2

sniklaus avatar Sep 11 '24 16:09 sniklaus

Why is this unnecessary? Can you construct a minimal example to reproduce this issue?

Also 7.2 is several months old. Please try with 8.0b2.

TobyRoseman avatar Sep 11 '24 17:09 TobyRoseman

Notice the stack trace.

ValueError: # of input channels is44 not divisible by groups 1

44 is perfectly divisible by 1, but Symbol(44) % groups != 0 will fail because C_in is a symbol.

Just tried 8.0b2 and got the same error.

sniklaus avatar Sep 11 '24 17:09 sniklaus

is44 is the name of the symbol. It doesn't mean 44.

Are you using flexible shape when you call ct.convert?

TobyRoseman avatar Sep 11 '24 17:09 TobyRoseman

Ah yes, I forgot to mention, this happens when switching the input to the model from

coremltools.TensorType(name='input', shape=(1, 3, 1024, 1024))

to

coremltools.TensorType(name='input', shape=(coremltools.RangeDim(lower_bound=1, upper_bound=8, default=1), 3, 1024, 1024))

and it exports fine without the flexible shape.

is44 is the name of the symbol. It doesn't mean 44.

Regardless of what it means, Symbol('is44') % 1 will yield Mod(is44, 1), which is not 0, hence erroneously triggering the exception (at least in my case it is erroneous; any channel count is divisible by a group size of one).
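The symbolic-modulo behavior described above can be reproduced directly in sympy, which coremltools uses for symbolic shapes. A minimal sketch (the symbol name `is44` mirrors the one from the stack trace):

```python
import sympy

# A symbol with no assumptions, like a flexible shape dimension.
c_in = sympy.Symbol("is44")

# Mod does not simplify for an unconstrained symbol, so the expression
# is structurally non-zero and a naive `c_in % groups != 0` check fires.
print(c_in % 1)         # Mod(is44, 1)
print((c_in % 1) != 0)  # True

# With an integer assumption, sympy can prove divisibility by 1.
n = sympy.Symbol("n", integer=True)
print(n % 1)            # 0
```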

sniklaus avatar Sep 11 '24 18:09 sniklaus

is44 just means it's the 44th symbol created during the conversion. It doesn't tell you anything about its value. In fact, the value is not known at conversion time.

Have you verified that your traced PyTorch model actually works with all input shapes in that range?

It looks like you're trying to make your converted model accept batched input. If so, have you considered not using flexible shapes and just using the built-in batch predictions? For details, see the example in the API docs.

TobyRoseman avatar Sep 11 '24 18:09 TobyRoseman

is44 just means it's the 44th symbol created during the conversion. It doesn't tell you anything about its value. In fact, the value is not known at conversion time.

Yet, it seems like coremltools wants to check its value during conversion.

Have you verified that your traced PyTorch model actually works with all input shapes in that range?

The following worked just fine:

for i in tqdm.tqdm(range(1, 9)):
    y = traced_model(torch.rand(i, 3, 1024, 1024))

It looks like you're trying to make your converted model accept batched input.

Correct, though the requirement comes from an internal product team, and I have forwarded your feedback to them. On that note, I also commented out the two checks in question and the model exported fine. Still waiting to hear back from the product team on whether it actually works, though.
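For reference, a hedged sketch of what a symbol-aware version of the divisibility check could look like. The function name is hypothetical and this is not the actual coremltools code (which lives in conv.py's type_inference); it only illustrates deferring the check when C_in is symbolic:

```python
import sympy

def check_channels_divisible(c_in, groups):
    """Hypothetical sketch: skip the divisibility check when C_in is
    symbolic, since its value is unknown at conversion time."""
    if isinstance(c_in, sympy.Basic) and c_in.free_symbols:
        return  # symbolic dimension: defer the check to runtime
    if int(c_in) % groups != 0:
        raise ValueError(
            f"# of input channels {c_in} not divisible by groups {groups}"
        )

check_channels_divisible(sympy.Symbol("is44"), 1)  # symbolic: no error
check_channels_divisible(44, 4)                    # concrete: no error
```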

sniklaus avatar Sep 11 '24 19:09 sniklaus

I also commented out the two checks in question and the model exported fine.

Are you also able to get correct predictions from that exported Core ML model?

TobyRoseman avatar Sep 11 '24 21:09 TobyRoseman

I was wondering about that as well and am waiting for the internal product team to investigate and report back.

sniklaus avatar Sep 11 '24 21:09 sniklaus