
Issue converting Slice op when it comes after Expand op

Open · dkossnick-figma opened this issue on Dec 16, 2019 · 1 comment

🐞 Describe the bug

When converting a model that has a Slice op immediately after an Expand op, the conversion crashes with a KeyError.

Trace

Requirement already satisfied: onnx==1.6.0 in /usr/local/lib/python3.6/dist-packages (1.6.0)
Requirement already satisfied: typing-extensions>=3.6.2.1 in /usr/local/lib/python3.6/dist-packages (from onnx==1.6.0) (3.6.6)
Requirement already satisfied: six in /usr/local/lib/python3.6/dist-packages (from onnx==1.6.0) (1.12.0)
Requirement already satisfied: numpy in /usr/local/lib/python3.6/dist-packages (from onnx==1.6.0) (1.17.4)
Requirement already satisfied: protobuf in /usr/local/lib/python3.6/dist-packages (from onnx==1.6.0) (3.10.0)
Requirement already satisfied: setuptools in /usr/local/lib/python3.6/dist-packages (from protobuf->onnx==1.6.0) (42.0.2)
Requirement already satisfied: onnxruntime in /usr/local/lib/python3.6/dist-packages (1.0.0)
Requirement already satisfied: coremltools in /usr/local/lib/python3.6/dist-packages (3.1)
Requirement already satisfied: protobuf>=3.1.0 in /usr/local/lib/python3.6/dist-packages (from coremltools) (3.10.0)
Requirement already satisfied: six>=1.10.0 in /usr/local/lib/python3.6/dist-packages (from coremltools) (1.12.0)
Requirement already satisfied: numpy>=1.14.5 in /usr/local/lib/python3.6/dist-packages (from coremltools) (1.17.4)
Requirement already satisfied: setuptools in /usr/local/lib/python3.6/dist-packages (from protobuf>=3.1.0->coremltools) (42.0.2)
Requirement already satisfied: onnx_coreml==1.1.0 in /usr/local/lib/python3.6/dist-packages (1.1)
Requirement already satisfied: typing-extensions>=3.6.2.1 in /usr/local/lib/python3.6/dist-packages (from onnx_coreml==1.1.0) (3.6.6)
Requirement already satisfied: click in /usr/local/lib/python3.6/dist-packages (from onnx_coreml==1.1.0) (7.0)
Requirement already satisfied: sympy in /usr/local/lib/python3.6/dist-packages (from onnx_coreml==1.1.0) (1.1.1)
Requirement already satisfied: onnx>=1.5.0 in /usr/local/lib/python3.6/dist-packages (from onnx_coreml==1.1.0) (1.6.0)
Requirement already satisfied: numpy in /usr/local/lib/python3.6/dist-packages (from onnx_coreml==1.1.0) (1.17.4)
Requirement already satisfied: coremltools>=3.0 in /usr/local/lib/python3.6/dist-packages (from onnx_coreml==1.1.0) (3.1)
Requirement already satisfied: typing>=3.6.4 in /usr/local/lib/python3.6/dist-packages (from onnx_coreml==1.1.0) (3.6.6)
Requirement already satisfied: mpmath>=0.19 in /usr/local/lib/python3.6/dist-packages (from sympy->onnx_coreml==1.1.0) (1.1.0)
Requirement already satisfied: protobuf in /usr/local/lib/python3.6/dist-packages (from onnx>=1.5.0->onnx_coreml==1.1.0) (3.10.0)
Requirement already satisfied: six in /usr/local/lib/python3.6/dist-packages (from onnx>=1.5.0->onnx_coreml==1.1.0) (1.12.0)
Requirement already satisfied: setuptools in /usr/local/lib/python3.6/dist-packages (from protobuf->onnx>=1.5.0->onnx_coreml==1.1.0) (42.0.2)
The default version of TensorFlow in Colab will soon switch to TensorFlow 2.x.
We recommend you upgrade now or ensure your notebook will continue to use TensorFlow 1.x via the %tensorflow_version 1.x magic: more info.

WARNING:root:TensorFlow version 1.15.0 detected. Last version known to be fully compatible is 1.14.0 .
WARNING:root:Keras version 2.2.5 detected. Last version known to be fully compatible of Keras is 2.2.4 .
Torch version: 1.3.1
Onnx version: 1.6.0
onnxruntime version: 1.0.0
Coremltools version: 3.1
Converting model to ONNX...
graph(%input : Float(1, 6, 4, 4)):
  %1 : Long() = onnx::Constant[value={1}](), scope: TestModel
  %2 : Tensor = onnx::Shape(%input), scope: TestModel
  %3 : Long() = onnx::Gather[axis=0](%2, %1), scope: TestModel # <ipython-input-1-5d4af5351fc2>:25:0
  %4 : Long() = onnx::Constant[value={2}](), scope: TestModel
  %5 : Tensor = onnx::Shape(%input), scope: TestModel
  %6 : Long() = onnx::Gather[axis=0](%5, %4), scope: TestModel # <ipython-input-1-5d4af5351fc2>:25:0
  %7 : Long() = onnx::Constant[value={3}](), scope: TestModel
  %8 : Tensor = onnx::Shape(%input), scope: TestModel
  %9 : Long() = onnx::Gather[axis=0](%8, %7), scope: TestModel # <ipython-input-1-5d4af5351fc2>:25:0
  %10 : Long() = onnx::Constant[value={3}](), scope: TestModel
  %11 : Tensor = onnx::Unsqueeze[axes=[0]](%10)
  %12 : Tensor = onnx::Unsqueeze[axes=[0]](%3)
  %13 : Tensor = onnx::Unsqueeze[axes=[0]](%6)
  %14 : Tensor = onnx::Unsqueeze[axes=[0]](%9)
  %15 : Tensor = onnx::Concat[axis=0](%11, %12, %13, %14)
  %16 : Float(3, 6, 4, 4) = onnx::Expand(%input, %15), scope: TestModel # <ipython-input-1-5d4af5351fc2>:25:0
  %17 : Tensor = onnx::Constant[value={1}](), scope: TestModel
  %18 : Tensor = onnx::Constant[value={0}](), scope: TestModel
  %19 : Tensor = onnx::Constant[value={3}](), scope: TestModel
  %20 : Float(3, 3, 4, 4) = onnx::Slice(%16, %18, %19, %17), scope: TestModel # <ipython-input-1-5d4af5351fc2>:27:0
  %21 : Tensor = onnx::Constant[value={1}](), scope: TestModel
  %22 : Tensor = onnx::Constant[value={3}](), scope: TestModel
  %23 : Tensor = onnx::Constant[value={6}](), scope: TestModel
  %24 : Float(3, 3, 4, 4) = onnx::Slice(%16, %22, %23, %21), scope: TestModel # <ipython-input-1-5d4af5351fc2>:28:0
  %25 : Float(3, 3, 4, 4) = onnx::Mul(%20, %24), scope: TestModel # <ipython-input-1-5d4af5351fc2>:29:0
  %26 : Tensor = onnx::Constant[value={0}](), scope: TestModel
  %27 : Tensor = onnx::Constant[value={0}](), scope: TestModel
  %28 : Tensor = onnx::Constant[value={1}](), scope: TestModel
  %output : Float(1, 3, 4, 4) = onnx::Slice(%25, %27, %28, %26), scope: TestModel # <ipython-input-1-5d4af5351fc2>:30:0
  return (%output)

1/5: Converting Node Type Expand
2/5: Converting Node Type Slice
/usr/local/lib/python3.6/dist-packages/ipykernel_launcher.py:26: TracerWarning: Converting a tensor to a Python integer might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
/usr/local/lib/python3.6/dist-packages/ipykernel_launcher.py:28: TracerWarning: Converting a tensor to a Python integer might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
<ipython-input-1-5d4af5351fc2> in <module>()
     48 onnx.checker.check_model(onnx_model)
     49 ort_session = onnxruntime.InferenceSession(onnx_name)
---> 50 mlmodel = convert(onnx_model, minimum_ios_deployment_target="13", image_input_names=[], image_output_names=["output"])
     51 print("All done")

2 frames
/usr/local/lib/python3.6/dist-packages/onnx_coreml/_operators_nd.py in _convert_slice(builder, node, graph, err)
   1949        return _convert_slice_ir4v9(builder, node, graph, err)
   1950 
-> 1951     data_shape = graph.shape_dict[node.inputs[0]]
   1952     len_of_data = len(data_shape)
   1953     begin_masks = [True] * len_of_data

KeyError: '16'
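
For context on the KeyError: '16' is the output of the Expand node in the graph dump above. A quick diagnostic (my addition, not part of the original report) is to run ONNX shape inference directly and print which intermediate tensors get a concrete shape; if the Expand output is missing or has unknown dims, the graph.shape_dict lookup in _convert_slice fails exactly as the traceback shows:

# Diagnostic sketch: list the shapes ONNX shape inference can resolve.
# If '16' (the Expand output) is absent or has unknown dims, the
# shape_dict lookup in _convert_slice raises the KeyError above.
import onnx
from onnx import shape_inference

model = onnx.load("slice_model.onnx")
inferred = shape_inference.infer_shapes(model)
for vi in inferred.graph.value_info:
    dims = [d.dim_value if d.HasField("dim_value") else "?"
            for d in vi.type.tensor_type.shape.dim]
    print(vi.name, dims)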

To Reproduce

I made a simple Google Colab notebook to reproduce this issue on a toy model: https://colab.research.google.com/drive/1QCQjskJnxBY7yKCVEB04ReXAwgU7zYC0#scrollTo=_aVyHsbZho1Y. Also pasted below:

%pip install onnx==1.6.0
%pip install onnxruntime
%pip install coremltools
%pip install onnx_coreml==1.1.0
import onnx
import onnxruntime
import torch.onnx
import numpy as np
import torch.nn.functional as F
import torch.nn as nn
import coremltools
import onnx_coreml
from onnx_coreml import convert

print(f"Torch version: {torch.__version__}")
print(f"Onnx version: {onnx.__version__}")
print(f"onnxruntime version: {onnxruntime.__version__}")
print(f"Coremltools version: {coremltools.__version__}")

class TestModel(nn.Module):
    def __init__(self):
        super().__init__()
    
    def forward(self, x):
        # expand() exports as onnx::Expand; its output shape is what gets lost
        x = x.expand(3, x.shape[1], x.shape[2], x.shape[3])
        split = int(x.shape[1]) // 2
        # narrow() exports as onnx::Slice on the Expand output; conversion fails here
        gamma = x.narrow(1, 0, split)
        beta = x.narrow(1, split, int(x.shape[1]) - split)
        combined = gamma * beta
        return combined.narrow(0, 0, 1)
    
model = TestModel()
data = torch.randn([1, 6, 4, 4])
output = model(data)
print("Converting model to ONNX...")
onnx_name = "slice_model.onnx"
torch.onnx.export(model,               # model being run
                  data,                         # model input (or a tuple for multiple inputs)
                  onnx_name,   # where to save the model (can be a file or file-like object)
                  export_params=True,        # store the trained parameter weights inside the model file
                  opset_version=11,          # the ONNX version to export the model to
                  do_constant_folding=True,  # whether to execute constant folding for optimization
                  input_names = ["input"],   # the model's input names
                  output_names = ["output"], # the model's output names
                  verbose=True
                  )
onnx_model = onnx.load(onnx_name)
onnx.checker.check_model(onnx_model)
ort_session = onnxruntime.InferenceSession(onnx_name)
mlmodel = convert(onnx_model, minimum_ios_deployment_target="13", image_input_names=[], image_output_names=["output"])
print("All done")

Here is a copy of the failing onnx model: https://drive.google.com/file/d/1dbbWEBUGC0UIW1AjPP9GzDIIQcEpBF5S/view?usp=sharing.


System environment (please complete the following information):

  • coremltools version: 3.1
  • onnx-coreml version: 1.1.0
  • OS: Linux
  • How you install python: conda (locally, in a more complex env than Colab)
  • python version: 3.6

Additional context

Using repeat instead of expand in PyTorch worked (it turns into a Tile op in ONNX); I'm guessing this is thanks to https://github.com/onnx/onnx-coreml/pull/495. A sketch of the workaround follows below.

Besides PyTorch's narrow method, I also tried slicing with the syntax x[:, :split, :, :], which hits the same problem.
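
As mentioned above, swapping expand for repeat avoids the crash because PyTorch exports repeat as a Tile op. A minimal sketch of that workaround applied to the toy model's forward method (my adaptation of the repro above; since dim 0 of the input is 1, repeat(3, 1, 1, 1) produces the same values as the expand() call):

    # Workaround sketch: repeat() exports as onnx::Tile, which onnx-coreml
    # converts, instead of onnx::Expand.
    def forward(self, x):
        x = x.repeat(3, 1, 1, 1)  # matches the expand() above because dim 0 is 1
        split = int(x.shape[1]) // 2
        gamma = x.narrow(1, 0, split)
        beta = x.narrow(1, split, int(x.shape[1]) - split)
        combined = gamma * beta
        return combined.narrow(0, 0, 1)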

dkossnick-figma · Dec 16 '19

@kossnick this is due to the shape being unknown when an Expand op is present; ONNX shape inference cannot resolve it. I have PR #528 to print the error message correctly.

bhushan23 · Jan 03 '20
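
For anyone hitting this before #528 lands: per the comment above, that PR improves the error message rather than adding support for the op. A hypothetical guard of that kind inside _convert_slice (using the builder, node, graph, err arguments visible in the traceback above; this is an illustration, not the actual PR diff) might look like:

# Hypothetical guard, not the actual #528 diff: replace the bare KeyError
# with a readable conversion error when no shape was inferred for the input.
data_shape = graph.shape_dict.get(node.inputs[0])
if data_shape is None:
    return err.unsupported_op_configuration(
        builder, node, graph,
        "Shape of Slice input is unknown (e.g., the output of an Expand "
        "node that ONNX shape inference could not resolve)."
    )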