
Support for `op_type` "Resize"

Open jnnks opened this issue 2 years ago • 17 comments

WIP Implementation for Resize Operator

fixes https://github.com/elixir-nx/axon_onnx/issues/34

jnnks avatar Jun 27 '22 10:06 jnnks

@seanmor5 Is this the right place for the method? :D

jnnks avatar Jun 27 '22 10:06 jnnks

@jnnks Yes!

seanmor5 avatar Jun 27 '22 10:06 seanmor5

Regarding the tests: where can I get the backend-test-tools? `mix test` fails because `System.cmd("backend-test-tools", ["generate-data"], [])` cannot find the executable.

jnnks avatar Jul 01 '22 17:07 jnnks

I believe it comes with the ONNX runtime Python package!

seanmor5 avatar Jul 01 '22 20:07 seanmor5

I am mostly working on complying with the unit tests now. A couple of questions and comments:

  • roi, scales, sizes are input layers (%Axon{}); how can I get the values of interest from them, like above with params?
  • roi = region of interest? That means we have to crop the tensor, right? Is there a function I can use for that?
    • Looks like Nx.slice may be helpful here (see the sketch after this list)
  • If I give all the values the first test (test_resize_downsample_scales_cubic) is expecting, I get an error I don't understand: ** (ArgumentError) cannot broadcast tensor of dimensions {1, 1, 3, 3} to {1, 1, 8, 8}
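For what it's worth, a minimal sketch of cropping with `Nx.slice/3` (the indices here are made up; in the real operator they would presumably be derived from `roi` and the input shape):

```elixir
# Hypothetical example: crop a {1, 1, 4, 4} tensor down to a 2x2 spatial
# region. Batch and channel axes are kept whole; the start indices and
# lengths would come from `roi` in the actual operator.
t = Nx.iota({1, 1, 4, 4}, type: :f32)
cropped = Nx.slice(t, [0, 0, 1, 1], [1, 1, 2, 2])
# cropped has shape {1, 1, 2, 2}
```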

Seems like this will take a bit longer than expected :D

jnnks avatar Jul 08 '22 15:07 jnnks

Is there something I could do to help here? I would like to use this PR, but it seems "frozen". Have there been any major changes in the approach or the library?

NotQuiteLagom avatar Jul 27 '22 21:07 NotQuiteLagom

@jnnks @NotQuiteLagom Sorry I let this one fall off. There is an open issue upstream to fix Axon's resize operator. I will add it later, as I have a working implementation from another library, and then I will come back to this PR for review :)

seanmor5 avatar Jul 27 '22 22:07 seanmor5

> Note that `constant!` forces the input you're giving it to be a constant, so you can always convert it into a flat list

This does not work, unfortunately: `scales` and `sizes` are not constant in the unit tests:

** (ArgumentError) unable to build model from ONNX graph, expected value scales to be constant value, but was :input

And when using the `input!` macro, the call to `to_flat_list` fails, as `scales` is an `%Axon{}` struct, not a tensor or a number:

 ** (ArgumentError) expected a %Nx.Tensor{} or a number, got: 
     -------------------------------------------------------------------------------
                                          Model
     ===============================================================================
      Layer              Shape   Policy              Parameters   Parameters Memory
     ===============================================================================
      scales ( input )   {4}     p=f32 c=f32 o=f32   0            0 bytes
     -------------------------------------------------------------------------------
     Total Parameters: 0
     Total Parameters Memory: 0 bytes
     Inputs: %{"scales" => {4}}

jnnks avatar Jul 31 '22 08:07 jnnks

@seanmor5 I need your wisdom :)

jnnks avatar Aug 01 '22 17:08 jnnks

The values of scales and sizes are not known during deserialization; they only become available during inference. So I am going for an Axon.layer with Axon.Layers.resize inside the lambda function. This does not work either, since I don't know how to get the output shape as a tuple instead of a tensor (rough sketch below).
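A rough sketch of what I mean (the exact `Axon.layer` signature here is an assumption, since it has changed across Axon versions):

```elixir
# Sketch only: wrap Axon.Layers.resize in a custom layer. Inside the
# lambda, `scales` is an %Nx.Tensor{} rather than a compile-time value,
# so there is no way to turn it into the static :size tuple that
# Axon.Layers.resize wants (option name assumed).
resize =
  Axon.layer(
    fn input, scales, _opts ->
      {_n, _c, h, w} = Nx.shape(input)
      # would need something like {round(h * s_h), round(w * s_w)},
      # but s_h and s_w are only known at inference time
      Axon.Layers.resize(input, size: {h, w})
    end,
    [input, scales],
    name: "resize"
  )
```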

(I also removed a lot of unimportant features that will be added again later, when the function actually works)

jnnks avatar Aug 10 '22 09:08 jnnks

@jnnks If the tests depend on non-constant inputs and raise, then it's okay to ignore them in that case. For a lot of functions we cannot implement the tests because of this dependency on constants; a better test is probably an integration test involving a model that uses a Resize.

seanmor5 avatar Aug 10 '22 23:08 seanmor5

It seems counterintuitive to me to use an industry standard and then deviate from it, especially since the standard has well-defined unit tests.

I'll convert the predefined ONNX models to use constant inputs so we can test against those.

jnnks avatar Aug 11 '22 14:08 jnnks

We don't really have a choice: Nx does not support dynamic input shapes, so we need to enforce this constraint. In most cases it ends up being fine; there are only a few edge cases in practice where dynamic shapes are needed, and for the most part they can be mitigated or refactored to meet a static-shape requirement.

seanmor5 avatar Aug 11 '22 17:08 seanmor5

Looks like `Axon.resize` does not deliver the results expected by the ONNX test case (test_resize_downsample_scales_nearest). I rebuilt one of the ONNX model files to use constant inputs and it seems to work fine; however, the test case fails because of mismatched outputs:

Input

#Nx.Tensor<
  f32[1][1][2][4]
  [
    [
      [
        [1.0, 2.0, 3.0, 4.0],
        [5.0, 6.0, 7.0, 8.0]
      ]
    ]
  ]
>  

Scales

#Nx.Tensor<
  f32[4]
  [1.0, 1.0, 0.6000000238418579, 0.6000000238418579]
>

Output

     ** (RuntimeError) expected #Nx.Tensor<
       f32[1][1][1][2]
       [
         [
           [
             [6.0, 8.0]
           ]
         ]
       ]
     > to be within tolerance of #Nx.Tensor<
       f32[1][1][1][2]
       [
         [
           [
             [1.0, 3.0]
           ]
         ]
       ]
     >

...[1.0, 3.0]... is the output expected by ONNX.

Since the scales are both 0.6, I would argue that ONNX is expecting the wrong output. What are your thoughts on this, @seanmor5?

jnnks avatar Sep 19 '22 13:09 jnnks

@jnnks I think you might be right, but is there a parameter we're not handling that might affect the behavior here?

seanmor5 avatar Sep 20 '22 23:09 seanmor5

Not as far as I can see, but I will check the spec again for implicit defaults that may be causing it (a sketch of what the spec's defaults imply follows). Maybe it's a recurring pattern in the other tests as well.
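For reference, the ONNX spec's defaults for Resize are coordinate_transformation_mode "half_pixel" and nearest_mode "round_prefer_floor"; a quick back-of-the-envelope sketch of that mapping (illustration only, not from the PR) does reproduce the [1.0, 3.0] that ONNX expects:

```elixir
# half_pixel mapping: x_orig = (x_out + 0.5) / scale - 0.5
# round_prefer_floor: ties (fractional part exactly 0.5) round down
scale = 0.6

nearest = fn x_out ->
  x_orig = (x_out + 0.5) / scale - 0.5
  if x_orig - floor(x_orig) > 0.5, do: ceil(x_orig), else: floor(x_orig)
end

# output width is floor(4 * 0.6) = 2, and the two columns map back to
# input indices [0, 2]; row 0 maps back to 0, so from
# [[1, 2, 3, 4], [5, 6, 7, 8]] the output would be [[1.0, 3.0]]
Enum.map(0..1, nearest)
#=> [0, 2]
```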

jnnks avatar Sep 28 '22 14:09 jnnks

We have two problems:

  • ONNX expects different values from what Axon.Layers.resize returns [1]
  • ONNX uses the scales and sizes as input values instead of constants in the graph [2]

To make the operator testable we need to do the following:

  • decode the test model and test data
    • can be done using the protobuf compiler: protoc --decode=onnx.ModelProto onnx.proto < file_path > out_path.txt (requires https://github.com/onnx/onnx/blob/main/onnx/onnx.proto in the same directory)
  • decode the test dataset
    • can be done with OnnxTestHelper.pb_to_tensor(pb_path)
  • inject the scales/sizes into the decoded model text file [3] (see the sketch after [3] below)
  • re-encode the model file, again with protoc (--encode=onnx.ModelProto)
  • run Axon.Layers.resize with the test data to get the expected resize result
  • re-encode the test data, not sure how yet

[1] Different return values
input: #Nx.Tensor<
  f32[1][1][2][4]
  [
    [
      [
        [1.0, 2.0, 3.0, 4.0],
        [5.0, 6.0, 7.0, 8.0]
      ]
    ]
  ]
>

resize with: {"scales",
 #Nx.Tensor<
   f32[4]
   [1.0, 1.0, 0.6000000238418579, 0.6000000238418579]
 >}

expected_output: #Nx.Tensor<
  f32[1][1][1][2]
  [
    [
      [
        [2.6666665077209473, 4.3333330154418945]
      ]
    ]
  ]
>

actual_output: #Nx.Tensor<
  f32[1][1][1][2]
  [
    [
      [
        [4.199999809265137, 5.466666221618652]
      ]
    ]
  ]
>
[2] input values instead of constants
(ArgumentError) unable to build model from ONNX graph, expected value scales to be constant value, but was :input
[3] Nx.Tensor to Protobuf
input:
#Nx.Tensor<
   f32[4]
   [1.0, 1.0, 0.6, 0.6]
 >
output:
  initializer {
    dims: 4
    data_type: 1
    float_data: 1
    float_data: 1
    float_data: 0.6
    float_data: 0.6
    name: "scales"
  }

See: https://github.com/onnx/onnx/blob/4b2d50334914621835cc1e8dadd4fe82b6b9876c/onnx/onnx-ml.proto#L484
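A possible starting point for [3] (hypothetical helper, not part of the PR): render an f32 tensor in the textual TensorProto form above, so it can be spliced into a protoc --decode dump and re-encoded with --encode:

```elixir
defmodule ResizeTestHelper do
  # Hypothetical sketch: turn an f32 Nx tensor into the text-format
  # `initializer { ... }` block shown in [3]. data_type: 1 is FLOAT
  # in the TensorProto enum.
  def tensor_to_initializer_text(tensor, name) do
    dims = tensor |> Nx.shape() |> Tuple.to_list()
    floats = Nx.to_flat_list(tensor)

    lines =
      ["initializer {"] ++
        Enum.map(dims, &"  dims: #{&1}") ++
        ["  data_type: 1"] ++
        Enum.map(floats, &"  float_data: #{&1}") ++
        ["  name: \"#{name}\"", "}"]

    Enum.join(lines, "\n")
  end
end

# ResizeTestHelper.tensor_to_initializer_text(
#   Nx.tensor([1.0, 1.0, 0.6, 0.6]), "scales")
```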

jnnks avatar Mar 26 '23 08:03 jnnks