
Transformations should return whether a model was modified or not

Open heborras opened this issue 3 years ago • 1 comment

Details

Currently, applying a transformation to a model looks like this:

model = model.transform(MoveChanLastUpstream())

The executed transformation, in this case MoveChanLastUpstream, may or may not change the model. In some cases it would be good to know whether a transformation had any effect on the model, for example in the channelsLast transformation, where I need to check whether the transformation has converged in order to stop iterating over the model. Currently this is solved by converting the ONNX model to a string at the beginning of the loop: https://github.com/fastmachinelearning/qonnx/blob/feature/qonnx_chan_last/src/qonnx/transformation/channelsLast.py#L46 and then checking at the end whether anything changed: https://github.com/fastmachinelearning/qonnx/blob/feature/qonnx_chan_last/src/qonnx/transformation/channelsLast.py#L69 While this works fine, it may incur large compute and memory penalties for larger models.
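The convergence check described above can be sketched generically: serialize the model before the pass, run the pass, and stop once the serialized form no longer changes. This is a minimal illustration of the pattern, not the actual channelsLast code; the `serialize` parameter stands in for the ONNX-model-to-string conversion.

```python
def run_until_converged(model, transformation, serialize):
    """Repeatedly apply `transformation` until the model stops changing.

    `serialize` turns the model into a comparable string, mirroring the
    model-to-string trick used in the channelsLast transformation.
    """
    while True:
        before = serialize(model)
        model = transformation(model)
        if serialize(model) == before:
            # No change during this iteration: the pass has converged.
            return model
```

The cost concern is visible here: every loop iteration pays for two full serializations of the model, even when nothing changed.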

New behavior

When the ModelWrapper class is ported over to QONNX, it would be nice if the behavior of the transform member function could change. In particular, it would be nice if the function could optionally return whether the model was modified, for example like this:

model, modified = model.transform(MoveChanLastUpstream(), return_was_modified=True)
model = model.transform(MoveChanLastUpstream(), return_was_modified=False)

With the default set to return_was_modified=False, this would keep the interface stable.
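A minimal sketch of how such a signature could look. This is not the real ModelWrapper; it assumes (as FINN-style transformations do) that a transformation's apply() returns the new model plus a flag saying whether it changed anything, and that transform() re-applies until no further change is reported.

```python
class ModelWrapper:
    """Illustration-only stand-in for the real ModelWrapper class."""

    def __init__(self, model):
        self.model = model

    def transform(self, transformation, return_was_modified=False):
        # Apply the transformation until it reports no further changes,
        # remembering whether any iteration modified the model.
        model, any_change = self.model, False
        changed = True
        while changed:
            model, changed = transformation.apply(model)
            any_change = any_change or changed
        result = ModelWrapper(model)
        if return_was_modified:
            return result, any_change
        return result
```

Because the flag is opt-in via a keyword argument that defaults to False, existing call sites like `model = model.transform(MoveChanLastUpstream())` keep working unchanged.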

Motivation

This change would simplify development and make parts of QONNX faster. In particular, the channelsLast transformation could benefit from this change: https://github.com/fastmachinelearning/qonnx/blob/feature/qonnx_chan_last/src/qonnx/transformation/channelsLast.py#L34

Parts of QONNX being affected

This change would affect all transformations: every transformation would then be required to return a boolean flag indicating whether it modified the model.

heborras avatar Sep 01 '21 09:09 heborras

After doing some testing on how the current method of string comparison performs, it seems that the actual performance impact is smaller than I had thought.

  • As a potential worst case, I had a look at the performance cost when working with GPT-2:
    • Used additional memory during string comparison: 1046 MB
    • Time for string serialization and comparison: 478 ms
    • Model source: https://github.com/onnx/models/blob/master/text/machine_comprehension/gpt-2/model/gpt2-10.onnx
  • For a smaller model, like ResNet-50, it looks like this:
    • Used additional memory during string comparison: 205 MB
    • Time for string serialization and comparison: 109 ms
    • Model source: https://github.com/onnx/models/blob/master/vision/classification/resnet/model/resnet50-v1-7.onnx
  • And for the FINN test convolutional network (CNV_W2A2):
    • Used additional memory during string comparison: 11 MB
    • Time for string serialization and comparison: 6 ms
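Numbers like those above can be gathered with the standard library alone. This hypothetical helper (not from the issue) uses tracemalloc for peak extra memory and time.perf_counter for wall-clock time of one serialize-and-compare round trip; `serialize` stands in for the ONNX-model-to-string conversion.

```python
import time
import tracemalloc


def measure_string_comparison(serialize, model):
    """Return (equal, peak_extra_bytes, elapsed_seconds) for one
    serialize-twice-and-compare round trip on `model`."""
    tracemalloc.start()
    t0 = time.perf_counter()
    before = serialize(model)
    after = serialize(model)
    equal = before == after
    elapsed = time.perf_counter() - t0
    # Peak traced memory approximates the extra memory held by the
    # two serialized copies during the comparison.
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return equal, peak, elapsed
```

Note that peak memory here scales with the serialized model size, which matches the observation that GPT-2 (roughly a gigabyte of extra memory) is far more expensive to compare than the small FINN test network.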

Since the comparison method via string serialization turns out to be cheap, it might be interesting to introduce it as a comparison operator between ModelWrapper instances.

heborras avatar Sep 01 '21 14:09 heborras