Transformations should return whether a model was modified or not
Details
Currently, a transformation on a model looks like this:
model = model.transform(MoveChanLastUpstream())
The executed transformation, in this case MoveChanLastUpstream, may change the model, or it may not. In some cases it would be good to know whether a transformation had any effect on the model, for example in the channelsLast transformation.
Here I need to check whether the transformation has converged, in order to stop iterating over the model. Currently this is solved by converting the ONNX model to a string at the beginning of the loop: https://github.com/fastmachinelearning/qonnx/blob/feature/qonnx_chan_last/src/qonnx/transformation/channelsLast.py#L46
And then checking at the end if anything changed: https://github.com/fastmachinelearning/qonnx/blob/feature/qonnx_chan_last/src/qonnx/transformation/channelsLast.py#L69
While this works fine, it might incur large compute and memory penalties for larger models.
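For illustration, here is a minimal sketch of the current workaround, assuming the QONNX-style ModelWrapper API; the import paths and the helper name to_channels_last are assumptions, and the further transformations from channelsLast.py are elided:

```python
from qonnx.core.modelwrapper import ModelWrapper  # import path assumed
from qonnx.transformation.channels_last import MoveChanLastUpstream  # import path assumed

def to_channels_last(model: ModelWrapper) -> ModelWrapper:
    model_changed = True
    while model_changed:
        # Serialize the whole model up front so changes can be detected later.
        initial_model_string = model.model.SerializeToString()
        model = model.transform(MoveChanLastUpstream())
        # ... further transformations from channelsLast.py elided ...
        # Expensive step: serialize again and compare the byte strings.
        model_changed = initial_model_string != model.model.SerializeToString()
    return model
```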
New behavior
When the ModelWrapper class is ported over to QONNX, it would be nice if the behavior of the transform member function could change.
In particular, it would be nice if the function could optionally return whether the model was modified, for example like this:
model, modified = model.transform(MoveChanLastUpstream(), return_was_modified=True)
model = model.transform(MoveChanLastUpstream(), return_was_modified=False)
With the default set to return_was_modified=False, this would keep the interface stable.
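A hedged sketch of what this could look like, assuming transform() already runs the transformation to a fixed point via repeated apply() calls (as in FINN's ModelWrapper); the flag aggregation and the return_was_modified parameter are the proposed additions, not current API:

```python
import copy

class ModelWrapper:
    # ... existing internals elided ...

    def transform(self, transformation, make_deepcopy=True,
                  return_was_modified=False):
        transformed_model = self
        if make_deepcopy:
            transformed_model = copy.deepcopy(self)
        model_was_modified = False
        model_was_changed = True
        # Existing fixed-point iteration: apply() until it reports no change.
        while model_was_changed:
            (transformed_model, model_was_changed) = transformation.apply(
                transformed_model
            )
            # Proposed addition: remember whether any iteration changed the model.
            model_was_modified = model_was_modified or model_was_changed
        if return_was_modified:
            return transformed_model, model_was_modified
        return transformed_model
```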
Motivation
This change would simplify development and make parts of QONNX faster. In particular, the channelsLast transformation could benefit from this change: https://github.com/fastmachinelearning/qonnx/blob/feature/qonnx_chan_last/src/qonnx/transformation/channelsLast.py#L34
Parts of QONNX being affected
This change would affect all transformations. In particular, every transformation would then be required to return a boolean flag indicating whether it modified the model or not.
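Under that contract, a transformation's apply() would look roughly like the following hedged sketch; the base-class import path and the example transformation NameUnnamedNodes are assumptions:

```python
from qonnx.transformation.base import Transformation  # import path assumed

class NameUnnamedNodes(Transformation):
    """Hypothetical example: assign a name to every unnamed node."""

    def apply(self, model):
        graph_modified = False
        for idx, node in enumerate(model.graph.node):
            if not node.name:
                node.name = "%s_%d" % (node.op_type, idx)
                graph_modified = True
        # The flag drives the fixed-point iteration in transform(): once no
        # node needs renaming, iteration stops.
        return model, graph_modified
```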
After doing some testing on how the current method of string comparison performs, it seems the actual performance impact is smaller than I had thought (a rough sketch of the measurement approach follows the list below).
- As a potential worst case, I had a look at the performance cost when working with GPT-2:
- Used additional memory during string comparison: 1046 MB
- Time for string serialization and comparison: 478 ms
- Model source: https://github.com/onnx/models/blob/master/text/machine_comprehension/gpt-2/model/gpt2-10.onnx
- For a smaller model, like ResNet-50, it looks like this:
- Used additional memory during string comparison: 205 MB
- Time for string serialization and comparison: 109 ms
- Model source: https://github.com/onnx/models/blob/master/vision/classification/resnet/model/resnet50-v1-7.onnx
- And for the FINN test convolutional network (CNV_W2A2):
- Used additional memory during string comparison: 11 MB
- Time for string serialization and comparison: 6 ms
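For reference, the timing side of these measurements could be reproduced with something like the following sketch; the model path is an assumption, and the memory figures would need a separate profiler such as memory_profiler:

```python
import time

import onnx

# Path assumed; e.g. the GPT-2 model linked above.
model = onnx.load("gpt2-10.onnx")

start = time.perf_counter()
before = model.SerializeToString()  # snapshot at the start of the loop
after = model.SerializeToString()   # snapshot at the end of the loop
unchanged = before == after
elapsed_ms = (time.perf_counter() - start) * 1000
print("serialize + compare took %.0f ms, model unchanged: %s" % (elapsed_ms, unchanged))
```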
Since the comparison method via string serialization turns out to be cheap, it might be interesting to introduce this as a comparison operator between ModelWrapper instances.
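A minimal sketch of such an operator, assuming equality is defined as byte-identical protobuf serialization; this is not part of the current API:

```python
class ModelWrapper:
    # ... existing internals elided ...

    def __eq__(self, other):
        if not isinstance(other, ModelWrapper):
            return NotImplemented
        # Two wrappers compare equal iff their ONNX protos serialize to
        # identical byte strings.
        return self.model.SerializeToString() == other.model.SerializeToString()
```

Note that defining __eq__ makes instances unhashable by default, so a matching __hash__ would also need to be provided if ModelWrapper objects are used in sets or as dict keys.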