hls4ml
Remove unnecessary transposes related to conversion to channels_last format
Description
The current channels_last converter inserts a transpose node after the `flatten` layer to ensure the element order is correct for the subsequent fully connected layer. This isn't strictly required and can be costly; for 2D convolutional networks, for example, it results in a transpose3d HLS function being used, which is very expensive.
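One way to see why the runtime transpose after `flatten` is avoidable: the transpose followed by a dense layer is equivalent to permuting the dense weight columns once at compile time. A minimal NumPy sketch (illustrative only, with made-up shapes, not the actual hls4ml code):

```python
import numpy as np

rng = np.random.default_rng(0)
C, H, W = 2, 3, 4
x_cf = rng.normal(size=(C, H, W))              # channels-first activation
x_cl = np.transpose(x_cf, (1, 2, 0))           # channels-last, as stored on device
Wmat = rng.normal(size=(5, C * H * W))         # dense weights expecting channels-first order

# Costly runtime path: transpose back to channels-first, then flatten.
y_transpose = Wmat @ np.transpose(x_cl, (2, 0, 1)).flatten()

# Compile-time alternative: permute the weight columns once, skip the transpose.
perm = np.transpose(np.arange(C * H * W).reshape(C, H, W), (1, 2, 0)).flatten()
y_folded = Wmat[:, perm] @ x_cl.flatten()

assert np.allclose(y_transpose, y_folded)
```

The two paths produce identical results, so the per-inference transpose buys nothing that a one-time weight reordering cannot.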
Additionally, when the input has only one channel, a transpose isn't required at all. Technically one can get around this with `inputs_channel_last=True`, but we've seen users expect not to need this option for single-channel inputs.
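The single-channel case is a no-op because moving a size-1 axis never reorders the underlying elements, as this quick NumPy check shows (shapes are made up for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(size=(1, 4, 4))         # single-channel, channels-first
x_cl = np.transpose(x, (1, 2, 0))      # "converted" to channels-last

# Element order is unchanged: the transpose only moved a size-1 axis.
assert (x_cl.flatten() == x.flatten()).all()
```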
This PR adds two more optimizers that run after the main channels_last optimizer to remove these transposes. This is more straightforward than adding special cases to the main optimizer to prevent the insertion of Transpose layers in the first place.
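In spirit, such a cleanup pass matches Transpose nodes that are no-ops and drops them from the graph. A toy sketch of the idea (not the actual hls4ml optimizer API, which also rewires node inputs and outputs):

```python
def remove_single_channel_transposes(graph):
    """Drop Transpose nodes whose input tensor has a single channel.

    `graph` is a toy list of node dicts, not a real hls4ml ModelGraph.
    """
    kept = []
    for node in graph:
        if node["type"] == "Transpose" and node["in_shape"][0] == 1:
            continue  # a size-1 channel axis makes the transpose a no-op
        kept.append(node)
    return kept

graph = [
    {"type": "Input", "in_shape": (1, 8, 8)},
    {"type": "Transpose", "in_shape": (1, 8, 8)},
    {"type": "Conv2D", "in_shape": (8, 8, 1)},
]
# → ["Input", "Conv2D"]
print([n["type"] for n in remove_single_channel_transposes(graph)])
```

Running separate cleanup passes after the main conversion keeps each optimizer simple: the converter stays uniform, and the removal logic lives in one place.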
Type of change
- [x] New feature (non-breaking change which adds functionality) - Optimization to be precise
Tests
There's a new test called `test_remove_transpose` in `test_pytorch_api.py` that triggers this. Additionally, the removal of the transpose after `flatten` is triggered by `test_skipped_layers`.
Checklist
- [x] I have read the guidelines for contributing.
- [x] I have commented my code, particularly in hard-to-understand areas.
- [x] I have made corresponding changes to the documentation.
- [x] My changes generate no new warnings.
- [x] I have installed and run `pre-commit` on the files I edited or added.
- [x] I have added tests that prove my fix is effective or that my feature works.