
Add LayerNorm support for Vivado

Open · rianbrooksflynn opened this pull request 1 year ago · 11 comments

Description

This PR adds support for Layer Normalization using either Keras or PyTorch with the Vivado backend in io_parallel mode.

This implementation uses a lookup table for the inverse square root; the inputs to the lookup table follow a logarithmic distribution for better accuracy.
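To illustrate the idea, here is a minimal NumPy sketch of a logarithmically indexed inverse-square-root lookup table. The table size, input range, and names are made up for illustration; this is not the HLS code added by the PR.

```python
# Conceptual sketch of a log-indexed 1/sqrt(x) lookup table (illustrative
# parameters only, not the actual HLS implementation in this PR).
import numpy as np

TABLE_SIZE = 1024
X_MIN, X_MAX = 2.0**-10, 2.0**10  # assumed input range of the variance

# Place the table entries at logarithmically spaced points, so small
# variances (where 1/sqrt(x) changes fastest) get more resolution.
log_min, log_max = np.log2(X_MIN), np.log2(X_MAX)
x_points = 2.0 ** np.linspace(log_min, log_max, TABLE_SIZE)
invsqrt_table = 1.0 / np.sqrt(x_points)

def invsqrt_lut(x):
    """Approximate 1/sqrt(x) by indexing the table on a log2 scale."""
    idx = (np.log2(x) - log_min) / (log_max - log_min) * (TABLE_SIZE - 1)
    idx = np.clip(np.round(idx).astype(int), 0, TABLE_SIZE - 1)
    return invsqrt_table[idx]

# Relative error stays roughly uniform across several orders of magnitude
xs = np.logspace(-2, 2, 7)
print(np.abs(invsqrt_lut(xs) - 1.0 / np.sqrt(xs)) / (1.0 / np.sqrt(xs)))
```

Spacing the entries logarithmically keeps the relative error roughly uniform over a wide range of variances, which is where a linearly indexed table loses accuracy.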

Tests have been added for both Keras and PyTorch parsing.

Credit is due to @Ethan0Jiang and @LostEcho365 (Zhixing Jiang and Dennis Yin) for their Vivado implementation and Keras parsing support; my contributions were revising the inverse square root lookup table implementation, implementing PyTorch support, and adding unit tests. (Here's a link to their pre-print.) The original code authors have given permission for their code to be merged into hls4ml.

Linked issue: https://github.com/fastmachinelearning/hls4ml/issues/1109
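For reference, a minimal sketch of how a user might exercise the new layer once this PR is merged; the layer size, config granularity, and output directory below are placeholders, not requirements of the PR.

```python
# Hypothetical usage sketch: a Keras LayerNormalization layer converted
# with the Vivado backend in io_parallel mode (placeholder sizes/paths).
import numpy as np
import tensorflow as tf
import hls4ml

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(8,)),
    tf.keras.layers.LayerNormalization(),
])

config = hls4ml.utils.config_from_keras_model(model, granularity='name')
hls_model = hls4ml.converters.convert_from_keras_model(
    model,
    hls_config=config,
    backend='Vivado',
    io_type='io_parallel',
    output_dir='hls4ml_prj_layernorm',
)
hls_model.compile()  # builds the C simulation library for emulation

x = np.random.rand(100, 8).astype(np.float32)
# Fixed-point emulation should track the floating-point Keras output closely
print(np.max(np.abs(model.predict(x) - hls_model.predict(x))))
```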

Type of change

  • [x] New feature (non-breaking change which adds functionality)
  • [x] A new research paper code implementation

Tests

Two unit tests added: test/pytest/test_layernorm.py and test/pytest/test_layernorm_pytorch.py
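As a rough illustration of what such a test checks (this is not the actual test code, and the PyTorch converter API has changed between hls4ml versions, so treat the calls below as assumptions):

```python
# Illustrative sketch only, not the test added in this PR. Where the input
# shape is passed to the PyTorch converter differs between hls4ml versions.
import numpy as np
import torch
import hls4ml

in_size = 8
model = torch.nn.Sequential(torch.nn.LayerNorm(in_size)).eval()

config = hls4ml.utils.config_from_pytorch_model(model, (None, in_size))
hls_model = hls4ml.converters.convert_from_pytorch_model(
    model,
    hls_config=config,
    backend='Vivado',
    io_type='io_parallel',
    output_dir='hls4ml_prj_layernorm_pytorch',
)
hls_model.compile()

x = torch.rand(100, in_size)
np.testing.assert_allclose(
    hls_model.predict(x.numpy()),
    model(x).detach().numpy(),
    atol=0.05,  # loose tolerance to allow for fixed-point quantization
)
```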

Checklist

  • [x] I have read the guidelines for contributing.
  • [x] I have commented my code, particularly in hard-to-understand areas.
  • [ ] I have made corresponding changes to the documentation.
  • [x] My changes generate no new warnings.
  • [x] I have installed and run pre-commit on the files I edited or added.
  • [x] I have added tests that prove my fix is effective or that my feature works.

rianbrooksflynn avatar Nov 04 '24 16:11 rianbrooksflynn

The pytest failure is the QKeras tests timing out. There are 299 tests being run in that batch, which I guess is too many. Is there a way to reshuffle the batches to avoid the timeout?

JanFSchulte avatar Nov 04 '24 17:11 JanFSchulte

pre-commit.ci autofix

rianbrooksflynn avatar Dec 09 '24 17:12 rianbrooksflynn

For the "broken" diffs, we should look to see what the whitespace error is and fix it before merging.

jmitrevs avatar Dec 18 '24 20:12 jmitrevs

I did a bit of digging and it's not a whitespace problem; rather, the line endings are improperly encoded. Likely the person we got the code from was using a Windows machine. @rianbrooksflynn, you can install the dos2unix package and just run dos2unix file.name to fix this.

JanFSchulte avatar Dec 18 '24 21:12 JanFSchulte

We should squash the commits when we merge this.

jmitrevs avatar Jan 09 '25 23:01 jmitrevs

Thanks @vloncar for your review and apologies that it took me until now to get around to implementing your feedback! I would appreciate a second look from you at this pull request.

rianbrooksflynn avatar Mar 11 '25 17:03 rianbrooksflynn

pre-commit.ci autofix

rianbrooksflynn avatar Mar 11 '25 17:03 rianbrooksflynn

Fixed the latest merge conflicts. @vloncar, can you check if Rian's fixes are enough to have this merged?

JanFSchulte avatar Apr 29 '25 19:04 JanFSchulte

Any update on the merge?

The-Padi avatar Jun 05 '25 12:06 The-Padi

This is first in line on my PR review TODO list; I expect to have time for it early next week.

vloncar avatar Jun 05 '25 12:06 vloncar

> This is first in line on my PR review TODO list; I expect to have time for it early next week.

Thank you very much!

The-Padi avatar Jun 05 '25 12:06 The-Padi

pre-commit.ci autofix

JanFSchulte avatar Jul 01 '25 16:07 JanFSchulte

@vloncar I hope this goes in the direction of what you had in mind for the performance validation. I ran synthesis in Vitis 2023.1, 2024.1, and 2025.1 for different LayerNorm input sizes and plotted FFs, LUTs, DSPs, BRAM, latency, and II as a function of that input size. 2024.1 and 2025.1 are basically identical, whereas 2023.1 uses slightly fewer resources but has worse latency.

This is with the default ap_fixed<16,6> precision and the default target part.

[Plots: FFs, LUTs, DSPs, BRAMs, latency, and II vs. LayerNorm input size for Vitis 2023.1, 2024.1, and 2025.1]
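For completeness, a rough sketch of how such a sweep could be scripted with hls4ml's Python API; this is not the script that was actually used, and the backend name, input sizes, and report handling are assumptions.

```python
# Rough sketch of a resource/latency sweep, not the script actually used.
# Report contents depend on the hls4ml and Vitis versions.
import tensorflow as tf
import hls4ml

reports = {}
for size in (8, 16, 32, 64, 128):
    model = tf.keras.Sequential([
        tf.keras.layers.Input(shape=(size,)),
        tf.keras.layers.LayerNormalization(),
    ])
    config = hls4ml.utils.config_from_keras_model(model, granularity='name')
    hls_model = hls4ml.converters.convert_from_keras_model(
        model,
        hls_config=config,
        backend='Vitis',
        io_type='io_parallel',
        output_dir=f'layernorm_sweep_n{size}',
    )
    hls_model.compile()
    # Run HLS synthesis and keep the parsed report (FFs, LUTs, DSPs, BRAM,
    # latency, II) for plotting as a function of the input size
    reports[size] = hls_model.build(csim=False, synth=True)

print(reports)
```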

JanFSchulte avatar Aug 05 '25 14:08 JanFSchulte

Thanks, looks good. Do all reports say the timing is met (no scheduling warnings, clock uncertainty met, etc.)?

vloncar avatar Aug 05 '25 14:08 vloncar

I did not observe any warnings about scheduling. The timing results all look very similar: [screenshot of timing results]

JanFSchulte avatar Aug 05 '25 15:08 JanFSchulte