hls4ml
MLPerf Tiny developments
This PR is not meant to be merged; it is simply to open a discussion about which developments we should cherry-pick and rebase onto the hls4ml main branch. Based on a quick look, I can imagine a few separate PRs:
- [x] FIFO depth profiling @nicologhielmetti #509
- [x] ReLU merge optimizer @anmeza @oliviaweng #586
- [x] QDenseBatchnorm support @julesmuhizi #718 (requires the corresponding QKeras PR, google/qkeras#74, to be merged)
- [ ] Arty support for VivadoAccelerator backend @GiuseppeDiGuglielmo
- [ ] Ultra96 support for VivadoAccelerator backend @GiuseppeDiGuglielmo #646 (to be split up)
- [ ] AXI Master interface for VivadoAccelerator backend @GiuseppeDiGuglielmo #646 (to be split up)
- [ ] Reprogrammable weights for VivadoAccelerator backend @GiuseppeDiGuglielmo #646 (to be split up)
Hi @jmduarte, all the updates sound good to me. I know @nicologhielmetti has begun the work to port the FIFO depth optimization to an optimizer pass on top of the current hls4ml framework.
For the "ReLU merge optimizer", do you have any numbers for the resulting savings (latency or resources)?
@thesps Here is a table showing the resource utilization (estimated via csynth) before and after applying the "ReLU merge optimization" to an image classification model used for the tinyMLPerf submission.