hls4ml
hls4ml copied to clipboard
Fix conv1d stream implementation hls directives
Description
Fixed compiler directives for conv1d io_stream implementation. A loop was given the unroll directive, when it should have been pipelined. This loop is used to shift the kernel data when a new input is read by the stream. The new directives match those used in the conv2d implementation.
Type of change
- [x] Bug fix (non-breaking change that fixes an issue)
Tests
The existing tests for conv1d still pass
Test Configuration:
To see the improvements of the fix, I just ran synthesis on a large conv1d layer. The layer's parameters in this test are 8 channels, 16 filters, length 4096, with a kernel size of 7 and a stride of 2. The fix reduces the latency from 635,968 cycles to 53,342 cycles when using a reuse factor of 1 and resource strategy.
Checklist
- [x] I have read the guidelines for contributing.
- [x] I have commented my code, particularly in hard-to-understand areas.
- [x] I have made corresponding changes to the documentation.
- [x] My changes generate no new warnings.
- [x] I have added tests that prove my fix is effective or that my feature works.B
Thanks Jonathan. Can you add the before and after synthesis results to demonstrate the change?
I attached the synthesis printouts for the example conv layer before and after the change. I also attached a python file used to construct the model in case it is of interest. conv_stream_update.zip