Description

Related to #859
[ ] Add a detailed description of the changes
Added const to the depth variable of the FIFOs between the Depthwise and the Pointwise convolutions to fix a Vitis HLS (2023.2) warning

Type of change

[x] New feature (non-breaking change which adds functionality)

Tests

Apart from the numerical correctness in the pytests comparing Keras to hls4ml, executed successfully the c-synthesis
Checked the numerical correctness and c-synthesis of a 10k parameter UNet that has 20 Separable 2D Convolutions

Checklist

[x] I have read the guidelines for contributing.
[ ] I have commented my code, particularly in hard-to-understand areas.
[ ] I have made corresponding changes to the documentation.
[x] My changes generate no new warnings.
[x] I have installed and run pre-commit on the files I edited or added.
[x] I have added tests that prove my fix is effective or that my feature works.

Oct 11 '24 15:10 steltze

Some results from vivado synthesis

Oct 11 '24 15:10 steltze

Thanks. I should mention that there is also an effort to develop depthwise (and seperable, which gets split) for oneAPI, which will replace the Quartus backend eventually. It's in a PR to the oneAPI PR.

Oct 11 '24 15:10 jmitrevs

Updated to the latest main branch since I think it fixes the issue that caused the test failure.

Oct 16 '24 19:10 jmitrevs

Pipelining over the kernel size might produce better QoR. Pending to test this hypothesis

Oct 17 '24 19:10 steltze

What is blocking this PR to be merged?

Nov 21 '24 10:11 nghielme

@nghielme It is still marked as a draft, so no one took the time to thoroughly review it :wink:. Also, it will come after 1.0.0

Nov 21 '24 14:11 vloncar

With this patch my Vivado HLS 2019.2 fails with following output, regardless of resource or latency mode:

ERROR: [HLS 200-70] '#pragma HLS ARRAY_PARTITION variable=&acc type=complete' is not a valid pragma.
ERROR: [HLS 200-70] '#pragma HLS ARRAY_RESHAPE variable=&weights type=block factor=block_factor' is not a valid pragma.
ERROR: [HLS 200-70] '#pragma HLS ARRAY_RESHAPE variable=&data type=block factor=block_factor' is not a valid pragma.
ERROR: [HLS 200-70] '#pragma HLS ARRAY_PARTITION variable=&acc type=complete' is not a valid pragma.
ERROR: [HLS 200-70] '#pragma HLS ARRAY_PARTITION variable=&acc type=complete' is not a valid pragma.

The issue can be solved by removing type= from these pragmas as done elsewhere in the code base.

Jan 01 '25 23:01 xtreme8000

thanks @xtreme8000! fixed

Jan 09 '25 15:01 steltze

The only thing I see lacking here is the use of function pointers for depnse instead of the current if (strategy == latency) do_latency_dense() else do_resource_dense() approach. We moved the rest of the codebase towards that model as it allows the dense function to be generated and implemented in special ways (HGQ and resource_unrolled use this). I've made the changes needded and put them in vloncar/stelios_depthwise_resource.

After extensive testing of both approaches, I've made a few observations:

In some rare instances function pointer adds a clock cycle to the main loop of depthwise function. I didn't manage to figure out why, the schedule viewer shows both approaches resulting in identical sequence of operations, but with function pointer one having one operation after depthwise_product function scheduled in the next cycle for no apparent reason. I'm thinking to follow this up with AMD.
Sometimes the 3rd case (rf > n_chan and rf % n_chan != 0) fails to meet timing. This is regardless of the approach used. We could consider enforcing RF that avoids it
Depending on the size of the layer/input, the pragma HLS INLINE recursive just prior to calling depthwise_resource may have a detrimental effect (going from added cycles to missing timing). Again, regardless of the approach. FWIW, this also affects regular Conv1D/2D.

Since the issues are specific to a use case, I'm thinking to merge the function pointer changes into this and then merge it to main. (there's also a trivial change to tests to avoid creating directory within a directory.) If the issue appears, it is trivial to manually tweak around it to avoid the use of function pointer and/or tweak the inline pragma. The 3rd case can also be avoided by the user if they observe that issue. If no one objects I'll do this tomorrow.

Feb 24 '25 23:02 vloncar

This is now ready. Let's just wait to see if I messed up the tests before we merge.

Feb 26 '25 20:02 vloncar

Depthwise 1D and 2D Resource strategy for io_stream

Description

Type of change

Tests

Checklist