oneDNN

cpu: aarch64: enable jit conv for 128

Open jondea opened this issue 9 months ago • 3 comments

Description

Draft: needs some optimization tweaks. Deconv tests also fail due to check_zero_padding.

Naively enable sve_128 by copying the equivalent invocations used for 512 and 256. Note that is_1stconv is hard-coded to true for sve_128, which leaves some performance on the table.

Some more optimization is necessary, but this speeds up some cases, specifically backward. In some cases this was slower than Arm Compute Library (ACL), so unlike the 512 and 256 counterparts, it has been set below the ACL implementations in the CPU list.
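To illustrate why the position in the CPU list matters, here is a minimal sketch of a first-match implementation list. It is not the oneDNN code: `ConvProblem`, `ImplEntry`, `make_impl_list`, `select_impl`, and the applicability predicates are all hypothetical names and simplifications made up for this example. It only assumes that entries are tried top to bottom and the first applicable one is used.

```cpp
#include <functional>
#include <string>
#include <vector>

// Hypothetical problem description; not a oneDNN type.
struct ConvProblem {
    int vector_bits;     // SVE vector length of the target CPU, in bits
    bool acl_supported;  // assumption: whether ACL can handle this shape
};

// Hypothetical list entry: a name plus an applicability predicate.
struct ImplEntry {
    std::string name;
    std::function<bool(const ConvProblem &)> applicable;
};

// Entries are ordered by priority. The 512 and 256 JIT kernels sit above
// ACL, while the 128 JIT kernel sits below it, mirroring the ordering
// described in this PR. The predicates are illustrative only.
inline std::vector<ImplEntry> make_impl_list() {
    return {
        {"jit_sve_512", [](const ConvProblem &p) { return p.vector_bits == 512; }},
        {"jit_sve_256", [](const ConvProblem &p) { return p.vector_bits == 256; }},
        {"acl", [](const ConvProblem &p) { return p.acl_supported; }},
        {"jit_sve_128", [](const ConvProblem &p) { return p.vector_bits == 128; }},
        {"ref", [](const ConvProblem &) { return true; }},
    };
}

// Return the name of the first applicable implementation.
inline std::string select_impl(const ConvProblem &p) {
    for (const auto &entry : make_impl_list()) {
        if (entry.applicable(p)) return entry.name;
    }
    return "none";  // unreachable here because "ref" always applies
}
```

With this ordering, `select_impl({128, true})` picks ACL, and the 128 JIT kernel is only used when ACL does not apply, for example `select_impl({128, false})`.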

Checklist

General

  • [ ] Do all unit and benchdnn tests (make test and make test_benchdnn_*) pass locally for each commit?
  • [ ] Have you formatted the code using clang-format?

Performance improvements

  • [ ] Have you submitted performance data that demonstrates performance improvements?

jondea avatar Apr 04 '25 08:04 jondea

@kasturedeeksha is this a reasonable approach? (It's a draft, so I know there may be some code quality issues)

jondea avatar May 02 '25 14:05 jondea

Should help #2165

jondea avatar May 02 '25 14:05 jondea

@jondea Yes, this approach can be used to extend the work to 128-bit support. We just need to verify that all tests pass and check whether anything more needs to be done. The priority in cpu_convolution_list can be decided based on performance.

kasturedeeksha avatar May 07 '25 07:05 kasturedeeksha