oneDNN
aarch64: brgemm: add jit impl. for sve_512
Description
This patch adds a JIT implementation of brgemm for AArch64 SVE512.
Please merge after applying the following pull requests: #1461, #1462.
Checklist
General
- [x] Do all unit and benchdnn tests (`make test` and `make test_benchdnn_*`) pass locally for each commit?
- [x] Have you formatted the code using clang-format?
Thank you for your comment.
I think you forgot to add the inner_product implementation to cpu_inner_product_list.cpp.
I have added the inner_product implementation to cpu_inner_product_list.cpp.
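For context, the registration in cpu_inner_product_list.cpp boils down to adding the new implementation to an ordered dispatch list that is walked until one implementation accepts the problem. A minimal self-contained sketch of that idea follows; problem_t, the ISA flag, and the factory signature are illustrative assumptions, not oneDNN's real types:

```cpp
#include <functional>
#include <iostream>
#include <utility>
#include <vector>

// Hypothetical problem descriptor; real dispatching checks data types,
// formats, and runtime CPU features.
struct problem_t {
    bool is_f32;
    bool cpu_has_sve512; // assumption: some runtime ISA query exists
};

using impl_check_t = std::function<bool(const problem_t &)>;

// Ordered candidate list: the brgemm-based SVE512 implementation is
// tried first, with a reference implementation as the fallback.
static const std::vector<std::pair<const char *, impl_check_t>> ip_impl_list = {
        {"brgemm_inner_product:sve_512",
                [](const problem_t &p) { return p.is_f32 && p.cpu_has_sve512; }},
        {"ref_inner_product", [](const problem_t &) { return true; }},
};

int main() {
    problem_t p{/*is_f32=*/true, /*cpu_has_sve512=*/true};
    for (const auto &impl : ip_impl_list)
        if (impl.second(p)) {
            std::cout << "dispatched to " << impl.first << "\n";
            break;
        }
}
```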
Also, could you clarify why you are keeping the whole AMX logic from x64? One of the benefits of the brgemm approach is that it should allow you to "just" reimplement the [brgemm API](https://github.com/oneapi-src/oneDNN/blob/master/src/cpu/x64/brgemm/brgemm.hpp) in the best way possible for your arch, and then reuse the same driver across architectures. I think this would simplify the whole brgemm support on the ARM side.
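To illustrate the split being suggested, here is a minimal sketch with illustrative types that are not oneDNN's actual brgemm API: the kernel hides behind a small interface, so the same driver code can be reused across x64 and AArch64:

```cpp
#include <cstddef>
#include <iostream>
#include <vector>

// One A/B pair of the batch; brgemm accumulates sum_i A_i * B_i into C.
struct brgemm_batch_element_t {
    const float *A;
    const float *B;
};

// Small interface the architecture-agnostic driver programs against.
struct brgemm_kernel_t {
    virtual ~brgemm_kernel_t() = default;
    virtual void execute(const std::vector<brgemm_batch_element_t> &batch,
            float *C) const = 0;
};

// Architecture-specific kernel: on AArch64 this is where SVE512 JIT
// code generation would live; here, a plain-C++ stand-in.
struct brgemm_sve512_t final : brgemm_kernel_t {
    size_t M, N, K;
    brgemm_sve512_t(size_t M, size_t N, size_t K) : M(M), N(N), K(K) {}
    void execute(const std::vector<brgemm_batch_element_t> &batch,
            float *C) const override {
        for (const auto &e : batch)
            for (size_t m = 0; m < M; ++m)
                for (size_t n = 0; n < N; ++n)
                    for (size_t k = 0; k < K; ++k)
                        C[m * N + n] += e.A[m * K + k] * e.B[k * N + n];
    }
};

int main() {
    const size_t M = 2, N = 2, K = 2;
    std::vector<float> A(M * K, 1.0f), B(K * N, 2.0f), C(M * N, 0.0f);
    brgemm_sve512_t kernel(M, N, K);
    // The driver only sees brgemm_kernel_t, so the same driver code is
    // reusable regardless of which architecture supplied the kernel.
    const brgemm_kernel_t &iface = kernel;
    iface.execute({{A.data(), B.data()}}, C.data());
    std::cout << "C[0] = " << C[0] << "\n"; // 1*2 + 1*2 = 4
}
```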
I temporarily left the AMX implementation in place to avoid build errors. Since the ARM side does not use it, I have now removed the AMX implementation.
Hi @Koji-Kurihara ,
There were recent changes to quantization support described in this RFC: 20220201-quantization-scaling
Please review the x86 version of brgemm and update aarch64 version to make it work for int8.
The main changes are:
- Now scaling relies on per-argument scales instead of a single output scale.
- To reuse brgemm in other primitives, it currently still takes a single pointer to `scales`, but a primitive that uses brgemm should precompute that scale by multiplying src_scale and wei_scale (see the sketch after this list).
- dst scale is not yet supported for brgemm kernels.
- Bias is applied after dequantization (after applying input scales) and doesn't need to be quantized by the user.
- Scaling is runtime only, so scale factors are passed at execution time as separate memory objects; they are not known at creation time.
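A minimal sketch of that scale handling, assuming per-output-channel weight scales and hypothetical names (precompute_scales is not an actual oneDNN function):

```cpp
#include <cstdio>
#include <vector>

// combined[oc] = src_scale * wei_scales[oc]; dst scale is intentionally
// not folded in, since it is not yet supported for brgemm kernels.
std::vector<float> precompute_scales(
        float src_scale, const std::vector<float> &wei_scales) {
    std::vector<float> combined(wei_scales.size());
    for (size_t i = 0; i < wei_scales.size(); ++i)
        combined[i] = src_scale * wei_scales[i];
    return combined;
}

int main() {
    // Scales are runtime-only: they arrive at execution time as separate
    // memory objects, so this precomputation happens at execution, not
    // at primitive creation.
    float src_scale = 0.5f;
    std::vector<float> wei_scales = {0.1f, 0.2f}; // per output channel
    auto scales = precompute_scales(src_scale, wei_scales);

    // Bias is applied after dequantization (after input scales):
    //   dst[oc] = scales[oc] * acc_int32 + bias[oc]
    int acc = 100;     // int8*int8 accumulator for channel 0
    float bias = 1.0f; // f32 bias, not quantized by the user
    float dst = scales[0] * acc + bias;
    std::printf("dst = %f\n", dst); // 0.05 * 100 + 1 = 6.0
}
```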
Hi @Koji-Kurihara @kawakami-k , Do you plan to make any additional changes to this PR?
Hi @igorsafo , I'm sorry, I have not been able to work on this issue due to lack of time. We will work on it before the branch cutoff for v3.1.
@Koji-Kurihara Thanks for the update!
Hi @igorsafo, I am developing brgemm to work with int8. However, it will take more time to complete. So, could you please merge the current source code implementing brgemm that works with f32? I will send out another pull request for brgemm that works with int8 as soon as I finish development.
Maybe I am missing something, but I don't see the changes mentioned in your comments (removing amx_tileconfigure headers, printf, ...). Did you push those changes?
Hi @mgouicem, could you take a look at bcd4f7d? That commit contains those changes.
Hi @mgouicem, Thank you for your comment. I updated the copyright header and removed redundant descriptions. a796e94 contains these changes.
One last thing: please fix formatting using clang-format.
Thank you for your comment. I fixed the formatting using clang-format in f4e74db.
I'm modifying the SVE512 implementation to match the latest x64 implementation. Please allow a week.
Hello @igorsafo @mgouicem @densamoilov, I have raised a new PR with the same brgemm JIT implementation support. Link to PR: https://github.com/oneapi-src/oneDNN/pull/1818 Could you please close this PR? Thanks
Link to the new version: https://github.com/oneapi-src/oneDNN/pull/1818