cpu: aarch64: allow sbgemm config for matmul primitive

Open snadampal opened this issue 1 year ago • 2 comments

Description


This change is required to support precompiled graphs, where the matmul primitive is created with weight tensors that have already been reordered, so the weights arrive in blocked, custom formats rather than plain layouts.
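For readers unfamiliar with blocked weight layouts, here is a minimal, self-contained sketch of the idea. It is not code from this PR or from ACL: the block size `kBlock`, the `[N / kBlock][K][kBlock]` ordering, and the helper names `blocked_offset` and `reorder_to_blocked` are all hypothetical, chosen only to show how a pre-reordered weight tensor differs from a plain row-major one.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical block size along N; real kernels pick this per ISA and data type.
constexpr std::size_t kBlock = 4;

// Offset of logical element (k, n) of a K x N weight matrix when it is stored
// as [N / kBlock][K][kBlock] (assumed layout, for illustration only).
std::size_t blocked_offset(std::size_t k, std::size_t n, std::size_t K) {
    return (n / kBlock) * K * kBlock + k * kBlock + (n % kBlock);
}

// Reorder a plain row-major K x N matrix into the blocked layout above.
// N is rounded up to a multiple of kBlock and the padding is zero-filled.
std::vector<float> reorder_to_blocked(const std::vector<float> &plain,
                                      std::size_t K, std::size_t N) {
    const std::size_t n_blocks = (N + kBlock - 1) / kBlock;
    std::vector<float> blocked(n_blocks * K * kBlock, 0.0f);
    for (std::size_t k = 0; k < K; ++k) {
        for (std::size_t n = 0; n < N; ++n) {
            blocked[blocked_offset(k, n, K)] = plain[k * N + n];
        }
    }
    return blocked;
}
```

In a precompiled graph this reorder has already happened before the primitive is created, so the primitive has to accept the blocked layout directly instead of expecting a plain one.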

Allow additional blocked layout formats.
Use bfloat16 fast math kernels from openxla.
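As background on the second item, the sketch below illustrates what bfloat16 fast math generally means numerically: fp32 inputs are rounded to bfloat16 before multiplication, while accumulation stays in fp32. This is an assumption-laden illustration, not the kernels referenced in this PR; the function names are hypothetical and NaN inputs are deliberately not handled.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Round an fp32 value to bfloat16 (its upper 16 bits) using round-to-nearest-even.
// Illustration only: NaN inputs are not special-cased and may round to infinity.
std::uint16_t float_to_bf16(float x) {
    std::uint32_t bits;
    std::memcpy(&bits, &x, sizeof(bits));
    bits += 0x7FFFu + ((bits >> 16) & 1u);
    return static_cast<std::uint16_t>(bits >> 16);
}

// Widen a bfloat16 bit pattern back to fp32 by zero-filling the low 16 bits.
float bf16_to_float(std::uint16_t h) {
    const std::uint32_t bits = static_cast<std::uint32_t>(h) << 16;
    float x;
    std::memcpy(&x, &bits, sizeof(x));
    return x;
}

// Dot product that emulates fast math: both inputs are rounded to bfloat16,
// products and the running sum are kept in fp32.
float dot_bf16_fast_math(const float *a, const float *b, std::size_t n) {
    float acc = 0.0f;
    for (std::size_t i = 0; i < n; ++i) {
        acc += bf16_to_float(float_to_bf16(a[i])) *
               bf16_to_float(float_to_bf16(b[i]));
    }
    return acc;
}
```

The trade-off is reduced input precision (8 significant bits instead of 24) in exchange for faster matrix multiplication on hardware with bfloat16 instructions.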

Fixes # (github issue)

Checklist

General

[ ] Do all unit and benchdnn tests (make test and make test_benchdnn_*) pass locally for each commit? Ran make test and ./benchdnn --matmul --mode=P --engine=cpu --allow-enum-tags-only=0 --batch=inputs/matmul/test_matmul_ci.
[x] Have you formatted the code using clang-format?

Performance improvements

[ ] Have you submitted performance data that demonstrates performance improvements?

New features

[ ] Have you published an RFC for the new feature?
[ ] Was the RFC approved?
[ ] Have you added relevant tests?

Bug fixes

[ ] Have you included information on how to reproduce the issue (either in a github issue or in this PR)?
[ ] Have you added relevant regression tests?

RFC PR

[ ] Does RFC document follow the template?
[ ] Have you added a link to the rendered document?

Sep 01 '24 21:09 snadampal

Yes, @dzarukin, I will ping here once the corresponding ACL change is merged.

Mar 12 '25 23:03 snadampal

Is this change still in progress? Or should we close this PR?

Dec 10 '25 10:12 michalowski-arm