
cpu: aarch64: matmul: addition of JIT bf16 kernel

Open vishwascm opened this issue 6 months ago • 1 comments

Description

Adds a 2D bf16 matmul JIT kernel for AArch64 to oneDNN, with support for sum post-ops.

Major code changes:
• Addition of jit_bf16_matmul.cpp
• Addition of jit_bf16_matmul.hpp

Other minor changes were made in related files accordingly.
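For readers unfamiliar with the operation, the following is a rough scalar sketch of what a 2D bf16 matmul with a sum post-op computes. It is not the JIT kernel from this PR and does not use the oneDNN API. The helper names (`f32_to_bf16`, `bf16_to_f32`, `ref_matmul_bf16_sum`), the row-major layout, the f32 destination, and the `sum_scale` parameter are all assumptions made for illustration only.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical helper: bf16 is treated as the upper 16 bits of an IEEE-754
// binary32 value. Rounds to nearest even; NaN handling is omitted for brevity.
static uint16_t f32_to_bf16(float f) {
    uint32_t bits;
    std::memcpy(&bits, &f, sizeof(bits));
    const uint32_t rounding_bias = 0x7FFFu + ((bits >> 16) & 1u);
    return static_cast<uint16_t>((bits + rounding_bias) >> 16);
}

// Hypothetical helper: widen bf16 back to f32 by zero-filling the low 16 bits.
static float bf16_to_f32(uint16_t h) {
    const uint32_t bits = static_cast<uint32_t>(h) << 16;
    float f;
    std::memcpy(&f, &bits, sizeof(f));
    return f;
}

// Illustrative reference (assumed layout: row-major, dst kept in f32):
//   dst[M x N] = sum_scale * dst[M x N] + src[M x K] * wei[K x N]
// The "sum" post-op is modelled as accumulating into the existing dst values.
static void ref_matmul_bf16_sum(const std::vector<uint16_t> &src,
        const std::vector<uint16_t> &wei, std::vector<float> &dst,
        std::size_t M, std::size_t K, std::size_t N, float sum_scale) {
    assert(src.size() == M * K);
    assert(wei.size() == K * N);
    assert(dst.size() == M * N);
    for (std::size_t m = 0; m < M; ++m) {
        for (std::size_t n = 0; n < N; ++n) {
            float acc = 0.0f; // accumulate in f32
            for (std::size_t k = 0; k < K; ++k) {
                acc += bf16_to_f32(src[m * K + k])
                        * bf16_to_f32(wei[k * N + n]);
            }
            dst[m * N + n] = sum_scale * dst[m * N + n] + acc;
        }
    }
}
```

A reference of this kind is only useful for checking the arithmetic on small shapes; the kernel in this PR may differ in accumulation order, destination data type, and rounding.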

Checklist

General

[✓] Do all unit and benchdnn tests (make test and make test_benchdnn_*) pass locally for each commit? Yes

1. Gtest: ./test_matmul

[----------] Global test environment tear-down
[==========] 64 tests from 8 test suites ran. (42 ms total)
[  PASSED  ] 52 tests.
[  SKIPPED ] 12 tests, listed below:
[  SKIPPED ] TensorDims/attr_test_t.TestMatmulShouldCallSameImplementationWithAttributes/0
[  SKIPPED ] TensorDims/attr_test_t.TestMatmulShouldCallSameImplementationWithAttributes/1
[  SKIPPED ] TensorDims/attr_test_t.TestMatmulShouldCallSameImplementationWithAttributes/2
[  SKIPPED ] TensorDims/attr_test_t.TestMatmulShouldCallSameImplementationWithAttributes/3
[  SKIPPED ] TensorDims/attr_test_t.TestMatmulShouldCallSameImplementationWithAttributes/4
[  SKIPPED ] Generic_f16/iface.TestsMatMul/0
[  SKIPPED ] Generic_f16/iface.TestsMatMul/1
[  SKIPPED ] Generic_f16/iface.TestsMatMul/2
[  SKIPPED ] Generic_f16/iface.TestsMatMul/3
[  SKIPPED ] Generic_f16/iface.TestsMatMul/4
[  SKIPPED ] Generic_f16/iface.TestsMatMul/5
[  SKIPPED ] Generic_f16/iface.TestsMatMul/6 

2. make test

96% tests passed, 8 tests failed out of 224

Total Test time (real) = 1291.56 sec

The following tests FAILED:
         72 - test_iface_attr (Failed)
        113 - test_internals_env_vars_dnnl (Failed)
        138 - test_graph_unit_interface_graph_cpu (Failed)
        172 - test_graph_unit_dnnl_large_partition_cpu (Failed)
        174 - test_graph_unit_dnnl_matmul_cpu (Failed)
        195 - test_benchdnn_modeC_binary_ci_cpu (Failed)
        196 - test_benchdnn_modeC_binary_different_dt_ci_cpu (Failed)
        204 - test_benchdnn_modeC_graph_ci_cpu (Failed)
Errors while running CTest
Output from these tests are in: /home/vishwas/oss/viswas/oneDNN/build/Testing/Temporary/LastTest.log
Use "--rerun-failed --output-on-failure" to re-run the failed cases verbosely.
make: *** [Makefile:71: test] Error 8

3. benchdnn tests

./benchdnn --matmul --batch=inputs/matmul/test_matmul_bfloat16

tests:7835 passed:7825 skipped:10 mistrusted:0 unimplemented:0 invalid_arguments:0 failed:0 listed:0
total: 36.01s; create_pd: 0.24s (1%); create_prim: 0.40s (1%); fill: 11.08s (31%); execute: 6.65s (18%); compute_ref: 6.37s (18%); compare: 6.38s (18%);

./benchdnn --matmul --batch=inputs/matmul/test_matmul_all

tests:37359 passed:29520 skipped:7679 mistrusted:160 unimplemented:0 invalid_arguments:0 failed:0 listed:0
total: 208.67s; create_pd: 1.15s (1%); create_prim: 2.29s (1%); fill: 62.90s (30%); execute: 58.87s (28%); compute_ref: 30.08s (14%); compare: 29.87s (14%);

Clang formatting done

vishwascm · May 28 '25 12:05

Thanks for the review, @jondea. I will go through your queries and reply soon.

vishwascm · Jun 03 '25 11:06