Siddhartha Menon
Would appreciate a review from @uxlfoundation/onednn-arch, thanks.
@jondea Removed the block. You can merge when you're ready.
Closing this as abandoned.
@mgouicem Thanks for the summary. I think that is an accurate recap of the main points. We will discuss internally and see if there is a balance that can be...
@Radu2k can this PR be closed as stale?
@Shreyas-fuj @kasturedeeksha This can happen with the 256-bit kernels too, just on larger shapes.
```
$ ./build/tests/benchdnn/benchdnn -v5 --conv --dt=bf16 --attr-post-ops=gelu_tanh+gelu_erf g1mb1ic1000ih1000iw1000oc1000oh1000ow1000kh2kw2sh1sw1ph0pw0dh0dw0
create: --conv --dt=bf16:bf16:bf16 --attr-post-ops=gelu_tanh+gelu_erf g1mb1ic1000ih1000oc1000oh1000kh2ph0
oneDNN implementation: brgconv:sve_256...
```
> 1024x1024:1024x1024: brg = 0.625, acl = 0.393

Any idea why this shape seems to behave so differently from the rest? It makes me wonder how the acl impl would...
> If you mean why does brgemm become slower than acl:gemm here, it's because brgemm currently uses bfdot instruction instead of bfmmla. It seems that makes the difference from 512:512:512...
Thanks for the patch @zhili03. Could you please resolve the [formatting failures](https://github.com/oneapi-src/oneDNN/actions/runs/13771178709/job/38509886814?pr=2849)? Also nit: we usually prefix the aarch64 changes' commit messages with `cpu: aarch64` rather than just `cpu:`. This...
@vpirogov Yes, still in review