Marek Michalowski
@kasturedeeksha the issue seems to be that benchdnn estimates the problem size incorrectly for brgconv:sve_256. When running the x86 impl, benchdnn estimates the problem to require 17.7 GB of memory...
Just a heads up: this is now giving incorrect results when running u8/s8 brgemm, e.g. `./tests/benchdnn/benchdnn --brgemm --dt=u8:s8:u8 13x192:192x32_n"int8:no_tail:21"` (from test_benchdnn_modeC_brgemm_ci_cpu), which gives expected results with sve_256 but not sve_128....
> > 1024x1024:1024x1024: brg = 0.625, acl = 0.393 > > Any idea why this shape seems to behave so differently from the rest? If you mean why does brgemm...
As this change is pretty big, do you think it would be possible to neatly split it into two commits: one for the SVE optimizations and one for the ASIMD...
> Also, can we make sure this is tested in CI somehow? As it is right now, building with `DNNL_EXPERIMENTAL_UKERNEL` is going to use the external brgemm api for `./benchdnn...
> Now that the API is enabled, I expected the example `examples/ukernels/cpu_brgemm.cpp` to be functional. Still, when running it after enabling `-DDNNL_EXPERIMENTAL_UKERNEL=ON`, it returns the error `Kernel is not supported...
@dzarukin we need a review from the onednn-arch team, could you have a look please? Thanks!
Is this change still in progress? Or should we close this PR?