Milos Puzovic
@elfringham, thanks for your fix. > @milpuz01 Do these features still need to be disabled for the best performance with ACL? Yes, they are still needed. Unfortunately, we should have...
We are about to start the exploratory work for 24.05 to integrate two oneDNN primitives, convolution and matrix multiplication, with the existing non-public ACL API for using stateless objects in...
Hi @mdfaijul, as @elfringham identified, the failure comes from the `Setup()` method in the `MklMatMulPrimtive` class, when you are trying to set `b_mem` in `context_` by reading the weight description from `prim_desc` that...
> @milpuz01, suppose you've also kicked off some tests for ARM platforms as per previous email communication. Could you share any test result if available? Hi @Guobing-Chen, we still do...
@malfet @atalman On the Arm side, @murste01 is setting up the same workflow that Intel uses to validate the performance of oneDNN. He will take this discussion up with @Guobing-Chen and @Xia-Weiwen...
INT8 kernels use MMLA instructions instead of the MLAs that FP32 and FP16 use, and they work the core/memory system much harder. As a result, with MMLA we can get anywhere between...
Hi @alvoron, > We'll have a look at the code in OneDNN. Since oneDNN 3.6 we have been using stateless operators from ACL, so in the implementation of the oneDNN `matmul` primitive we...