Arthur Araujo Mitrano
I was able to reproduce it with d279b39. We already have a fix for it, and it should be out soon. As a workaround, you can add `-DCMAKE_BUILD_TYPE=Release` to the...
You are correct, that is the default build mode. This is a build-related issue. Even though that's the default build mode, it seems the s8s8 gemm file (`mkl-dnn/src/cpu/gemm/s8x8s32/simple_gemm_s8s8s32.cpp`) was not being...
The build without optimizations was approximately 3x slower. If both builds have optimizations enabled, I don't see a significant difference on my system. In other words, I can't reproduce it....
@guillaumekln I was able to reproduce it on an i7-6700K. There seem to be two issues, one with the build system and one with s8s8 gemm. Thank you for your...
I did a quick investigation. It seems this is a real issue, but it is not simple to fix, since it depends on how memory is managed in DNNL. As...
Hi @ppetrushkov, the difference seems small. Would it be possible to provide a reproducer?
Hi @ppetrushkov - Thank you very much for the reproducer. `dnnl_gemm_u8s8s32` doesn't pre-pack matrices, so we are paying the cost of packing on each call. We have a pack API where...
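To illustrate why packing on each call costs extra, here is a minimal sketch of the pack-once, reuse-many pattern. It is not the DNNL pack API: `PackedB`, `pack_b`, `gemm_packed`, and `gemm_unpacked` are hypothetical names, and the "packing" is assumed to be a plain transpose of a row-major B, which is far simpler than what an optimized kernel does.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical packed form of B (K x N, row-major input): stored transposed,
// so column n of B is contiguous at data[n*K .. n*K + K).
struct PackedB {
    std::size_t K, N;
    std::vector<std::int8_t> data;
};

// Pack B once. This is the work a non-pre-packing gemm repeats on every call.
PackedB pack_b(const std::vector<std::int8_t>& B, std::size_t K, std::size_t N) {
    assert(B.size() == K * N);
    PackedB p{K, N, std::vector<std::int8_t>(K * N)};
    for (std::size_t k = 0; k < K; ++k)
        for (std::size_t n = 0; n < N; ++n)
            p.data[n * K + k] = B[k * N + n];
    return p;
}

// C (M x N, s32) = A (M x K, u8, row-major) * B (s8, already packed).
std::vector<std::int32_t> gemm_packed(const std::vector<std::uint8_t>& A,
                                      std::size_t M, const PackedB& Bp) {
    assert(A.size() == M * Bp.K);
    std::vector<std::int32_t> C(M * Bp.N, 0);
    for (std::size_t m = 0; m < M; ++m)
        for (std::size_t n = 0; n < Bp.N; ++n) {
            std::int32_t acc = 0;
            for (std::size_t k = 0; k < Bp.K; ++k)
                acc += static_cast<std::int32_t>(A[m * Bp.K + k]) *
                       static_cast<std::int32_t>(Bp.data[n * Bp.K + k]);
            C[m * Bp.N + n] = acc;
        }
    return C;
}

// Same result, but B is re-packed on every call.
std::vector<std::int32_t> gemm_unpacked(const std::vector<std::uint8_t>& A,
                                        std::size_t M,
                                        const std::vector<std::int8_t>& B,
                                        std::size_t K, std::size_t N) {
    return gemm_packed(A, M, pack_b(B, K, N));
}
```

When the same B (for example, a weight matrix) is multiplied many times, calling `pack_b` once and reusing the result in `gemm_packed` avoids the repeated packing work that `gemm_unpacked` performs.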
@Ryo-not-rio - Issue should be addressed. Would it be possible for you to check on your side?
@penpornk - Thanks for the reproducer. We can reproduce the same output on our side and are taking a look. Tagging @dzarukin to keep him in the loop.
From the gemm side, it would be non-trivial to guarantee the same order of computations, and it would have performance implications. Not only would tail handling have to be changed, but we...
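As a small illustration of why the order of computations matters, the sketch below sums the same four floats two ways: strictly left to right, and in blocks whose partial sums are then added. It assumes the gemm in question accumulates in floating point. `sum_sequential` and `sum_blocked` are hypothetical helpers written for this example, not DNNL code.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Sum left to right: ((x0 + x1) + x2) + ...
float sum_sequential(const std::vector<float>& x) {
    volatile float acc = 0.0f;  // volatile forces rounding to float at each step
    for (float v : x) acc = acc + v;
    return acc;
}

// Sum in blocks of `block` elements, then add the partial sums, roughly how a
// blocked kernel with a separate tail groups its additions.
float sum_blocked(const std::vector<float>& x, std::size_t block) {
    assert(block > 0);
    volatile float total = 0.0f;
    for (std::size_t start = 0; start < x.size(); start += block) {
        std::size_t end = start + block < x.size() ? start + block : x.size();
        volatile float partial = 0.0f;
        for (std::size_t i = start; i < end; ++i) partial = partial + x[i];
        total = total + partial;
    }
    return total;
}
```

For the input `{1e8f, 1.0f, -1e8f, 1.0f}`, `sum_sequential` returns `1.0f`, while `sum_blocked` with a block size of 2 returns `0.0f`, because each `1.0f` is absorbed when added to a value of magnitude `1e8` in single precision. Changing the blocking or the tail handling changes which additions are grouped together, and therefore the rounded result.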