Benoit Jacob

Results 95 comments of Benoit Jacob

I have this [old script](https://gist.github.com/bjacob/1f92a549d330ab716c21ba332c50a3c3) to break down TOSA ops/shapes... here is what it says about this model:

```
1196 tosa.rescale
393 tosa.reshape
186 tosa.mul
99 tosa.add
74 tosa.sub
74...
```
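A per-op breakdown like the one above can be approximated with a few lines of Python. This is a minimal sketch and not the linked gist: the names `count_tosa_ops` and `format_histogram` are hypothetical, and it assumes the model is available as MLIR text in which each op occurrence is spelled `tosa.<name>`. It counts op names only, not shapes, and it would also count matches that appear inside comments or attributes.

```python
import re
from collections import Counter

# Matches op names such as `tosa.rescale` in MLIR assembly text.
# Assumption: every occurrence of `tosa.<name>` in the text is an op.
TOSA_OP = re.compile(r"\btosa\.[A-Za-z_][A-Za-z0-9_]*")


def count_tosa_ops(mlir_text: str) -> Counter:
    """Return a Counter mapping each TOSA op name to its occurrence count."""
    return Counter(TOSA_OP.findall(mlir_text))


def format_histogram(counts: Counter) -> str:
    """Format counts as `<count> <op>` lines, most frequent first."""
    return "\n".join(f"{n} {op}" for op, n in counts.most_common())


# Tiny hand-written sample, only to exercise the functions.
sample = """
%0 = tosa.rescale %arg0
%1 = tosa.rescale %0
%2 = tosa.add %0, %1
"""

if __name__ == "__main__":
    print(format_histogram(count_tosa_ops(sample)))
```

Sorting by frequency puts the dominant ops first, which is usually what matters when deciding where compile-time or runtime effort goes.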

Also, it's interesting that Ruy is 2x faster than IREE even though Ruy only knows how to use the `+dotprod` extension, not `+i8mm`, implying it should actually be 2x...

Yes, avoid `+sve` and `+sve2` as they disable data-tiling ( @banach-space FYI ). No need to specify `+reserve-x18`, IREE does that internally. Thanks for gathering the detailed profiles above. Now...
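The advice on CPU features can be applied mechanically to a comma-separated feature string. The helper below is a hypothetical sketch, not part of IREE: it assumes features are written as `+name` tokens separated by commas, and it drops `+sve` and `+sve2` (which disable data-tiling, per the comment above) along with `+reserve-x18` (which is unnecessary because IREE handles it internally).

```python
# Features to drop, per the comment above. This set is an illustration,
# not an exhaustive or official list.
DROPPED_FEATURES = {"+sve", "+sve2", "+reserve-x18"}


def clean_cpu_features(features: str) -> str:
    """Remove unwanted entries from a comma-separated CPU feature string."""
    tokens = (token.strip() for token in features.split(","))
    kept = [t for t in tokens if t and t not in DROPPED_FEATURES]
    return ",".join(kept)


if __name__ == "__main__":
    print(clean_cpu_features("+dotprod,+i8mm,+sve,+sve2,+reserve-x18"))
```

Matching whole tokens rather than substrings keeps unrelated features whose names merely start with `+sve` from being removed by accident.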

> On several models, we have seen that when `DT+UK` is enabled and microkernels are not available, it ends up being 2-3x slower than `DT` on its own. This is...

Regarding data-tiling: At the moment, data-tiling sizes are set in https://github.com/openxla/iree/blob/14927d15c19710bcdd4d630e62b21428424d6ef6/compiler/src/iree/compiler/Codegen/Common/CPU/CPUMaterializeEncodingPass.cpp . There's nothing particularly hard about making this pluggable, we'd just need to understand precisely the dimensions of pluggability...

...So regarding #16259 specifically, I think that we could choose to actually just merge it as-is, or just leave out the arm64 part (that is the one part that's a...

@mariecwhite : - The CPUMaterializeEncodingPass change is exactly what I was referring to above regarding data-tiling. It really is about data-tiling, not ukernel. It's TBD how to make that pluggable...

We discussed this for the whole 30-min mai-tai meeting today. If I correctly understood what @stellaraccident said, the most significant bit was that we should talk less in terms of...

I've filed https://github.com/openxla/iree/issues/16401 to track blockers before we can enable the mmt4d ukernel by default on llvm-cpu.

> Indeed tracy uses the sources from `--iree-hal-dump-executable-sources-to=` but it also requires the intermediates from `IREE_PRESERVE_DYLIB_TEMP_FILES=your/path`. The dylib temp files that `IREE_PRESERVE_DYLIB_TEMP_FILES` preserves are final binaries, not intermediates. The intermediates...