Jorn Tuyls
> Another issue I am having is that [this](https://github.com/MaheshRavishankar/iree/commit/46acfca7464561c361d446a11ad5c33a9a255b49#diff-1308306ae2802ec5bb022c31338cea3cd8dac61a9cb675541c9bd37009161e16) test currently segfaults with the following error
>
> ```
> LLVM ERROR: can't create Attribute 'mlir::iree_compiler::IREE::Codegen::EncodingNopLayoutAttr' because storage uniquer isn't...
> ```
An update on this: To get the above matmul prototype working, we need(ed) the following changes/fixes:
- https://github.com/iree-org/iree/pull/20969
- https://github.com/iree-org/iree/pull/20971
- https://github.com/iree-org/iree/pull/20845 -> This one is now ready to be reviewed...
> [@jtuyls](https://github.com/jtuyls) what are the padding amounts for these shapes?
>
> > 1024x128xf32, 2048x128xf32 and 4096x2048xf32

The padding amount is [0, 32], not on those parameters, but on the...
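As a rough illustration of what a per-dimension padding amount such as [0, 32] does to a 2-D shape: this sketch assumes each entry is the number of elements appended to the corresponding dimension. The helper name `padded_shape` and the example shapes are hypothetical and not taken from IREE.

```python
from typing import Optional, Sequence, Tuple

Dim = Optional[int]  # None stands for a dynamic dimension (hypothetical convention)


def padded_shape(shape: Sequence[Dim], padding: Sequence[int]) -> Tuple[Dim, ...]:
    """Return the shape after appending `padding[i]` elements to dimension i.

    Assumption: padding entries are per-dimension element counts, so
    [0, 32] leaves the outer dimension alone and grows the inner one by 32.
    Dynamic dimensions (None) stay dynamic.
    """
    if len(shape) != len(padding):
        raise ValueError("shape and padding must have the same rank")
    if any(p < 0 for p in padding):
        raise ValueError("padding amounts must be non-negative")
    return tuple(None if d is None else d + p for d, p in zip(shape, padding))


# Illustrative shapes only: a static 1024x2048 tensor and one with a dynamic outer dim.
static_result = padded_shape((1024, 2048), (0, 32))
dynamic_result = padded_shape((None, 2048), (0, 32))
print(static_result, dynamic_result)
```

Under that assumption only the inner dimension grows, so the row stride changes while the number of rows does not.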
> So we only pad the LHS? > > %12 = iree_tensor_ext.dispatch.tensor.load %10, offsets = [0, 0], sizes = [%9, 2048], strides = [1, 1] : !iree_tensor_ext.dispatch.tensor
@kuhar, I looked at this with @MaheshRavishankar and we noticed that no mfma operations are being generated for the example as 'M' can be fully dynamic. After specifying that M...
Update: I manually adjusted the base IR and IRPA weights file to pad (some of) the weight parameters as well and this results in some speedup:

| IR | Baseline...
> @jtuyls could you rerun this without pingpong?

@kuhar These are all the e2e numbers with and without ping-pong (--iree-codegen-enable-default-tuning-specs=true):

| Model | Baseline (ms) | Baseline tuned (ms) |...
> Another issue with the test IR from [#20835 (comment)](https://github.com/iree-org/iree/issues/20835#issuecomment-2937559669) is that the final matmul is `(f32, f32) -> (f32)` and won't use any of the efficient mfma instructions because...
> This PR adds a new memtile_repeat attribute to the ObjectFifoCreateOp.

Why `memtile_repeat`? This should be applicable to cores as well, so it might be better to just call it...
> I think this works for our current use cases, but it seems dubious to me, since the padding computation could be different for the source and result of the...