Jorn Tuyls

Results 10 issues of Jorn Tuyls

In case of multiple ObjectFifo consumers with a single implicit `fromStream` BD layout attribute, the `consumerDims` array can contain a single element and `consumerDims[consumerIndex]` can be an out-of-bounds access. Weird...

Towards: https://github.com/iree-org/iree/issues/20835, needed to get e2e Llama3 with padding to compile. This PR avoids inserting padding encodings if the producer dispatch region contains an attention operation as that results in...

Sometimes when we add pad encodings, the blocked dimensions can make it hard to fold the `tensor.collapse_shape` and the `iree_tensor_ext.dispatch.tensor.store` operations, for example: ``` %collapsed = tensor.collapse_shape %36 [[0], [1,...

Towards: https://github.com/iree-org/iree/issues/20835 Supporting dynamic dimensions in the `tensor.collapse_shape` into `iree_tensor_ext.dispatch.tensor.store` folding is needed to support pad encodings in e2e Llama3 as the remaining collapses will otherwise result in `linalg.copy` operations...

This PR makes sure that the encoding attribute is included in the preferred storage type of a `HoistableTensorType`. Without this we can get a `tensor.bitcast` on an encoded type returning...

Creating an issue to discuss and track the scheduling of 'inner-loop ukernels' to replace the existing MLIR ukernels in most workloads. From @MaheshRavishankar on Discord (https://discord.com/channels/689900678990135345/1254843174111678555/1443009208181063805): > I wanted to...

When we enable data-tiling and the set encodings on parameters get hoisted into initializers, the memory footprint doubles, IIUC from loading the original parameters and the output...

To enable data-tiling on Llama 3.1 405B we need a couple of new features/fixes, so I am creating an issue to track the sub-tasks/progress and discuss performance numbers once we get there....

This enables the MLIR tensor ukernels by default as they provide the best performance. All benchmark/production workloads I am aware of seem to set this flag to true already, so...