Jakub Kuderski

83 comments by Jakub Kuderski

> I think the scheduling barrier has existed long enough that all the ROCm-side targets we care about (CDNA and RDNA included) have it. This...

> Having gone through this useful discussion, I lean toward applying the original proposal of a straightforward conditional check, and maybe a TODO for us to come back and...

> I can open a new ticket to address that, and we can continue the discussion separately so I don't end up being the guy who disrupts IREE coding...

@Muzammiluddin-Syed-ECE make sure you click the 'Resolve conversation' button under comments you believe are addressed. This makes it much easier to iterate on pull requests.

@nirvedhmeshram We can add a few more. Do you have some specific shapes you are interested in?

Ah, interesting, could you add this shape to iree-kernel-benchmark? We can tag it as 'corner_case'.

> This is ready for review, but is kind of large. If reviewers prefer I can split this further by pass (I was just being lazy about figuring out how...

> `#iree_gpu.sparse_mma_layout` I'm not sure if we need that -- from the POV of IREE, our codegen will target dense MMA -- would it be possible to leave the element...

Ideas from yesterday's discussion with @Groverkss:

1. New parallel partial reduction dimension should be outermost in the output
2. Split-k subgroup on reduction dimension for tall skinny gemm
3. Expanding...

Raw notes about lowering_config semantics for matvecs: https://gist.github.com/Groverkss/015ed5af8db6e804bdf560fc35db1d4f

---

```
WORKGROUP = []
partial_reduction = []
subgroup_basis --> numSubgroups[dim]
thread_basis --> numThreads[dim]
thread --> vectorSize[dim]
dim -> workgroup/partial_reduction
numSubgroups[dim] *...
```