simveit

Results 35 comments of simveit

Hello @prateekshukla1108, you can see the whole setup in the gist above.

I don't think it's related to that. A naive version of the kernel that assigns the thread layout the same way but uses no SMEM works as expected. See [here](https://gist.github.com/simveit/7ba23f8a5d865c9d376a7fd313d17bf4).

I will take a look. I need to study CuTe layouts more closely. Thanks @lijingticy22

@ehsanmok I addressed your issues and included a benchmark at the end. I allocated the shared memory via `stack_allocation` to keep it concise.