[GLUON] coalesced layout

Open hgl71964 opened this issue 1 month ago • 6 comments

This is the initial version of the efficient layout. I would like to hear your thoughts. A lot of the code is duplicated from the resolveAutoEncoding pass. Should I factor it out for reuse?

hgl71964 avatar Oct 31 '25 15:10 hgl71964

Not a reviewer, but we have a need for a similar thing in our code (a rough version is here: https://github.com/sublinear-systems/triton-utils/blob/dd107ff4f0cad3cfcc6ca017c2cf040b3f013b73/utils.py#L244). We have found that picking a layout with maximal coalescing and then tiling it to cover all dimensions works best for us, although in some cases we request a "priority" dimension because we'll do a reduction across it. Do you think this is worth considering for what you're working on? Also, bikeshedding: a lot of layouts are "efficient" depending on the context, so this is really a "coalesced layout" or similar :)

saagarjha avatar Oct 31 '25 17:10 saagarjha
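
For readers following along, here is a minimal sketch of the "maximally coalesced, then tiled" idea from the comment above. It is not the code from the linked repository or from this PR. The helper `coalesced_blocked_params` and the `BlockedParams` container are hypothetical names; the field names only mirror the usual blocked-layout parameters (size per thread, threads per warp, warps per CTA, order). The sketch assumes power-of-two shapes, a 32-thread warp, and 16-byte vector accesses, and it ignores pointer alignment and the "priority" dimension mentioned above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BlockedParams:
    # Hypothetical container; fields mirror common blocked-layout parameters.
    size_per_thread: tuple
    threads_per_warp: tuple
    warps_per_cta: tuple
    order: tuple


def _spread(budget, shape, covered, order):
    """Hand out `budget` units (threads or warps) along `order`, fastest dim first."""
    out = [1] * len(shape)
    for d in order:
        take = min(budget, max(1, shape[d] // covered[d]))
        out[d] = take
        budget //= take
    # Any leftover units go to the slowest dim (they hold replicated data).
    out[order[-1]] *= budget
    return out


def coalesced_blocked_params(shape, strides, elem_bytes, num_warps=4,
                             warp_size=32, max_vec_bytes=16):
    # Assumption: every extent is a power of two, so all divisions are exact.
    assert all(s > 0 and (s & (s - 1)) == 0 for s in shape)
    rank = len(shape)

    # Fastest-varying dimension first: sort dimensions by increasing stride.
    order = sorted(range(rank), key=lambda d: strides[d])
    fastest = order[0]

    # Vectorize only along a unit-stride dimension.
    size_per_thread = [1] * rank
    if strides[fastest] == 1:
        vec = max(1, max_vec_bytes // elem_bytes)
        size_per_thread[fastest] = min(vec, shape[fastest])

    # Adjacent threads cover adjacent vectors, then spill into slower dims.
    threads_per_warp = _spread(warp_size, shape, size_per_thread, order)

    # Warps tile whatever the threads of a single warp do not cover.
    covered = [s * t for s, t in zip(size_per_thread, threads_per_warp)]
    warps_per_cta = _spread(num_warps, shape, covered, order)

    return BlockedParams(tuple(size_per_thread), tuple(threads_per_warp),
                         tuple(warps_per_cta), tuple(order))


# Row-major 64x128 fp16 tile: each thread reads 8 contiguous elements.
params = coalesced_blocked_params((64, 128), (128, 1), elem_bytes=2)
print(params)
```

For the row-major example, 16 threads cover one 128-element row with 8-element vectors, the remaining 2 threads of the warp step to the next row, and the 4 warps tile the rows. A "priority" dimension could be added by moving that dimension within `order` before distributing threads, but how the linked code does it is not shown here.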

cc @lezcano for the layout question.

+1 for CoalescedLayout. Indeed that seems more precise here, and would leave more space for future developments.

peterbell10 avatar Oct 31 '25 17:10 peterbell10

Can you give some context of this work? What is meant by "efficient" here, and why is this specific to Gluon?

masahi avatar Oct 31 '25 19:10 masahi

In Gluon, layouts have to be picked manually. For certain ops, e.g. load/store, we actually know the "efficient" layout, so the motivation is to implement a mechanism that picks it automatically. But sure, "coalesced" layout sounds better, and the Python-side solution seems interesting.

hgl71964 avatar Oct 31 '25 22:10 hgl71964

The review comments are addressed except for the refactoring part. Could you take a look? @peterbell10

hgl71964 avatar Nov 06 '25 21:11 hgl71964

The refactor and tests are finished @peterbell10. cc @lezcano

hgl71964 avatar Nov 08 '25 16:11 hgl71964