[GLUON] coalesced layout
This is the initial version of the efficient layout. I would like to hear your thoughts. There is a lot of code duplicated from the resolveAutoEncoding pass. Should I factor that code out for reuse?
Not a reviewer, but we have a need for something similar in our code (a rough version is here: https://github.com/sublinear-systems/triton-utils/blob/dd107ff4f0cad3cfcc6ca017c2cf040b3f013b73/utils.py#L244). We've found that picking a layout with maximal coalescing and then tiling it to cover all dimensions works best for us, although in some cases we request a "priority" dimension because we'll do a reduction across it. Do you think this is worth considering for what you're working on? Also, bikeshedding: a lot of layouts are "efficient" depending on the context, so this is really a "coalesced layout" or similar :)
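To make the "maximal coalescing, then tile" idea concrete, here is a minimal sketch in plain Python. It is not the code from the linked repository or from this PR: `BlockedParams`, `coalesced_blocked_params`, and all of their parameters are hypothetical names, and the handling of `priority_dim` (placing lanes on it right after the contiguous dimension so a reduction stays inside a warp where possible) is only one possible reading of the "priority" dimension. It assumes power-of-two shapes, element sizes, and warp counts.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class BlockedParams:
    # Hypothetical container mirroring the usual blocked-layout parameters.
    size_per_thread: List[int]
    threads_per_warp: List[int]
    warps_per_cta: List[int]
    order: List[int]  # fastest-varying dimension first


def _is_pow2(x: int) -> bool:
    return x > 0 and (x & (x - 1)) == 0


def coalesced_blocked_params(
    shape: List[int],
    elem_bytes: int,
    contiguous_dim: int,
    num_warps: int = 4,
    warp_size: int = 32,
    max_vec_bytes: int = 16,  # assumption: 128-bit vectorized accesses
    priority_dim: Optional[int] = None,
) -> BlockedParams:
    rank = len(shape)
    assert all(_is_pow2(s) for s in shape), "sketch assumes power-of-two shapes"
    assert _is_pow2(elem_bytes) and _is_pow2(num_warps) and _is_pow2(warp_size)

    # Contiguous dimension goes first; remaining dims from innermost to outermost.
    order = [contiguous_dim] + [d for d in reversed(range(rank)) if d != contiguous_dim]
    if priority_dim is not None and priority_dim != contiguous_dim:
        order.remove(priority_dim)
        order.insert(1, priority_dim)

    size = [1] * rank
    tpw = [1] * rank
    wpc = [1] * rank

    # Step 1: each thread owns one vector along the contiguous dimension.
    size[contiguous_dim] = max(1, min(max_vec_bytes // elem_bytes, shape[contiguous_dim]))
    remaining = [s // sz for s, sz in zip(shape, size)]

    # Step 2: tile lanes, then warps, over the dimensions in `order`.
    for counts, budget in ((tpw, warp_size), (wpc, num_warps)):
        for d in order:
            take = min(budget, remaining[d])
            counts[d] = take
            budget //= take
            remaining[d] //= take
        # Anything left over is replicated along the slowest dimension.
        counts[order[-1]] *= budget

    return BlockedParams(size, tpw, wpc, order)


if __name__ == "__main__":
    print(coalesced_blocked_params([64, 128], elem_bytes=2, contiguous_dim=1))
```

The greedy fill keeps adjacent lanes on adjacent vectors of the contiguous dimension first, which is what makes the access coalesced; everything after that is just covering the remaining extent.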
cc @lezcano for the layout question.
+1 for CoalescedLayout. Indeed that seems more precise here, and would leave more space for future developments.
Can you give some context for this work? What is meant by "efficient" here, and why is this specific to Gluon?
In Gluon, layouts have to be picked manually. For certain ops, e.g. load/store, we actually know the "efficient" layout, so the motivation is to implement a mechanism that picks it. But sure, "coalesced layout" sounds better, and the Python-side solution seems interesting.
The review comments are addressed except for the refactoring part. Could you take a look? @peterbell10
The refactor and tests are finished, @peterbell10. cc @lezcano