[FEA] Can I use copy to store register value into shared memory?
I have a register tensor tCrC and want to store its values into shared memory. Can the copy function do that? Thanks!
I mean, using CuTe.
If you've created your tCrC using some partitioner, then that same partitioner should be applied to an smem tensor. For example, we usually see this pattern:
Tensor gC = ... // (BLK_M,BLK_N)
Tensor sC = make_tensor(make_smem_ptr(ptr), shape(gC)); // (BLK_M,BLK_N)
auto thr_mma = tiled_mma.get_slice(thread_idx);
Tensor tCgC = thr_mma.partition_C(gC); // (MMA,MMA_M,MMA_N)
Tensor tCsC = thr_mma.partition_C(sC); // (MMA,MMA_M,MMA_N)
Tensor tCrC = thr_mma.make_fragment_C(tCgC); // (MMA,MMA_M,MMA_N)
...
copy(tCrC, tCgC);
// or
copy(tCrC, tCsC);
copy(tCsC, tCgC);
// or
copy(tCrC, tCsC);
__syncthreads(); // Make every thread's smem writes visible before one thread reads the whole tile
if (threadIdx.x == 0) {
  copy(sC, gC); // Copy the whole tile
}
// or re-partition sC and gC, etc
This should, of course, also work with any shared memory layout for sC, so long as its shape is still (compatible with) BLK_M x BLK_N.
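To make the layout point concrete, here is a minimal host-only sketch in plain C++. It does not use CuTe at all: RowMajor, PaddedColMajor, partition_coords, and store_and_read_back are hypothetical stand-ins invented for illustration, and the strided ownership pattern is an assumption, not what a real TiledMMA produces. It only shows that when the same partition is applied to the register fragment and to the shared-memory tile, the logical tile comes out the same whichever layout backs the shared memory.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for shared-memory layouts: each maps a logical
// (m, n) coordinate of the tile to a linear offset. These are NOT CuTe types.
struct RowMajor {
  int M, N;
  std::size_t operator()(int m, int n) const { return std::size_t(m) * N + n; }
  std::size_t cosize() const { return std::size_t(M) * N; }
};

struct PaddedColMajor {
  int M, N, pad;  // `pad` unused elements follow each column
  std::size_t operator()(int m, int n) const { return std::size_t(n) * (M + pad) + m; }
  std::size_t cosize() const { return std::size_t(N) * (M + pad); }
};

struct Coord { int m, n; };

// Hypothetical partitioner: thread `tid` owns every `num_threads`-th logical
// element of the tile. A real TiledMMA uses a different ownership pattern.
std::vector<Coord> partition_coords(int M, int N, int num_threads, int tid) {
  assert(num_threads > 0 && tid >= 0 && tid < num_threads);
  std::vector<Coord> coords;
  for (int i = tid; i < M * N; i += num_threads) coords.push_back({i / N, i % N});
  return coords;
}

// Simulates every thread storing its fragment into the shared tile, then
// returns the tile read back in logical row-major order.
template <class Layout>
std::vector<int> store_and_read_back(Layout layout, int num_threads) {
  std::vector<int> smem(layout.cosize(), -1);
  for (int tid = 0; tid < num_threads; ++tid) {
    const std::vector<Coord> coords =
        partition_coords(layout.M, layout.N, num_threads, tid);
    // "Register fragment": one value per owned element, encoding its coordinate.
    std::vector<int> frag;
    for (const Coord& c : coords) frag.push_back(c.m * 100 + c.n);
    // Analogue of copy(tCrC, tCsC): the destination uses the same partition.
    for (std::size_t k = 0; k < coords.size(); ++k)
      smem[layout(coords[k].m, coords[k].n)] = frag[k];
  }
  std::vector<int> tile;
  for (int m = 0; m < layout.M; ++m)
    for (int n = 0; n < layout.N; ++n) tile.push_back(smem[layout(m, n)]);
  return tile;
}
```

Calling store_and_read_back(RowMajor{4, 8}, 8) and store_and_read_back(PaddedColMajor{4, 8, 1}, 8) gives the same logical tile, even though the underlying offsets differ. In this toy model, that is the sense in which the smem layout is free to change as long as the shape stays compatible.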
Thank you very much for your reply!!!!
I noticed you are using "auto thr_mma = tiled_mma.get_slice(thread_idx);". How does that differ from "auto tAgA = local_partition(gA, tA, threadIdx.x); // (THR_M,THR_K,k)"?
The first is constructing an MMA partitioner from a TiledMMA (which is usually used to create tCxY partitioned tensors) and the second is partitioning with respect to the thread layout tA.
Hmm, so the output is the same, right? I mean, the outputs are all just "tensor" types, so no matter how I get a tensor, I can do "copy(tCrC, tCsC);" and CuTe will find a way to do the copy?
Thank you!!!
No, they are not related as they apply distinct partitioning patterns.
I suggest you review the existing documentation and wait for our updated documentation+examples coming soon.
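As a rough illustration of why the two are not interchangeable, here is a small host-only sketch in plain C++ (no CuTe). Both partitioners are hypothetical models written for this example: thread_layout_partition loosely mimics, as far as I understand it, how a 2x2 column-major thread layout would be tiled over a 4x4 tensor, and row_owner_partition is a made-up pattern standing in for an MMA-defined one, not the layout of any real MMA atom.

```cpp
#include <cassert>
#include <utility>
#include <vector>

using Coord = std::pair<int, int>;  // (row, column) in a 4x4 tile

// Model of partitioning by a 2x2 column-major thread layout: thread (tm, tn)
// owns the elements (tm + 2*i, tn + 2*j). Hypothetical helper, not a CuTe call.
std::vector<Coord> thread_layout_partition(int tid) {
  assert(tid >= 0 && tid < 4);
  const int tm = tid % 2;
  const int tn = tid / 2;
  std::vector<Coord> coords;
  for (int j = 0; j < 2; ++j)
    for (int i = 0; i < 2; ++i) coords.push_back({tm + 2 * i, tn + 2 * j});
  return coords;
}

// Made-up "MMA-style" pattern: thread `tid` owns all of row `tid`.
// This is an assumption for illustration only.
std::vector<Coord> row_owner_partition(int tid) {
  assert(tid >= 0 && tid < 4);
  std::vector<Coord> coords;
  for (int n = 0; n < 4; ++n) coords.push_back({tid, n});
  return coords;
}

// True only if every thread owns the same elements, in the same order,
// under both partitioners.
bool same_pattern() {
  for (int tid = 0; tid < 4; ++tid)
    if (thread_layout_partition(tid) != row_owner_partition(tid)) return false;
  return true;
}
```

Here same_pattern() returns false: both partitioners hand each thread four elements, but not the same four. So a fragment produced under one pattern does not line up element by element with a tensor partitioned under the other, even though both are "just tensors" with matching sizes.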
Wow! That's cool! Actually, I am writing the relevant code because my supervisor is pushing for it, haha... so I can't wait too long. He is a nice guy, though, and I am also interested in CuTe myself.
Eagerly waiting for your update!
Currently I only see one code example... Do you know of any more code written with CuTe?
Oh, I noticed most GEMM code is written with the previous CUTLASS... Maybe I should use the older version? Because that doc is more...
Closing due to inactivity.