ziyuhuang123
No, I mean.... can it be used directly on Windows?
oh, thanks!
Like the example here, what does the output mean?
```
// Tile a tensor according to the flat shape of a layout that provides the coordinate of the target index....
```
So when I print `A.shape`, I still get a value, but it is different from the shape in the original tensor definition. So why is it still a "tensor" object here??? Confusing!
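To make my confusion concrete, here is a tiny host-side sketch of what I am doing (the 16x64 size, the buffer, and the 2x16 thread layout are just my own example, not taken from the quoted doc):

```cpp
#include <cstdio>
#include <vector>
#include <cute/tensor.hpp>
using namespace cute;

int main()
{
  std::vector<float> buf(16 * 64);

  // A CuTe "Tensor" is just an iterator/pointer plus a Layout; it does not own the data.
  Tensor A = make_tensor(buf.data(), make_layout(make_shape(Int<16>{}, Int<64>{})));

  print(A.shape());   // (_16,_64)
  printf("\n");
  print(A.layout());  // (_16,_64):(_1,_16)  -- column-major by default
  printf("\n");

  // local_partition returns another Tensor: same data pointer, new (sliced) layout.
  Tensor thrA = local_partition(A, Layout<Shape<_2, _16>>{}, 0);
  print(thrA.shape());  // (_8,_4)
  printf("\n");
  return 0;
}
```

So if I understand correctly, `local_partition` still gives back a Tensor (same pointer, different layout), which is probably why it still prints a shape, just not the one I declared.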
It seems that `Step<_1, X>` means: for the first dimension, divide as normal; for the second dimension, do not divide (that mode of the tiler is ignored).
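If my reading is right, the projection works like this sketch (names such as `mA`, `cta_tiler`, and the three `Step<>` patterns follow examples/cute/tutorial/sgemm_1.cu; the shapes in the comments are my assumption):

```cpp
#include <cute/tensor.hpp>
using namespace cute;

// How I understand the Step<> projection when tiling the problem for one CTA.
// cta_tiler is (BLK_M, BLK_N, BLK_K); the CTA coordinate leaves the K mode open.
template <class TA, class TB, class TC, class Tiler>
__device__ void tile_for_cta(TA mA, TB mB, TC mC, Tiler cta_tiler)
{
  auto cta_coord = make_coord(blockIdx.x, blockIdx.y, _);  // (m, n, k)

  // _1 = apply that mode of the tiler, X = ignore (project out) that mode.
  Tensor gA = local_tile(mA, cta_tiler, cta_coord, Step<_1,  X, _1>{});  // (BLK_M, BLK_K, k)
  Tensor gB = local_tile(mB, cta_tiler, cta_coord, Step< X, _1, _1>{});  // (BLK_N, BLK_K, k)
  Tensor gC = local_tile(mC, cta_tiler, cta_coord, Step<_1, _1,  X>{});  // (BLK_M, BLK_N)
}
```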
Also, how exactly does `local_partition` divide the data? Do we get bank conflicts (yes, we will), and how can we avoid them?
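For context, this is the partitioning pattern I am looking at, written as a kernel-body sketch in the style of sgemm_1.cu (the 32x8 thread layout and the tile names are my assumption):

```cpp
#include <cute/tensor.hpp>
using namespace cute;

// gA is the (BLK_M, BLK_K, k) global tile, sA the (BLK_M, BLK_K) shared-memory tile.
template <class GTensor, class STensor>
__device__ void copy_tile_per_thread(GTensor gA, STensor sA)
{
  // 256 threads viewed as a 32x8 (column-major) grid laid over the tile.
  auto tA = make_layout(make_shape(Int<32>{}, Int<8>{}));

  // Each thread owns the elements at its own (threadIdx.x % 32, threadIdx.x / 32)
  // position, repeated every 32 rows and every 8 columns of the tile,
  // so its slice has shape (BLK_M/32, BLK_K/8).
  Tensor tAgA = local_partition(gA, tA, threadIdx.x);  // (THR_M, THR_K, k)
  Tensor tAsA = local_partition(sA, tA, threadIdx.x);  // (THR_M, THR_K)

  // Copy the first k-tile of A from global to shared memory.
  copy(tAgA(_, _, 0), tAsA);
}
```

From what I have read, whether the shared-memory accesses conflict depends on the layout of `sA`, and padding or swizzling that layout is the usual remedy, but I am not sure what the idiomatic CuTe way to do it is.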
Emmm, thank you, I have read all three blogs you mentioned, but they are discussing CUDA cores..... I am learning Tensor Cores, so I am reading CUTLASS.
> Off topic: Just came across this issue (as a github-mancer). Based on your recent questions I assume you want to write gemm from ground up. And not to be...
I mean, using CuTe.
Thank you very much for your reply!!!! I noticed you are using `auto thr_mma = tiled_mma.get_slice(thread_idx);`. So what is the difference between that and `auto tAgA = local_partition(gA, tA, threadIdx.x); // (THR_M,THR_K,k)`...
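To make the comparison concrete, here is a kernel-body sketch of the two styles side by side (the `UniversalFMA` atom and the 16x16x1 thread layout are only my assumption, following the tutorial sgemm examples):

```cpp
#include <cute/tensor.hpp>
using namespace cute;

// sA is (BLK_M, BLK_K) and sB is (BLK_N, BLK_K) in shared memory; gC is (BLK_M, BLK_N).
template <class SA, class SB, class GC>
__device__ void compare_partitionings(SA sA, SB sB, GC gC)
{
  // Style 1: local_partition with an explicit thread layout.
  // I have to pick tC and the Step<> projections myself and keep A/B/C consistent.
  auto tC = make_layout(make_shape(Int<16>{}, Int<16>{}));
  Tensor tCsA = local_partition(sA, tC, threadIdx.x, Step<_1,  X>{});  // (THR_M, BLK_K)
  Tensor tCsB = local_partition(sB, tC, threadIdx.x, Step< X, _1>{});  // (THR_N, BLK_K)
  Tensor tCgC = local_partition(gC, tC, threadIdx.x, Step<_1, _1>{});  // (THR_M, THR_N)

  // Style 2: TiledMMA::get_slice. The MMA atom plus its thread layout decide the
  // partitioning, so partition_A/B/C automatically match how the MMA consumes data.
  TiledMMA mma = make_tiled_mma(UniversalFMA<float, float, float>{},
                                Layout<Shape<_16, _16, _1>>{});
  auto thr_mma = mma.get_slice(threadIdx.x);
  Tensor tCsA2 = thr_mma.partition_A(sA);  // (MMA, MMA_M, MMA_K)
  Tensor tCsB2 = thr_mma.partition_B(sB);  // (MMA, MMA_N, MMA_K)
  Tensor tCgC2 = thr_mma.partition_C(gC);  // (MMA, MMA_M, MMA_N)
}
```

My naive understanding is that `local_partition` just slices by a thread layout I choose, while `get_slice` derives each thread's view from the MMA atom itself, which should matter once a real tensor-core MMA atom replaces the FMA. Is that right?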