ziyuhuang123 issues

Results: 61 issues of ziyuhuang123

[QST] Why does the sm90 MMA have a prologue and a mainloop?

**What is your question?** Here the prologue does more than initialize; it also calls cute::gemm. Why? What is the benefit and meaning of this? Is it for overlap, and if so, how?

question

? - Needs Triage

inactive-30d

[QST] Where is PipelineState defined in cutlass/include/cutlass/gemm/kernel/sm90_gemm_tma_warpspecialized_cooperative.hpp?

**What is your question?** ![image](https://github.com/user-attachments/assets/98eab07b-1903-425e-9439-5178169c52e4) As shown here, I see many uses of PipelineState but cannot find its definition. I do find some in other files, like:
```
using PipelineState = cutlass::PipelineState;...
```

question

? - Needs Triage

[BUG] example48 debug version fails

**Describe the bug** I tried to use cuda-gdb, so I added the debug flags -g -G to example48, and then it failed, even before running cuda-gdb. The cuda-gdb team reports the same bug here:...

bug

? - Needs Triage

[QST] Why do we have an epilogue load in the producer?

**What is your question?** I tried example48 and found that, in the producer, the epilogue is not used at all. I am puzzled about what the producer epilogue is for....

question

? - Needs Triage

[QST] What are these 0 and o in the print output?

**What is your question?** I printed a tensor and got:
```
smem_ptr[16b](0x7fe900000c00) o Sw o _0 o (((_64,_256),_2)):(((_1,_64),_16384))
```
What are these 0 and o in the print output? Where is the...

question

? - Needs Triage
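One way to read the printed tensor above is as a chain `pointer o swizzle o offset o layout`, where `o` is CuTe's notation for function composition and `_0` is a compile-time zero offset. The last factor has the form `shape:stride`. The sketch below is plain C++ rather than CuTe code; it only illustrates how the strides in `(((_64,_256),_2)):(((_1,_64),_16384))` turn a coordinate into a linear element offset before any swizzle is applied. The names `layout_offset` and `kStride*` are hypothetical.

```cpp
#include <cstddef>

// Strides copied from the printed layout (((_64,_256),_2)):(((_1,_64),_16384)).
// These names are illustrative only; they are not CuTe identifiers.
constexpr std::size_t kStrideI = 1;      // mode with extent 64
constexpr std::size_t kStrideJ = 64;     // mode with extent 256
constexpr std::size_t kStrideK = 16384;  // mode with extent 2

// Map a logical coordinate (i, j, k) to a linear element offset:
// the inner product of the coordinate with the strides.
constexpr std::size_t layout_offset(std::size_t i, std::size_t j, std::size_t k) {
    return i * kStrideI + j * kStrideJ + k * kStrideK;
}

// The largest coordinate lands on the last of the 64 * 256 * 2 elements.
static_assert(layout_offset(63, 255, 1) == 64 * 256 * 2 - 1, "layout is compact");
```

Because 64 * 256 = 16384, the third stride simply steps from one 64x256 slab to the next.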

[QST] How do I print a variable's type?

**What is your question?** For example, if I have a variable tensor_3d, how can I find out its type? Something like type(tensor_3d)?

question

? - Needs Triage
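C++ has no built-in `type(x)`, but on the host a demangled `typeid` name gets close. The following is a minimal sketch, assuming GCC or Clang (it relies on `<cxxabi.h>`) and host-side code only; `type_name_of` is a hypothetical helper, not a CUTLASS or CuTe function. Note that `typeid` drops references and top-level `const`.

```cpp
#include <cstdlib>
#include <cxxabi.h>  // abi::__cxa_demangle, GCC/Clang only
#include <memory>
#include <string>
#include <typeinfo>

// Hypothetical helper: returns a human-readable name for the type of `value`.
// Falls back to the mangled name if demangling fails.
template <class T>
std::string type_name_of(const T& /*value*/) {
    int status = 0;
    std::unique_ptr<char, void (*)(void*)> demangled(
        abi::__cxa_demangle(typeid(T).name(), nullptr, nullptr, &status),
        std::free);
    if (status == 0 && demangled) {
        return std::string(demangled.get());
    }
    return std::string(typeid(T).name());
}
```

Calling `type_name_of(tensor_3d)` in host code would then return the full template type as a string. A compile-time alternative is to declare `template <class T> struct show_type;` without defining it and instantiate `show_type<decltype(tensor_3d)>`, so that the compiler error spells out the type.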

[QST] What is Sw<3, 3, 3> in print?

**What is your question?**
```
auto tensor_2d = make_tensor(tensor_3d.data(), make_shape(64, 256));
printf("tensor_2d\n"); print(tensor_2d); printf("\n");
print(tRS_sD); printf("\n");
print(bSG_sD); printf("\n");
print(gD_epi); printf("\n");
```
```
tensor_3d
smem_ptr[16b](0x7f8f00000c00) o Sw o _0 o (((_64,_256),_2)):(((_1,_64),_16384))...
```

question

? - Needs Triage
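As a rough illustration of what a three-parameter swizzle such as `Sw<3, 3, 3>` does, the sketch below assumes the parameters are read as (B bits, M base, S shift), which is my understanding of CuTe's `Swizzle<B, M, S>`: the B bits starting at position M + S are XORed into the B bits starting at position M. The free function `swizzle` is a hypothetical stand-in written in plain C++, and it only covers a non-negative shift.

```cpp
#include <cstdint>

// Hypothetical stand-in for a (B, M, S) swizzle with S >= 0.
// Reads B bits starting at bit (M + S) and XORs them into bits [M, M + B).
constexpr std::uint32_t swizzle(std::uint32_t offset, int B, int M, int S) {
    const std::uint32_t bit_mask = (1u << B) - 1u;
    const std::uint32_t src_mask = bit_mask << (M + S);  // bits that are read
    return offset ^ ((offset & src_mask) >> S);          // bits that are modified
}

// Applying the same swizzle twice returns the original offset,
// because the source bits themselves are never changed.
static_assert(swizzle(swizzle(200u, 3, 3, 3), 3, 3, 3) == 200u, "involution");
```

With (3, 3, 3), offsets below 64 are unchanged, while offset 64 maps to 72: bit 6 is set, so bit 3 is flipped. This kind of permutation is typically used to spread shared-memory accesses across banks.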

[QST] In sm90, where do we set the gridDim?

I know we use persistent blocks, but the number of blocks seems slightly higher than the number of SMs. Where is it defined?

question

? - Needs Triage

[QST] The Persistent Tile Scheduler in CUTLASS?

Could you please explain how the persistent tile scheduler in CUTLASS works? Does it mean that a single CTA continuously processes multiple blocks, or is the work of different kernels...

question

? - Needs Triage
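The general idea behind a persistent kernel can be sketched on the host: launch a fixed number of CTAs and let each one loop over several output tiles instead of exiting after one. The code below is a generic, simplified model of static round-robin assignment, not CUTLASS's actual scheduler; `assign_tiles`, `num_tiles`, and `num_ctas` are hypothetical names.

```cpp
#include <vector>

// Generic model of static persistent scheduling (not the CUTLASS implementation).
// Returns, for each CTA, the ordered list of tile indices it would process.
std::vector<std::vector<int>> assign_tiles(int num_tiles, int num_ctas) {
    std::vector<std::vector<int>> work(num_ctas);
    for (int cta = 0; cta < num_ctas; ++cta) {
        // Each CTA starts at its own index and strides by the grid size,
        // staying resident until no tiles remain for it.
        for (int tile = cta; tile < num_tiles; tile += num_ctas) {
            work[cta].push_back(tile);
        }
    }
    return work;
}
```

In this model all tiles belong to a single kernel launch; with 10 tiles and 4 CTAs, CTA 0 handles tiles 0, 4, and 8 while CTA 3 handles tiles 3 and 7.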

[QST] Why do we use setmaxnreg? Does it change register usage or occupancy?

setmaxnreg is a new feature introduced with Hopper. I noticed this in cutlass: https://github.com/NVIDIA/cutlass/blob/eee0cab26c8eedea447eb3b58b3498eeba2294da/include/cutlass/gemm/kernel/sm90_gemm_tma_warpspecialized_cooperative.hpp#L446 From the above, the consumer register count is 232 and the producer register count is 40. Different warps can use different...

question

? - Needs Triage
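The two register counts quoted above can be checked with some simple arithmetic. The sketch below rests on assumptions that are not stated in the question: one producer warp group and two consumer warp groups of 128 threads each, and 64K 32-bit registers per SM. Under those assumptions the rebalanced total equals what the CTA would hold at a uniform 168 registers per thread, which is consistent with setmaxnreg moving registers between warp groups rather than enlarging the CTA's allocation.

```cpp
// Register counts taken from the question.
constexpr int kProducerRegs = 40;
constexpr int kConsumerRegs = 232;

// Assumptions for illustration (not stated in the question).
constexpr int kThreadsPerWarpGroup = 128;
constexpr int kProducerWarpGroups = 1;
constexpr int kConsumerWarpGroups = 2;
constexpr int kRegsPerSM = 64 * 1024;

// Total registers held by the CTA after the warp groups rebalance.
constexpr int regs_after_setmaxnreg() {
    return kThreadsPerWarpGroup *
           (kProducerWarpGroups * kProducerRegs + kConsumerWarpGroups * kConsumerRegs);
}

// Uniform per-thread count that gives the same total.
constexpr int average_regs_per_thread() {
    return regs_after_setmaxnreg() /
           (kThreadsPerWarpGroup * (kProducerWarpGroups + kConsumerWarpGroups));
}

static_assert(regs_after_setmaxnreg() <= kRegsPerSM, "fits in one SM's register file");
```

Under these assumptions the total is 64512 registers, or 168 per thread on average, so the split favors the consumers that hold the accumulators without changing how many CTAs fit on an SM.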