
Fix potential smem misaligned address issue for ws pingpong kernel.

Open Junkai-Wu opened this issue 1 year ago • 1 comments

The current implementation places the epilogue before the mainloop in SharedStorage. When TMA load is used and the smem size of the epilogue is not a multiple of 128B, the smem address of the mainloop is no longer 128B aligned, which can cause a misaligned address error. Reversing the order ensures that the smem address of the mainloop is 128B aligned.
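To illustrate the layout problem described above, here is a minimal host-side sketch. The struct names and byte sizes are hypothetical stand-ins, not the actual CUTLASS types, and it assumes the base of the shared storage itself is 128B aligned. It only shows how member order changes the mainloop offset when the epilogue size is not a multiple of 128B.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical stand-ins for the epilogue and mainloop tensor storage.
// 4160 = 32 * 128 + 64, so the epilogue size is NOT a multiple of 128 bytes.
struct EpilogueStorage { std::uint8_t bytes[4160]; };
struct MainloopStorage { std::uint8_t bytes[8192]; };

// Layout as described in the issue: epilogue first, mainloop second.
struct EpilogueFirst {
  EpilogueStorage epilogue;
  MainloopStorage mainloop;
};

// Reversed layout: mainloop first, so it starts at offset 0.
struct MainloopFirst {
  MainloopStorage mainloop;
  EpilogueStorage epilogue;
};

// Alignment the issue says the mainloop smem address needs for TMA load.
constexpr std::size_t kRequiredAlignment = 128;

constexpr bool is_aligned(std::size_t offset) {
  return offset % kRequiredAlignment == 0;
}

// With the epilogue first, the mainloop starts at byte 4160: misaligned.
static_assert(!is_aligned(offsetof(EpilogueFirst, mainloop)),
              "expected a misaligned mainloop offset");

// With the mainloop first, it starts at byte 0: aligned.
static_assert(is_aligned(offsetof(MainloopFirst, mainloop)),
              "expected an aligned mainloop offset");
```

Both `static_assert`s pass at compile time, which matches the description: the misalignment appears only when the epilogue precedes the mainloop and its size is not a multiple of 128B.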

Junkai-Wu avatar Oct 14 '24 09:10 Junkai-Wu

I've seen this misaligned address error as well. Same with the WS cooperative kernel.

tridao avatar Oct 15 '24 06:10 tridao

Reversing the order of epilogue and mainloop causes a performance issue for the pingpong and cooperative kernels. We will look for another way to solve this issue.

Junkai-Wu avatar Nov 12 '24 02:11 Junkai-Wu

@tridao Yes, they are the same issue. We are working with our compiler team colleagues to solve it.

Junkai-Wu avatar Nov 12 '24 02:11 Junkai-Wu