[QST] [CuTeDSL] Nsight Compute Profiler Link to Source Code
What is your question?
When profiling CUDA/CUTLASS, the profiler can provide line-by-line profiling for user code, in addition to PTX and SASS. Triton can also do this, likely because its compiler tracks source locations. I believe CuTeDSL has a similar feature since it tracks source locations too. However, I’m unsure how to enable this, as the default ncu output only shows SASS. Do you happen to know how to enable detailed profiling if it’s possible?
Good suggestion! It's a very useful feature that we are considering adding (ETA is TBD).
Just want to echo this. It would make things much easier than just reading the SASS.
@brandon-yujie-sun
I would also very much like this. I think the lack of source mapping makes debugging IMAs with compute-sanitizer much harder too.
Thanks for all the input. Investigation is in progress.
Adding that this is perhaps a dealbreaker for our use of CuTe DSL: profiling a kernel is critical during high-performance kernel development.
Folks, 4.3 dev added source location tracking for DSL APIs, which enables source code correlation for DSL code in profiling and debugging. Please let us know if you see any issues with it.
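For anyone trying this out, here is a minimal sketch of a profiling invocation that asks Nsight Compute to collect the full section set and embed source files in the report. The helper `build_ncu_command` and the script name `my_kernel.py` are hypothetical, used only for illustration. The `ncu` flags shown are standard Nsight Compute CLI options, but whether any DSL-side setting is also needed to emit line info is not stated in this thread, so check the CuTe DSL documentation for your version.

```python
import shlex
import sys


def build_ncu_command(script, report_name, python_exe=sys.executable):
    """Return an argv list that profiles `script` under Nsight Compute.

    This only builds the command; it does not run ncu.
    """
    return [
        "ncu",
        "--set", "full",           # collect the full section set
        "--import-source", "yes",  # embed available source files in the report
        "-f",                      # overwrite an existing report file
        "-o", report_name,         # output report name
        python_exe, script,        # the application being profiled
    ]


if __name__ == "__main__":
    # "my_kernel.py" is a placeholder for your own CuTe DSL script.
    cmd = build_ncu_command("my_kernel.py", "cutedsl_report")
    print(shlex.join(cmd))
```

Run the printed command in a shell, then open the resulting report in the Nsight Compute UI and look at the Source page to see whether DSL source lines are correlated with SASS.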