
Delimiters to profile blocks of application code

Open: coleramos425 opened this issue 2 years ago • 1 comment

Describe the suggestion: Allow breakpoints/delimiters that mark "blocks" of application source code for profiling.

Justification: This came up in the context of training ML models and enabling users to target specific stages of their ML training pipeline. In these multi-stage codes, it would be helpful for users to understand performance and bottlenecks in different areas of execution.

Implementation: There are a few ways this could be done, but the first that comes to mind is leveraging rocscope. A modified version of the rocomni plugin could be used to gather counters for these user-defined blocks.

See internal planning page for more info...

Additional Notes: This would also lend itself nicely to an eventual VSCode extension.

Originally posted by @coleramos425 in https://github.com/AMDResearch/omniperf/discussions/153#discussioncomment-6846892

coleramos425, Aug 28 '23 21:08

This would be easily supportable with working hipProfilerStart() and hipProfilerStop() functions. I’ll make sure those functions are actually handled properly in rocprofiler v2.

jrmadsen, Aug 29 '23 02:08
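
For illustration only, here is a minimal Python sketch of what start/stop delimiters around pipeline stages might look like from the user's side. It is not Omniperf or rocprofiler code. `profiled_block`, `profiler_start`, `profiler_stop`, and the `events` list are hypothetical names, and the two stub functions only record calls so the sketch runs without ROCm. In a real HIP application they would stand in for the hipProfilerStart() and hipProfilerStop() calls mentioned above.

```python
from contextlib import contextmanager

# Hypothetical stand-ins: they only record calls so the sketch runs anywhere.
# On a ROCm system these would wrap hipProfilerStart()/hipProfilerStop().
events = []


def profiler_start():
    events.append("start")


def profiler_stop():
    events.append("stop")


@contextmanager
def profiled_block(name, start=profiler_start, stop=profiler_stop):
    """Delimit one named block of application code for profiling."""
    events.append(f"enter:{name}")
    start()
    try:
        yield
    finally:
        # Always stop collection, even if the block raises.
        stop()
        events.append(f"exit:{name}")


def train_one_epoch():
    with profiled_block("forward"):
        pass  # forward-pass kernels would launch here
    # Work placed here runs outside any profiled block.
    with profiled_block("backward"):
        pass  # backward-pass kernels would launch here


train_one_epoch()
```

Wrapping the delimiters in a context manager keeps each start paired with a stop, so a stage that raises an exception does not leave collection running into the next stage.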