[Feature]: Add ability to disable and enable categories at runtime
Suggestion Description
Hello,
I am currently running inference on a large model using PyTorch and multiple processes. I managed to gather Perfetto traces, but the output is very large and I am only interested in a few iterations in the middle. I am currently using https://github.com/ROCm/omnitrace/pull/235 to reduce the size of the trace as much as possible. However, I would really like a more fine-grained option that lets me decide when to start and stop tracing (and do so multiple times during a run). I went through the source code quickly, and it seems that this enable/disable category feature is already used for the delayed trace start.
Operating System
Ubuntu
GPU
No response
ROCm Component
No response
I'm a bit confused by what you are asking for here. Is this request not satisfied by the OMNITRACE_ENABLE_CATEGORIES and OMNITRACE_DISABLE_CATEGORIES configuration options noted in #235?
https://github.com/ROCm/omnitrace/blob/0cf017251ec5133d76a153148925b555c897a431/source/lib/core/config.cpp#L677
https://github.com/ROCm/omnitrace/blob/0cf017251ec5133d76a153148925b555c897a431/source/lib/core/config.cpp#L685
Hi, thank you for the quick reply.
Rereading my comment, I admit it was not very clear.
From what I understand, these variables are set at the start of an omnitrace run and define the categories to trace for the whole run. Unless I am misunderstanding something, they cannot be changed during a "tracing run". What I would like to do is something like this:
omnitrace-run --trace-stopped -- python my_script.py
And inside my my_script.py:
for i in range(100):
    ...code_that_should_not_be_traced...
    if i > 10 and i < 15:
        (handle to loaded so file).start()
    ...traced_code...
    if i > 10 and i < 15:
        (handle to loaded so file).stop()
    ...code_after_section_of_interest...
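To make the requested control flow concrete, here is a minimal, self-contained sketch of the same pattern. It does not use any omnitrace API: `FakeTracer`, `traced_if`, and `run` are hypothetical names introduced only for illustration, and `FakeTracer` stands in for whatever start/stop handle the tracer would need to expose.

```python
from contextlib import contextmanager


class FakeTracer:
    """Hypothetical stand-in for a tracer handle; NOT an omnitrace API."""

    def __init__(self):
        self.active = False
        self.log = []

    def start(self):
        self.active = True

    def stop(self):
        self.active = False

    def record(self, item):
        # Only keep events that occur while tracing is switched on.
        if self.active:
            self.log.append(item)


@contextmanager
def traced_if(condition, tracer):
    """Wrap a block with tracer.start()/tracer.stop() only when condition holds."""
    if condition:
        tracer.start()
    try:
        yield
    finally:
        # Stop even if the traced block raises, so tracing never stays on.
        if condition:
            tracer.stop()


def run(tracer, n_iters=100):
    for i in range(n_iters):
        tracer.record(("before", i))      # code that should not be traced
        with traced_if(10 < i < 15, tracer):
            tracer.record(i)              # code in the section of interest
        tracer.record(("after", i))       # code after the section of interest


tracer = FakeTracer()
run(tracer)
print(tracer.log)
```

The context manager keeps each start/stop pair together, which matters when the traced section can raise or when the window is opened several times in one run.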
Thank you again for your time and help.
Hi @ADGLY, sorry for the delay. Perhaps selective instrumentation and source instrumentation can serve your purpose?
@ADGLY I'm going to close this issue due to inactivity. If the suggested solution does not work, feel free to re-open the ticket and we can look into it further.