Signal Scalene to report what it has gathered

Open garethky opened this issue 1 year ago • 11 comments

Is your feature request related to a problem? Please describe. I'm trying to use Scalene to profile a large, multi-threaded Python program that's designed to run as a system service. The only way the service exits is via the shutdown command. There is no internal way to cleanly exit the service. When testing on the command line I have to kill the process to make it exit.

Basically I need a profiling report from a chunk of runtime of a program without the program having to exit.

Describe the solution you'd like I'd like a way to signal to Scalene that it should prepare the report based on the data it has already gathered and write it to the output file. Something like:

$ scalene --profile-all --outfile ~/profile.json service.py &
$ python3 -m scalene.profile --on --pid 1337
$ echo "10 minutes later..."
$ python3 -m scalene.profile --off --pid 1337
$ python3 -m scalene.profile --report --pid 1337

and the report shows up at ~/profile.json immediately.

If I later toggle the profiling and ask for a report, it will be a new report for the new chunk of runtime that was profiled.

Describe alternatives you've considered The only way I can get output from this program is with --profile-interval.

Additional context The service in question is the open source klipper 3D printer 'firmware', which is mainly written in Python.

I think other people have tried to ask for this but they didn't make clear what they needed. Some have asked to connect Scalene to a running process. This is a way to get a report without having to capture the entire run of the program or wait for it to exit to see results.

Jun 29 '23 05:06 garethky

Hi @garethky - thanks for your note. There are not many signals available that we haven't already hijacked, so I'm leery of grabbing another one. That said, what if --off always produced a new profile? I'm trying to think of a use case for --off that would be compromised by that update. Another option is to make --off and --on actually output the same signal, but just toggle the behavior, and then add a --report or similar.
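To make the idea of a signal-triggered report concrete, here is a minimal sketch of the general pattern. It is not Scalene's implementation: the `samples` dictionary, `REPORT_PATH`, `write_report`, and `do_work` are hypothetical names, and SIGUSR1 is an arbitrary choice that only exists on POSIX systems.

```python
import json
import os
import signal
import tempfile

# Hypothetical stand-in for the data a profiler has gathered so far.
samples = {"ticks": 0}
# Hypothetical output location, playing the role of --outfile.
REPORT_PATH = os.path.join(tempfile.gettempdir(), "snapshot_report.json")


def write_report(signum, frame):
    """On signal, write the current window to disk and start a new one."""
    with open(REPORT_PATH, "w") as f:
        json.dump(dict(samples), f)
    samples["ticks"] = 0  # the next report covers only the next window


def do_work(iterations):
    """Stand-in for the body of a service loop that never exits on its own."""
    for _ in range(iterations):
        samples["ticks"] += 1


# SIGUSR1 is an arbitrary choice for this sketch (POSIX only).
signal.signal(signal.SIGUSR1, write_report)
```

With a process running this code, `kill -USR1 <pid>` from another shell would write the report without stopping the process. Resetting the counters in the handler is what makes each report cover only the window since the previous one.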

Jun 30 '23 00:06 emeryberger

I would be fine with --off causing the output to be written. That makes a lot of sense to me.

Is there any problem in the case where an --outfile isn't specified and you need to write to stdout?

Jun 30 '23 03:06 garethky

I am thinking it would make sense to suppress the output for that case, but am not certain. Thoughts?

Jun 30 '23 18:06 emeryberger

Prototype of functionality here: python3 -m pip install --force-reinstall git+https://github.com/plasma-umass/scalene@off_report

Jun 30 '23 18:06 emeryberger

Sorry, I haven't forgotten about this, just got very busy. What you are proposing looks good to me.

I ended up using Scalene with pytest instead. It turns out that the project I'm working on uses greenlets. These are not the same as multiprocessing threads, and I can't see any profiler data from inside them.

Jul 17 '23 22:07 garethky

@emeryberger Why would you delete git+https://github.com/plasma-umass/scalene@off_report? There's no way to use Scalene for real applications without this feature.

Jul 24 '23 15:07 uriariel

It doesn't seem to work. I tried the following:

  1. start scalene --outfile FILE main.py
  2. python3 -m scalene.profile --off --pid PID
  3. python3 -m scalene.profile --on --pid PID

and I still don't see FILE.

Jul 24 '23 15:07 uriariel

This behavior has now been merged into the default branch; you can try it with python3 -m pip install --force-reinstall git+https://github.com/plasma-umass/scalene.

Jul 25 '23 06:07 emeryberger

It doesn't seem to work. I tried the following:

  1. start scalene --outfile FILE main.py
  2. python3 -m scalene.profile --off --pid PID
  3. python3 -m scalene.profile --on --pid PID

and I still don't see FILE.

@emeryberger

Jul 25 '23 09:07 uriariel

Are you starting the first job in the background? (that is, followed by &?)

I include a full example that works for me, profiling a program with an infinite loop (to make sure it runs long enough):

main.py:

x = 1
while True:
  x += 1

Below is the sequence of instructions (with some warnings and duplicate messages removed):

% python3 -m scalene --cli --json --outfile main.json main.py &
[3] 57537
% /Users/emery/git/scalene/scalene/scalene_parseargs.py:65: SyntaxWarning: invalid escape sequence '\['
  """
Scalene now profiling process 57538
  to disable profiling: python3 -m scalene.profile --off --pid 57538
  to resume profiling:  python3 -m scalene.profile --on  --pid 57538

% ls -l main.json
ls: main.json: No such file or directory
% python3 -m scalene.profile --off --pid 57538
Scalene: profiling turned off.
% ls -l main.json 
-rw-r--r--  1 emery  staff  4074 Jul 25 06:20 main.json
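The manual sequence above could be scripted roughly as follows. This is a sketch under assumptions: the helper names `toggle_command`, `wait_for_file`, and `snapshot` are hypothetical, and `sys.executable` is assumed to be an interpreter with Scalene installed. The only Scalene interface used is the `python3 -m scalene.profile --off/--on --pid PID` command shown in the transcript. Note that the PID to pass is the one Scalene prints (57538 above), not the shell's job PID (57537).

```python
import os
import subprocess
import sys
import time


def toggle_command(pid, on):
    """Build the scalene.profile command from the transcript for a given PID."""
    flag = "--on" if on else "--off"
    return [sys.executable, "-m", "scalene.profile", flag, "--pid", str(pid)]


def wait_for_file(path, timeout=10.0, poll=0.1):
    """Return True once `path` exists and is non-empty, False on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if os.path.exists(path) and os.path.getsize(path) > 0:
            return True
        time.sleep(poll)
    return False


def snapshot(pid, outfile, timeout=10.0):
    """Turn profiling off (which writes the report), then wait for the file."""
    subprocess.run(toggle_command(pid, on=False), check=True)
    return wait_for_file(outfile, timeout)
```

For the session above, `snapshot(57538, "main.json")` would correspond to the `--off` step followed by the `ls -l main.json` check. The helper cannot tell a fresh report from a stale one left by an earlier run, so removing the old file before calling it is the safer habit.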

Jul 25 '23 10:07 emeryberger

@emeryberger I'm running the application inside a container with the following entry point: ENTRYPOINT ["scalene", "--memory", "--html", "--outfile", "my.profile", "main.py"]

The container keeps running even if I turn profiling off, but the output file is not written.

Jul 25 '23 11:07 uriariel