Add precompile statements to reduce startup latency

Open MasonProtter opened this issue 2 years ago • 5 comments

So @timholy and I have done a little preliminary looking into adding precompile statements to reduce the latency of IJulia kernels. It seems to us that a major problem is that there's no clear way to run a PrecompileTools.jl or SnoopCompile.jl workflow that actually captures the relevant precompile statements from a running notebook, so we're forced to resort to running kernels with --trace-compile.

@stevengj do you have any insight into how we might be able to set up a PrecompileTools.jl workflow that starts up and spins down a kernel during precompilation?

MasonProtter avatar May 03 '23 23:05 MasonProtter

Maybe just copy https://github.com/JuliaLang/IJulia.jl/blob/cc2a9bf61a2515596b177339f9a3514de8c38573/src/kernel.jl but comment out the final IJulia.waitloop() line? Then run julia kernelcopy.jl.

I guess you could also add an environment variable or something to kernel.jl to make it do this, so that you don't need to edit the file. Or do some other refactoring of kernel.jl.
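The environment-variable idea could look roughly like the sketch below. This is only an illustration under stated assumptions: the variable name `IJULIA_SKIP_WAITLOOP` and the functions `skip_waitloop` and `run_kernel` are hypothetical and do not exist in IJulia, and the real initialization steps from kernel.jl are replaced by a placeholder comment.

```julia
# Hypothetical sketch: none of these names are part of IJulia.
# The env var name IJULIA_SKIP_WAITLOOP is an assumption for illustration.
skip_waitloop() = get(ENV, "IJULIA_SKIP_WAITLOOP", "0") == "1"

# `waitloop` is passed in as a function so the sketch runs without IJulia;
# in kernel.jl this would be the final IJulia.waitloop() call.
function run_kernel(waitloop::Function)
    # ... kernel initialization (sockets, IO redirects) would happen here ...
    if skip_waitloop()
        return :initialized_only   # stop before blocking on the event loop
    end
    waitloop()
    return :finished
end

# Demo with a stand-in wait loop that just counts how often it was called.
calls = Ref(0)
fake_waitloop() = (calls[] += 1; nothing)

skipped = withenv("IJULIA_SKIP_WAITLOOP" => "1") do
    run_kernel(fake_waitloop)
end
normal = withenv("IJULIA_SKIP_WAITLOOP" => nothing) do
    run_kernel(fake_waitloop)
end
```

With the variable set, the kernel would initialize and return without blocking, which is the behaviour a precompile or tracing run needs; without it, startup is unchanged.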

You may have to undo the IO redirects at the end, i.e.

redirect_stdout(IJulia.orig_stdout[])
redirect_stderr(IJulia.orig_stderr[])
redirect_stdin(IJulia.orig_stdin[])

Potentially you could also manually call IJulia.execute_request, e.g. something like:

execute_request(requests[], Msg(String[], Dict(), Dict(["code"=>"nothing\n", "silent"=>true, "user_expressions"=>[]])))

stevengj avatar May 07 '23 23:05 stevengj

I've been trying to get that to work but so far have not had any success. Lots of those functions simply error if run during precompilation, and what I am able to keep without errors doesn't seem to have any effect on the startup time of a kernel.

MasonProtter avatar May 30 '23 20:05 MasonProtter

I also would like to improve the TTFX. Some random thoughts (as someone who only recently started looking at the code):

  • A lot of the codebase is not tested, which I'm aiming to improve in the future. Having something run in a test suite has a lot in common with running in a precompile workload so I think once the code is testable it'll be easier to work with under precompilation.
  • There are a lot of global variables holding pointers. Those all need to be cleaned up after the workload so they aren't serialized, or we'll run into things like this: https://github.com/JuliaInterop/ZMQ.jl/issues/241. It may be best to bundle all of them into a Kernel struct that can be close'd (which would also make testing easier).
  • waitloop() (and the handlers it calls) is the thing we really want to precompile, and the only way to 'cancel' it is by throwing an exception to the task or sending a shutdown_request to the kernel (which has the side effect of exiting the process, so we probably shouldn't use that). I think it should be refactored to allow a graceful cancellation. This would make it easier both to test and to run in a precompilation workload.
  • Not sure if the separate heartbeat thread will cause problems. We may have to explicitly join it: https://docs.libuv.org/en/v1.x/threading.html#c.uv_thread_join. Or switch to using @threadcall.
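
To make the Kernel struct and graceful-cancellation ideas above a bit more concrete, here is a minimal sketch using only Base. Everything in it is hypothetical: `ToyKernel`, `toy_waitloop`, and `start!` are illustrative names rather than IJulia's API, and a `Channel{String}` stands in for the ZMQ request socket.

```julia
# Hypothetical sketch: ToyKernel, toy_waitloop and start! are illustrative
# names, not IJulia's API. A Channel stands in for the ZMQ request socket.
mutable struct ToyKernel
    requests::Channel{String}    # incoming messages
    handled::Vector{String}      # record of messages the loop processed
    task::Union{Task, Nothing}   # the running wait loop, if any
end

ToyKernel() = ToyKernel(Channel{String}(32), String[], nothing)

# Iterating a Channel ends normally once it is closed and drained, so the
# loop exits without an exception being thrown into the task.
function toy_waitloop(k::ToyKernel)
    for msg in k.requests
        push!(k.handled, msg)    # a real kernel would dispatch to a handler
    end
    return nothing
end

function start!(k::ToyKernel)
    k.task = @async toy_waitloop(k)
    return k
end

# Closing the kernel closes its resources and waits for the loop to finish,
# so no live task or handle is left behind after a workload.
function Base.close(k::ToyKernel)
    close(k.requests)
    k.task === nothing || wait(k.task)
    k.task = nothing
    return nothing
end

k = start!(ToyKernel())
put!(k.requests, "execute_request")
put!(k.requests, "kernel_info_request")
close(k)
```

The point of the design is that shutdown goes through `close` rather than through an exception or a process exit, which is what both a test suite and a precompile workload would need.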

@MasonProtter, would you by any chance have time to look into some of these things with me? I will try to find some time for it over the next few weeks/months.

JamesWrigley avatar Jan 29 '25 11:01 JamesWrigley

Yes I would be interested in collaborating on this!

MasonProtter avatar Jan 29 '25 12:01 MasonProtter

I realized yesterday that we had a long-standing bug in the shutdown handler that would cause it to hang: #1163. That might have inadvertently increased the perceived TTFX when restarting the kernel because Jupyter would wait for a timeout and then kill the process 🙃

JamesWrigley avatar Jun 02 '25 11:06 JamesWrigley

IJulia feels decently snappy now so I'm happy to leave this closed, but if anyone else wants to try improving TTFX please feel free ❤️

JamesWrigley avatar Aug 22 '25 17:08 JamesWrigley