IJulia.jl
Add precompile statements to reduce startup latency
So @timholy and I have done a little preliminary looking into adding precompile statements to reduce the latency of IJulia kernels, and it seems to us that a major problem is that there's no clear way to run a PrecompileTools.jl or SnoopCompile.jl workflow to actually get the relevant precompile statements from a running notebook, so we're forced to resort to --trace-compile kernels.
@stevengj do you have any insight into how we might be able to set up a PrecompileTools.jl workflow that starts up and spins down a kernel during precompilation?
Maybe just copy https://github.com/JuliaLang/IJulia.jl/blob/cc2a9bf61a2515596b177339f9a3514de8c38573/src/kernel.jl but comment out the final IJulia.waitloop() line? Then run julia kernelcopy.jl.
I guess you could also add an environment variable or something to kernel.jl to make it do this, so that you don't need to edit the file. Or do some other refactoring of kernel.jl.
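For illustration, the environment-variable idea might look roughly like the sketch below. The variable name IJULIA_SKIP_WAITLOOP and the helper run_kernel_tail are made up for this example and are not something IJulia reads or defines today; the wait loop is passed in as a function so the snippet runs without IJulia loaded.

```julia
# Hypothetical opt-out switch: IJULIA_SKIP_WAITLOOP is an invented name,
# not an environment variable that IJulia currently checks.
skip_waitloop(env = ENV) = get(env, "IJULIA_SKIP_WAITLOOP", "0") == "1"

# Stand-in for the last step of kernel.jl. `waitloop` is passed in as an
# argument so that this sketch does not depend on IJulia.
function run_kernel_tail(waitloop::Function; env = ENV)
    if skip_waitloop(env)
        return :skipped      # precompile or test run: return instead of blocking
    end
    waitloop()               # normal kernel start: blocks until the loop ends
    return :finished
end
```

In a modified kernel.jl the final line would then presumably become something like run_kernel_tail(IJulia.waitloop), and a precompile run would set the variable before starting Julia.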
You may have to undo the IO redirects at the end, i.e.
redirect_stdout(IJulia.orig_stdout[])
redirect_stderr(IJulia.orig_stderr[])
redirect_stdin(IJulia.orig_stdin[])
Potentially you could also manually call IJulia.execute_request, e.g. something like:
execute_request(requests[], Msg(String[], Dict(), Dict(["code"=>"nothing\n", "silent"=>true, "user_expressions"=>[]])))
I've been trying to get that to work but so far have not had any success. Lots of those functions simply error if run during precompilation, and the parts I am able to keep without errors don't seem to have any effect on the startup time of a kernel.
I also would like to improve the TTFX. Some random thoughts (as someone who only recently started looking at the code):
- A lot of the codebase is not tested, which I'm aiming to improve in the future. Having something run in a test suite has a lot in common with running in a precompile workload so I think once the code is testable it'll be easier to work with under precompilation.
- There's a lot of global variables with pointers. Those all need to be cleaned up after the workload so they aren't serialized or we'll run into things like this: https://github.com/JuliaInterop/ZMQ.jl/issues/241. It may be best to bundle all of them into a Kernel struct that can be close'd (which would make testing easier).
- waitloop() (and the handlers it calls) is the thing we really want to precompile and the only way to 'cancel' it is by throwing an exception to the task or sending a shutdown_request to the kernel (which has the side effect of exiting the process so we probably shouldn't use that). I think it should be refactored to allow a graceful cancellation. This would make testing easier and it would make it easier to run in a precompilation workload.
- Not sure if the separate heartbeat thread will cause problems. We may have to explicitly join it: https://docs.libuv.org/en/v1.x/threading.html#c.uv_thread_join. Or switch to using @threadcall.
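To make the Kernel struct and graceful-cancellation ideas a bit more concrete, here is a minimal sketch using only Base. Everything in it is hypothetical: the Kernel type, its fields, and this waitloop method are illustrations, not IJulia's actual code, and a Channel of strings stands in for the ZMQ request socket.

```julia
# Hypothetical sketch: none of these definitions exist in IJulia.
mutable struct Kernel
    requests::Channel{String}   # stand-in for the ZMQ request socket
    handled::Vector{String}     # record of processed messages, handy in tests
end

Kernel() = Kernel(Channel{String}(32), String[])

# Closing the kernel closes its channel, which is what ends the loop below.
Base.close(k::Kernel) = close(k.requests)
Base.isopen(k::Kernel) = isopen(k.requests)

# Process messages until the channel has been closed and drained.
function waitloop(k::Kernel)
    for msg in k.requests        # iteration stops once the channel is closed and empty
        push!(k.handled, msg)    # stand-in for dispatching to a message handler
    end
    return nothing
end

# Usage: start the loop, send one message, then shut down without exiting Julia.
k = Kernel()
task = @async waitloop(k)
put!(k.requests, "execute_request")
close(k)
wait(task)
```

The point of the design is that close(k) is an ordinary function call, so a test or a precompile workload could stop the loop without throwing an exception into the task or exiting the process.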
@MasonProtter, would you by any chance have time to look into some of these things with me? I will try to find some time for it over the next few weeks/months.
Yes I would be interested in collaborating on this!
I realized yesterday that we had a long-standing bug in the shutdown handler that would cause it to hang: #1163. That might have inadvertently increased the perceived TTFX when restarting the kernel because Jupyter would wait for a timeout and then kill the process 🙃
IJulia feels decently snappy now so I'm happy to leave this closed, but if anyone else wants to try improving TTFX please feel free ❤️