Technical explanation for why PackageCompiler takes so long?
Hello,
I'm looking to see if there's a technical explanation somewhere for why PackageCompiler takes so long to build a sysimage. Given that it takes me ~30 seconds to build the Linux kernel, I'm surprised that compiling a Julia sysimage/executable can take as much as 20 minutes.
A big chunk of the time is spent in LLVM, which has to compile all the code for Base Julia itself as well as the new code that is added.
If you want a more detailed answer you could try running it through a profiler.
I'm not entirely sure how best to use a profiler to actually answer this.
Given that PackageCompiler is all about compiling, "LLVM compiling code" sounds pretty reasonable, but I still fail to comprehend the scale. If it takes me 30 seconds to build the Linux kernel and 5 minutes to compile Firefox, I just cannot fathom why it would take so much longer to PackageCompile my Julia project when it is multiple orders of magnitude less complex than, say, Firefox.
Could it really be that it is ~4x as much work for LLVM to compile my GLMakie data visualiser as it is to compile Firefox, due to "Julia things"??
Another difference is that Julia's compilation here is single-threaded, while with C and C++ code you can compile translation units in parallel, which can be a big boost on modern CPUs. There were also some bugs in Julia that forced PackageCompiler to basically do twice the work but that should be fixed in new Julia versions.
Without more details about your project it is hard to say much more. Julia isn't slow "on purpose", but if you can post the project you are trying to compile, which PackageCompiler functions you are calling, your Julia version, etc., it is possible to do some analysis and see what the big bottlenecks are.
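To make the parallelism point concrete, here is a rough back-of-the-envelope sketch. It converts wall-clock time into total CPU time as wall time multiplied by the number of busy workers. The 5 and 20 minute figures come from this thread; the worker count (16) and the 80% efficiency are purely illustrative assumptions, not measurements of anyone's machine.

```julia
# Rough conversion from wall-clock time to total CPU time.
# `workers` and `efficiency` are illustrative assumptions, not measurements.
cpu_minutes(wall_minutes, workers; efficiency = 1.0) =
    wall_minutes * workers * efficiency

# Wall-clock figures quoted in this thread.
firefox_wall  = 5    # minutes, parallel C/C++ build
sysimage_wall = 20   # minutes, single-threaded sysimage build

# Assumed: the Firefox build kept 16 workers busy at 80% efficiency,
# while the sysimage build used exactly one thread.
firefox  = cpu_minutes(firefox_wall, 16; efficiency = 0.8)
sysimage = cpu_minutes(sysimage_wall, 1)

println("wall-clock ratio (sysimage / Firefox): ", sysimage_wall / firefox_wall)
println("CPU-time ratio   (sysimage / Firefox): ", sysimage / firefox)
```

Under these assumed numbers the sysimage build is slower on the wall clock but uses less total CPU time than the Firefox build, so comparing wall times alone can overstate how much work LLVM is doing.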
@tecosaur and @KristofferC - not to threadjack, but I'm excited by the opportunity to debug a similar situation. Would you mind if I contributed it to this discussion?
Go ahead.
Here is the setup I have been using: https://gitlab.com/alhirzel/julia_sysimage . Created this for the sake of reproducibility, and I also plan to package it as a Singularity container for internal use (once the package set stabilizes).
With the current package set and Julia 1.8.3, it compiles in about 5m30s on a Ryzen 9 5900X and more like 14min on my daily driver. I'm particularly interested in how I could profile this compilation process to see if there are packages that may be worth excluding, or if there are optimization opportunities.
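One low-tech way to look for expensive packages, short of a real profiler, is to time a build for each package on its own and rank the results. The sketch below is only an illustration: `time_each` and `fake_build` are hypothetical helper names, and the build step is passed in as a function so the example runs without PackageCompiler installed.

```julia
# Time an arbitrary build step once per package and rank the results.
# `build` is any function that takes a vector of package names; passing it in
# keeps this helper independent of PackageCompiler itself.
function time_each(build, packages::Vector{String})
    timings = Pair{String,Float64}[]
    for pkg in packages
        seconds = @elapsed build([pkg])
        push!(timings, pkg => seconds)
    end
    return sort!(timings; by = last, rev = true)  # slowest first
end

# Stand-in build step (hypothetical) so the sketch runs on its own.
fake_build(pkgs) = sleep(pkgs == ["Slow"] ? 0.2 : 0.05)

ranking = time_each(fake_build, ["Fast", "Slow"])
for (pkg, seconds) in ranking
    println(rpad(pkg, 10), round(seconds; digits = 2), " s")
end
```

With PackageCompiler loaded, the build step could be something like `pkgs -> create_sysimage(pkgs; sysimage_path = tempname())`. Each timing then includes that package's dependencies, so the numbers overlap and will not add up to the time of the full build; treat them as a ranking rather than a breakdown.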
Since you mention that there is a lot of time spent compiling Julia, I wonder if there is an opportunity to re-use the existing ("upstream") Julia binaries to save some of the compilation time.
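On re-using the upstream binaries: as far as I can tell from the PackageCompiler documentation, `create_sysimage` has an `incremental` keyword that defaults to `true` and builds the new sysimage on top of the sysimage of the running Julia process instead of recompiling Base from scratch, whereas `create_app` defaults to `incremental = false`. A minimal sketch of the options, with a placeholder output path and package name:

```julia
# Keyword options for PackageCompiler.create_sysimage; values are placeholders.
sysimage_options = (
    sysimage_path = "custom_sysimage.so",
    incremental = true,   # extend the sysimage of the running Julia process
)

# Requires PackageCompiler to be installed, so it is left commented out here:
# using PackageCompiler
# create_sysimage(["GLMakie"]; sysimage_options...)
```

If the build already runs with `incremental = true`, the time is going into the added packages rather than Base, so there may be little left to save by re-using the upstream sysimage.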
Hi @KristofferC, just out of curiosity, does Julia 1.9 or 1.10 solve the bugs mentioned here?
"There were also some bugs in Julia that forced PackageCompiler to basically do twice the work but that should be fixed in new Julia versions."
Those bugs should be fixed in PackageCompiler v2.1.3, and v2.1.4 is compatible with the latest Julia nightly.