XGBoost.jl
Huge import latency caused by `Term`, `GPUArrays`, and `CUDA`
julia> @time_imports using XGBoost
17.5 ms SparseMatricesCSR
17.1 ms AbstractTrees
7.5 ms OrderedCollections
0.1 ms SnoopPrecompile
68.5 ms Parsers 11.99% compilation time
12.0 ms StructTypes
71.2 ms JSON3
0.2 ms DataValueInterfaces
1.2 ms DataAPI
0.1 ms IteratorInterfaceExtensions
0.1 ms TableTraits
13.6 ms Tables
3.7 ms DocStringExtensions 75.50% compilation time
62.0 ms Highlights
1.3 ms MyterialColors
0.1 ms UnPack
0.3 ms Parameters
3.1 ms ProgressLogging
1.2 ms UnicodeFun
1.8 ms CodeTracking
958.0 ms Term
4.1 ms CEnum
20.1 ms Preferences
0.4 ms JLLWrappers
5.7 ms LLVMExtra_jll 59.12% compilation time
186.4 ms LLVM 41.51% compilation time (100% recompilation)
0.4 ms ExprTools
132.5 ms TimerOutputs 11.86% compilation time
551.1 ms GPUCompiler 1.29% compilation time
0.4 ms Adapt
0.1 ms Reexport
3.5 ms GPUArraysCore
876.0 ms GPUArrays
0.6 ms Requires
8.6 ms BFloat16s
0.7 ms CompilerSupportLibraries_jll
32.6 ms RandomNumbers 33.48% compilation time (13% recompilation)
7.7 ms Random123
1.2 ms Compat
82.5 ms ChainRulesCore
11.3 ms AbstractFFTs
1625.2 ms CUDA 0.75% compilation time
16.4 ms CUDA_Driver_jll 94.73% compilation time
0.2 ms CUDA_Runtime_jll
2.4 ms XGBoost_jll
9.5 ms XGBoost
I suppose an argument can be made for removing Term, but how would you propose to remove CUDA without removing important GPU features?
I'm wondering if you just need GPUArrays.jl for the interface.
Ah, that's a good point... technically this supports CuArray and not any other type of GPU array, but I suppose it wouldn't be so bad to use GPUArrays and let it crash in some other way if somebody tries to use something other than a CuArray (pretty sure libxgboost will crash somewhere).
Anyone else have opinions on this? I'm definitely willing to do the PR and make the change if there is consensus, but I'm not completely confident that nobody will take issue with it.
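For what it's worth, a rough sketch of what dispatching on the abstract type from GPUArraysCore (rather than on CuArray) could look like; `gpu_dmatrix` is a made-up name purely for illustration, not an actual function in this package:

```julia
# Sketch: depend only on the lightweight GPUArraysCore instead of all of CUDA.
using GPUArraysCore: AbstractGPUArray

# Accept any GPU array type. libxgboost itself only understands CUDA device
# memory, so passing some other vendor's array would fail somewhere inside
# the library, as discussed above.
function gpu_dmatrix(x::AbstractGPUArray{<:Real,2})
    # ... hand the device buffer over to libxgboost ...
end
```

That keeps the method signatures restricted to GPU arrays without pulling the full CUDA stack in at load time.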
Yeah, thanks for the GPU support; I'm just experiencing slow loading times on my poor cluster login node, so no rush.
IMO even GPUArrays alone seems to increase loading times quite significantly. It's nice to have GPU support, but personally I usually do not have access to dedicated GPUs, so to me it seems more like an extension than a basic part of the package (also highlighted by the fact that GPU support was added only recently). Maybe it would be better to use weak dependencies on Julia >= 1.9 and Requires on older Julia versions (https://pkgdocs.julialang.org/dev/creating-packages/#Conditional-loading-of-code-in-packages-(Extensions)).
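For concreteness, the extensions mechanism from that link would look roughly like this in Project.toml (the extension module name here is illustrative; the UUID is CUDA.jl's):

```toml
# Project.toml (sketch): CUDA demoted to a weak dependency on Julia >= 1.9
[weakdeps]
CUDA = "052768ef-5323-5732-b1bb-66c8b64840ba"

[extensions]
# Loaded automatically only if the user also does `using CUDA`
XGBoostCUDAExt = "CUDA"
```

The GPU-specific methods would then live in ext/XGBoostCUDAExt.jl and cost CPU-only users nothing at load time.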
what about https://github.com/JuliaGPU/GPUArrays.jl/tree/master/lib/GPUArraysCore
It's nice to have GPU support, but personally I usually do not have access to dedicated GPUs, so to me it seems more like an extension
Yeah, I agree with this.
I'm not sure exactly what can be done with weak dependencies, but I'm certainly open to exploring it once 1.9 is released.
Personally I'm not too fond of the argument that stuff should be removed just because it has to compile. Compilation is just part of life, it's an issue for the compiler, not individual packages. That dependencies should be trimmed where possible seems like a significantly better argument to me.
That said, I'm at least open to all the specific proposals made here: yes, using GPUArrays only sounds good to me, I'm happy to embrace weak dependencies, and arguably the functionality I'm using Term for belongs in another package. On the other hand, I don't like the idea of having a separate package for GPU support at all.
Of course I'm not the only maintainer of this, so my opinion is hardly authoritative.
I too am having issues with unexpectedly depending on CUDA. In my case I am using PackageCompiler - here's a (cut down) snippet:
julia> using PackageCompiler; create_app(".", "dist")
PackageCompiler: bundled artifacts:
├── CUDA_Runtime_jll - 1.858 GiB
└── ...
Total artifact file size: 2.134 GiB
I would really like my bundle to be 2GB smaller. I'm assuming just using GPUArrays will at least fix the worst of the problem for now, and we can look at weak dependencies later?
CUDA_Runtime_jll - 1.858 GiB
Are you running on a machine with CUDA 11? If so, you might be able to eliminate the large CUDA artifacts by switching. The CUDA_Runtime_jll dependency comes from XGBoost_jll when it is built with CUDA; the non-CUDA builds do not have it.
https://github.com/JuliaPackaging/Yggdrasil/blob/2b49bcda328332a02b0d6ee3926bafb3f031d026/X/XGBoost/build_tarballs.jl#L92
I wonder if we would be able to handle the CUDA deps with the new conditional package loading: https://pkgdocs.julialang.org/dev/creating-packages/#Conditional-loading-of-code-in-packages-(Extensions)
I wonder if we would be able to handle the CUDA deps with the new conditional package loading:
That was what I was referring to with weak dependencies in https://github.com/dmlc/XGBoost.jl/issues/156#issuecomment-1378105194 :slightly_smiling_face: It's great, I'm already using it in a few packages, but it requires Julia 1.9 - if one wants to support the conditional features on older Julia versions one either has to add the weak dependencies as hard dependencies on these Julia versions or use Requires (which does not support precompilation). In the case of XGBoost, the most natural approach would seem to keep the already existing dependencies as hard dependencies on Julia < 1.9.
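For reference, the usual Requires.jl fallback for Julia < 1.9 looks something like this (file and module names are placeholders, matching nothing in the actual package; the UUID is CUDA.jl's):

```julia
# src/XGBoost.jl (sketch): conditional GPU code on older Julia versions
using Requires

function __init__()
    @static if !isdefined(Base, :get_extension)  # i.e. Julia < 1.9
        # The included file would hold the same methods as the 1.9 extension
        @require CUDA = "052768ef-5323-5732-b1bb-66c8b64840ba" begin
            include("../ext/XGBoostCUDAExt.jl")
        end
    end
end
```

The caveat mentioned above applies: code loaded via @require is not precompiled.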
Just to add that at RAI we're in exactly the same situation @andyferris mentioned above, where we're using PackageCompiler and do not want to include CUDA_Runtime_jll or CUDA_Driver_jll.
But given these deps seem to have been added in XGBoost_jll at v1.7.1+1, I don't think there's even a way to pin the version to avoid getting these? And so what we're having to do is just manually not check in the XGBoost-related changes to the Manifest.toml, which is really disruptive and error-prone.
e.g. when updating an unrelated package we see things like
(Repo) pkg> add --preserve=all XUnit @1.1.5
Resolving package versions...
Updating `~/Project.toml`
[3e3c03f2] ↑ XUnit v1.1.4 ⇒ v1.1.5
Updating `~/Manifest.toml`
[3e3c03f2] ↑ XUnit v1.1.4 ⇒ v1.1.5
[4ee394cb] + CUDA_Driver_jll v0.3.0+0
[76a88914] + CUDA_Runtime_jll v0.3.0+2
⌃ [a5c6f535] ↑ XGBoost_jll v1.7.1+0 ⇒ v1.7.1+1
Info Packages marked with ⌃ have new versions available and may be upgradable.
Precompiling project...
✓ XGBoost_jll
✓ XUnit
✓ XGBoost
✓ Repo
4 dependencies successfully precompiled in 42 seconds. 259 already precompiled. 1 skipped during auto due to previous errors.
We will definitely demote both CUDA and Term to weak dependencies once 1.9 is officially released. In the meantime, if someone wants to replace CUDA with GPUArrays, we can probably make that work. Otherwise, it looks like it's just going to have to wait. Suffice it to say, this problem is hardly unique to this package.
I don't think there's even a way to pin the version to avoid getting these?
@nickrobinson251 I did manage to pin both XGBoost at 2.0.2 and XGBoost_jll at 1.6.2+0, and that seemed to work for me.
Looking forward to Julia 1.9 still :)
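In case it helps anyone else, that pin can be done in one line in the Pkg REPL (versions as reported in the previous comment):

```julia-repl
# Pkg REPL mode (press `]` at the julia> prompt)
pkg> pin XGBoost@2.0.2 XGBoost_jll@1.6.2
```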
As a follow-up to this, I see there is some weakdeps stuff in the Project.toml, and when installing on Julia 1.9.2 I get this:
(@v1.9) pkg> add XGBoost
Resolving package versions...
Updating `~/.julia/environments/v1.9/Project.toml`
[009559a3] + XGBoost v2.3.0
Updating `~/.julia/environments/v1.9/Manifest.toml`
[fa961155] + CEnum v0.4.2
[9a962f9c] + DataAPI v1.15.0
[e2d170a0] + DataValueInterfaces v1.0.0
[82899510] + IteratorInterfaceExtensions v1.0.0
[692b3bcd] + JLLWrappers v1.4.1
[0f8b85d8] + JSON3 v1.13.1
[69de0a69] + Parsers v2.7.1
[a0a7dd2c] + SparseMatricesCSR v0.6.7
[856f2bd8] + StructTypes v1.10.0
[3783bdb8] + TableTraits v1.0.1
[bd369af6] + Tables v1.10.1
[009559a3] + XGBoost v2.3.0
[4ee394cb] + CUDA_Driver_jll v0.5.0+1
[76a88914] + CUDA_Runtime_jll v0.6.0+0
[1d63c593] + LLVMOpenMP_jll v15.0.4+0
[a5c6f535] + XGBoost_jll v1.7.5+2
[4af54fe1] + LazyArtifacts
[37e2e46d] + LinearAlgebra
[a63ad114] + Mmap
[2f01184e] + SparseArrays
[10745b16] + Statistics v1.9.0
[4607b0f0] + SuiteSparse
[8dfed614] + Test
[e66e0078] + CompilerSupportLibraries_jll v1.0.5+0
[4536629a] + OpenBLAS_jll v0.3.21+4
[bea87d4a] + SuiteSparse_jll v5.10.1+6
[8e850b90] + libblastrampoline_jll v5.8.0+0
I see that Term and CUDA aren't installed, which is great, but there seems to be a bunch of GPGPU stuff downloaded regardless (this is a lot of stuff for a Docker image intended for a machine that doesn't even have an NVIDIA GPU). Do we need to do the weakdep thing in XGBoost_jll too? (Will that work? Will the XGBoost_jll .so/.dll load and run fine in CPU mode if it can't find the CUDA .so/.dll files?)
Yes, the JLL still pulls in CUDA JLL dependencies. I asked about weakdeps/optional dependencies and GPU/non-GPU binaries in some issue in Yggdrasil a while ago, but there does not seem to be a solution to this problem yet. (Some other libraries are built in both GPU and non-GPU versions on Yggdrasil, but last time I checked there was no Julia package that actually tried to combine them; also, just depending on both would still pull in all the undesired binaries...)
Ah I see - thank you.