[CI] Upload wheel variants for CUDA 11 and 12

Open hcho3 opened this issue 1 year ago • 2 comments

Currently, we build two wheel variants: xgboost-cpu (which excludes GPU code) and xgboost (where the GPU code targets CUDA 12.4). In #10729, xgboost was found to conflict with another package using CUDA 11.

Following the practices of RAPIDS, we should distribute separate wheels targeting CUDA 11 and CUDA 12.

Proposal. Build four wheel variants.

  • xgboost-cpu: excludes the GPU code
  • xgboost-cu11: builds GPU code with CUDA 11, depends on nvidia-nccl-cu11.
  • xgboost: builds GPU code with CUDA 12, depends on nvidia-nccl-cu12.
  • xgboost-cu12: a stub package directing users to install xgboost. Something like https://pypi.org/project/cuml. The stub package can be replaced with a real one when the main package xgboost transitions to CUDA 13.

Prerequisites

  • https://github.com/pypi/support/issues/4695
  • Need to file another issue to increase the file size limit for xgboost-cu11 to 200 MiB.
  • Reduce the wheel size by dropping some archs, like sm_50.
  • #10803

hcho3 avatar Sep 06 '24 02:09 hcho3

@jameslamb Can you review my proposal? I kept the xgboost name, which would work for the majority of the use cases. Users would opt into xgboost-cpu and xgboost-cu11 as special needs arise.

hcho3 avatar Sep 09 '24 20:09 hcho3

@hcho3 let me start by saying I am SO SORRY it took over a month to get back to you!

Thanks for redirecting me here from #10803. I've read this and the motivating issue (#10729) now. I support this proposal.

I think the approach you've proposed is a good one... it minimizes disruption to existing users of xgboost, preserves pip install xgboost "just working" on CUDA 12 systems, and I do think it'd fix the issue from #10729.

I still think you should explore modifying the packager/ code for this purpose before involving rapids-dependency-file-generator (for the reasons I mentioned in https://github.com/dmlc/xgboost/issues/10803#issuecomment-2333181102).


"I kept the xgboost name, which would work for the majority of the use cases."

Thank you! This was what I was most worried about in https://github.com/dmlc/xgboost/issues/10803#issuecomment-2333181102, I think because there you described "re-naming" the package, which I assumed meant that xgboost might go away.


"xgboost-cu12: a stub package directing users to install xgboost ... Something like https://pypi.org/project/cuml."

I don't think that xgboost-cu12 should be a stub that downloads wheels with Python code, like https://pypi.org/project/cuml.

Instead, I'd package this as a wheel containing only the minimum files required to be a valid wheel, with a == dependency pinning it to exactly one version of xgboost, like this:

[project]
name = "xgboost-cu12"
version = "2.2.1"
dependencies = [
    "xgboost==2.2.1"
]

Benefits:

  • pip install xgboost-cu12 will end up pulling in xgboost automatically
  • pip install xgboost xgboost-cu12 (which could happen accidentally, via transitive dependencies) will be safe
  • The == pin means that the two names could be used interchangeably for all versions of XGBoost where CUDA 12 is the default
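
To keep the stub's version and its pin from drifting apart, the metadata could be generated from a single version string at release time. The following is only a minimal sketch under that assumption: `render_stub_pyproject` is a hypothetical helper, not part of XGBoost's packager/ code, and it renders just the [project] table from the example above (a real pyproject.toml would typically also declare a [build-system] table).

```python
def render_stub_pyproject(
    version: str,
    stub_name: str = "xgboost-cu12",
    target: str = "xgboost",
) -> str:
    """Return pyproject.toml text for a dependency-only stub package.

    Hypothetical helper for illustration: the stub's own version and the
    pinned version of the target package come from the same argument, so
    they cannot get out of sync.
    """
    if not version or any(ch.isspace() for ch in version):
        raise ValueError(f"invalid version string: {version!r}")
    return (
        "[project]\n"
        f'name = "{stub_name}"\n'
        f'version = "{version}"\n'
        "dependencies = [\n"
        f'    "{target}=={version}"\n'
        "]\n"
    )


if __name__ == "__main__":
    # Prints the same [project] table as the hand-written example.
    print(render_stub_pyproject("2.2.1"), end="")
```

Deriving both fields from one input is the design point here: the exact pin is only safe if every release of the stub matches exactly one release of xgboost.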

jameslamb avatar Oct 14 '24 03:10 jameslamb

Updated proposal, with CUDA 12 and 13.

  • xgboost-cpu: excludes the GPU code
  • xgboost: builds GPU code with CUDA 12, depends on nvidia-nccl-cu12.
  • xgboost-cu12: a stub package, which will automatically install xgboost.
  • xgboost-cu13: builds GPU code with CUDA 13, depends on nvidia-nccl-cu13.
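
As a rough illustration of how a user or an install script might pick a package name under this updated layout, here is a small sketch. The mapping mirrors the list above; the function name `wheel_for_cuda` and the convention of passing `None` for a machine without CUDA are assumptions made for this example, not part of XGBoost.

```python
from typing import Optional

# Package names taken from the updated proposal; None means "no GPU code".
# xgboost-cu12 is omitted because it is a stub that installs xgboost itself.
VARIANTS = {
    None: "xgboost-cpu",
    12: "xgboost",
    13: "xgboost-cu13",
}


def wheel_for_cuda(cuda_major: Optional[int]) -> str:
    """Return the package name to install for a given CUDA major version."""
    try:
        return VARIANTS[cuda_major]
    except KeyError:
        supported = sorted(k for k in VARIANTS if k is not None)
        raise ValueError(
            f"no wheel variant for CUDA {cuda_major}; supported: {supported}"
        ) from None


if __name__ == "__main__":
    for major in (None, 12, 13):
        print(major, "->", wheel_for_cuda(major))
```

CUDA 11 is deliberately absent from the mapping, since the updated proposal no longer lists a CUDA 11 variant.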

hcho3 avatar Sep 04 '25 19:09 hcho3

I think that's a good path forward! Thanks for writing that out.

jameslamb avatar Sep 04 '25 19:09 jameslamb