feat: Add pytorch example
Installing pytorch is a very common use case, so it's good to have one or more examples on pixi!
The current one demonstrates three different ways to install pytorch, which is very insightful, but I feel an additional example that better reflects a "real world" situation would help downstream users as well.
I am saying that because I recently had to set up a repo with pytorch, and it took me some time to get the feature/env logic right.
I ended up with two approaches:
Two envs
- `default`: contains cpu-only torch
- `cuda`: contains cuda-only torch
```toml
[project]
name = "xxx"
version = "0.1.0"
description = "Add a short description here"
authors = ["xxx <[email protected]>"]
channels = ["https://fast.prefix.dev/conda-forge"]
platforms = ["osx-arm64", "linux-64"] # let's ignore win-64 for now

[tasks]

[dependencies]
python = ">=3.11,<4"
transformers = ">=4.44,<5"
tokenizers = ">=0.19.1,<0.20"
jupyterlab = ">=4.2.5,<5"
pandas = ">=2.2.3,<3"
accelerate = ">=0.34.2,<0.35"
safetensors = ">=0.4.5,<0.5"
pytorch = ">=2.4,<3"

[environments]
cuda = { features = ["cuda"] }
default = { features = ["cpu"] }

[feature.cuda]
platforms = ["linux-64"]
system-requirements = { cuda = "12" }
dependencies = { pytorch = { build = "*cuda*" } }

[feature.cpu.dependencies]
pytorch = { version = "*", build = "*cpu*" }
```
Three envs but one empty
- `default`: empty
- `cpu`: contains cpu-only torch
- `cuda`: contains cuda-only torch
```toml
[project]
name = "xxx"
version = "0.1.0"
description = "Add a short description here"
authors = ["xxx <[email protected]>"]
channels = ["https://fast.prefix.dev/conda-forge"]
platforms = ["osx-arm64", "linux-64"] # let's ignore win-64 for now

[tasks]

[feature.common.dependencies]
python = ">=3.11,<4"
transformers = ">=4.44,<5"
tokenizers = ">=0.19.1,<0.20"
jupyterlab = ">=4.2.5,<5"
pandas = ">=2.2.3,<3"
accelerate = ">=0.34.2,<0.35"
safetensors = ">=0.4.5,<0.5"
pytorch = ">=2.4,<3"

[environments]
cuda = { features = ["cuda", "common"] }
cpu = { features = ["cpu", "common"] }

[feature.cuda]
platforms = ["linux-64"]
system-requirements = { cuda = "12" }
dependencies = { pytorch = { build = "*cuda*" } }

[feature.cpu.dependencies]
pytorch = { version = "*", build = "*cpu*" }
```
You could also have three envs, with `common` used for the default env, so that the correct torch is selected dynamically at install time.
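For reference, a sketch of that third variant (it reuses the same `[project]`, `[feature.common]`, `[feature.cuda]` and `[feature.cpu]` tables as the example above; only the `[environments]` table changes, so the default env resolves whichever pytorch build fits at install time):

```toml
# Sketch: three environments, with "default" built from the "common"
# feature only, so the solver picks a suitable pytorch build at install time.
[environments]
default = { features = ["common"] }
cpu = { features = ["cpu", "common"] }
cuda = { features = ["cuda", "common"] }
```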
Any feedback or ideas to improve the logic are welcome.
Feedback: I tried the first example locally, but this strategy does not seem to work if one uses a pyproject.toml as the manifest. Pixi complains with `even though the project does include support for 'osx-arm64', feature 'cuda' does not support 'osx-arm64'` (N.B.: I'm fairly sure this is a bug). The only way I can make it work in this situation is to have a default environment that supports both platforms, plus a cuda feature that has no `platforms` entry or dependencies like you suggested, only the pinned cuda system-requirements.
I debugged the problem a bit more, and I can confirm that what makes this configuration fail is the addition of a pypi-dependencies section (which exists "by default" if you use a pyproject.toml as the manifest). To reproduce the issue with your setup, just add such a section with a dependency to your pixi.toml, e.g. `[pypi-dependencies] dummy = "*"`.
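In other words, appending the following to the first example's pixi.toml is enough to trigger the failure (the `dummy` dependency is just a placeholder; any PyPI package does it):

```toml
# Any pypi-dependencies section makes the platform resolution fail as
# described above, even though the conda-only setup resolves fine.
[pypi-dependencies]
dummy = "*"
```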
Here is a version of your example, with pypi-dependencies included, that works across the different manifest files and resolves correctly (at least today):
```toml
[project]
name = "xxx"
version = "0.1.0"
description = "Add a short description here"
authors = ["xxx <[email protected]>"]
channels = ["https://fast.prefix.dev/conda-forge"]
platforms = ["osx-arm64", "linux-64"] # let's ignore win-64 for now

[dependencies]
python = ">=3.11,<4"
pytorch = ">=2.4,<3"

[pypi-dependencies]
tabulate = "*"

[environments]
cuda = { features = ["cuda"] }

[feature.cuda]
system-requirements = { cuda = "12" }
```
That said, I agree this issue must be resolved so we can explicitly mark that certain features only make sense on given platforms.
Thanks a lot for helping! The pypi-dependencies issue is known. I was indeed planning to create a more real-world example but since there are a lot of routes to take I started with only the basics so users could learn from those. I'll try to use your examples as well! Please tell me more!
> I started with only the basics so users could learn from those.
Maybe two examples would actually be best, given how tricky the torch situation can sometimes be :-)
The one in this PR, discussing the historical differences between the different channels, and another one, more rooted in real-world usage, with separate cuda/cpu envs.
@ruben-arts: IMO this problem has 2 parts:
- Make sure that all examples across the pixi documentation properly make use of pytorch with the right combination of channels (i.e. defaults+nvidia+pytorch OR conda-forge-only, do not mix). This would follow the installation guidelines for those packages and reduce pitfalls, like the one I struggled with when I started using pixi. The example by @hadim is well structured. I'd however simplify it even further to use the `default` feature (CPU) plus a `cuda` feature with nvidia/gpu support. Normally, you would not need to specify the `build` in these cases. I'd then add a big fat warning somewhere indicating that, while #1051 is not solved, users cannot add `pypi-dependencies` or use it from a `pyproject.toml` manifest (which more often than not has a PyPI source dependency on the internally declared package).
(Example: One of the examples in the documentation is extremely convoluted and is not supported by pytorch or conda-forge. I think it would be better to remove it.)
- Prioritise solving #1051 to get rid of this important limitation. Not being able to add dependencies from PyPI that may not be available on conda-forge is a big-fat-bummer for data scientists and people doing machine learning. I hope you realise this.
After one solves #1051, we can go back to the examples and remove the warnings.
Example from @hadim simplified to the bare minimum (this resolves for me correctly using pixi version 0.30):
```toml
[project]
name = "pytorch-from-conda-forge"
channels = ["conda-forge"]
platforms = ["osx-arm64", "linux-64"]

[dependencies]
python = "3.*"
pytorch = "*"

# WARNING: Cannot add PyPI dependencies or use pyproject.toml as manifest while
# https://github.com/prefix-dev/pixi/issues/1051 is not fixed
# [pypi-dependencies]
# tabulate = "*"

[feature.cuda]
platforms = ["linux-64"]
system-requirements = { cuda = "12" }

[environments]
cuda = { features = ["cuda"] }
```
Similar example but with defaults + nvidia + pytorch channels (this also resolves correctly for me using pixi version 0.30):
```toml
[project]
name = "pytorch-from-pytorch"
channels = ["pytorch", "anaconda"]
platforms = ["osx-arm64", "linux-64"]

[dependencies]
python = "3.*"
pytorch = "*"

# WARNING: Cannot add PyPI dependencies or use pyproject.toml as manifest while
# https://github.com/prefix-dev/pixi/issues/1051 is not fixed
# [pypi-dependencies]
# tabulate = "*"

[feature.cuda]
channels = ["pytorch", "nvidia", "anaconda"]
platforms = ["linux-64"]
system-requirements = { cuda = "12.4" }
dependencies = { pytorch-cuda = { version = "12.4" } }

[environments]
cuda = { features = ["cuda"] }
```
Might be nice to have an example using pytorch-gpu from conda-forge if that is supported?
> Might be nice to have an example using `pytorch-gpu` from conda-forge if that is supported?
Both `pytorch-cpu` and `pytorch-gpu` are just meta-packages that depend on `pytorch`. My understanding is that the recipes above already cover the possible scenarios, with and without conda-forge respectively.
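For what it's worth, here is a rough, untested sketch of what a conda-forge-only setup built on those metapackages could look like (it assumes `pytorch-cpu`/`pytorch-gpu` simply pull in the corresponding `pytorch` build, as described above):

```toml
[project]
name = "pytorch-metapackages"
channels = ["conda-forge"]
platforms = ["osx-arm64", "linux-64"]

[dependencies]
python = "3.*"

# "pytorch-cpu" is a metapackage that pulls in the cpu build of pytorch
[feature.cpu.dependencies]
pytorch-cpu = "*"

# "pytorch-gpu" is a metapackage that pulls in the cuda build of pytorch
[feature.gpu]
platforms = ["linux-64"]
system-requirements = { cuda = "12" }
dependencies = { pytorch-gpu = "*" }

[environments]
default = { features = ["cpu"] }
gpu = { features = ["gpu"] }
```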
Please note that the defaults route is only "free" for organisations with fewer than 200 people (analysis).