How to get `flit publish` to instruct pip to install torch per PyTorch instructions?
I figured this was worth opening up a new issue rather than continuing on the thread of #369. Hope that's OK 🙂. For anyone who comes here, the crux is that you can't get `flit publish` to produce metadata that spares the user from manually specifying where to find the torch distribution, short of pinning to a specific Python version and platform (e.g. Python 3.8 + Windows).
Original question from https://github.com/takluyver/flit/issues/369#issuecomment-944894101
@takluyver, thanks for the quick response. Maybe this just reflects some lack of understanding on my part, but the idea is that if a PyPI package depends on the GPU version of PyTorch, no workflow I've tried using `flit publish` seems to let `pip` know that it needs to look in https://download.pytorch.org/whl/torch_stable.html (per the instructions at https://pytorch.org/get-started/locally/). In general, the preference is to `conda install pytorch cudatoolkit=11.1 -c pytorch -c conda-forge` anyway, but I'm hoping to provide both a PyPI and a conda distribution for my packages (naturally I'm only focused on the PyPI version in the context of `flit` and this post). The only workarounds I've come up with are at the user level. For brevity, `<URL>` is https://download.pytorch.org/whl/torch_stable.html. I can either:

- remove `torch` entirely from the requirements and tell users to install `torch` separately via e.g. `pip install torch==1.9.1+cu102 -f <URL>`
- leave the `torch` req in `pyproject.toml` and tell users that they need to include `-f <URL>` or `--find-links <URL>` in their initial pip install
- leave the `torch` req in `pyproject.toml` and tell users to set the corresponding environment variable (e.g. `import os; os.environ["PIP_FIND_LINKS"] = "<URL>"`)

For `<my-package>`, people might naively try `pip install <my-package>` and wait a while, only to get an error message that the requested `torch` version, e.g. `1.9.1+cu111`, couldn't be found. Then they might try `pip install torch`, at which point it might give them something like `1.9.1`. That should install just fine, but once they try to actually use it, it will complain and error out about various things, such as not being able to find cuDNN, and give suggestions about installing the CUDA Toolkit or manipulating paths. The main worry is that people will take up to a few hours to get it installed or, worse, give up. Oftentimes, my target audience has limited familiarity with version control, package managers, and coding in general, hence my extra worry here. Other than this, I'm not sure it's worth moving to something more complex like `poetry`.

Does what I'm describing make sense? Are there workarounds I'm not considering?
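For illustration, the third workaround can be scripted. This is a minimal sketch, not part of the original thread: the function name is mine, and it relies on `PIP_FIND_LINKS` being pip's standard environment-variable form of `--find-links`:

```python
import os
import subprocess
import sys

# The wheel listing from the thread (the "<URL>" shorthand above).
TORCH_WHEEL_INDEX = "https://download.pytorch.org/whl/torch_stable.html"

def install_torch(spec: str = "torch==1.9.1+cu102") -> None:
    """Install a specific torch flavour, pointing pip at the PyTorch wheel page.

    Equivalent to: pip install torch==1.9.1+cu102 -f <URL>
    """
    # pip reads PIP_FIND_LINKS from the environment, so setting it has the
    # same effect as passing -f/--find-links on the command line.
    env = dict(os.environ, PIP_FIND_LINKS=TORCH_WHEEL_INDEX)
    subprocess.check_call([sys.executable, "-m", "pip", "install", spec], env=env)
```

The catch, of course, is that this still has to run on the user's machine before (or instead of) a plain `pip install <my-package>` - it doesn't change what the published metadata can say.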
Response from takluyver (from https://github.com/takluyver/flit/issues/369#issuecomment-945085501)
Aha, I understand. This is quite a different question from how you control what happens with `flit install`. 🙂 It comes back to the distinction between abstract and concrete requirements. What you specify in package metadata are abstract requirements, i.e. you specify what is needed, but not where to get it - that's up to the recipient.

This has some limitations for pytorch & similar packages, because our packaging infrastructure doesn't know about GPUs or GPU interfaces like CUDA (see this discussion for more info). The way pytorch has chosen to work around this, there is AFAIK no way to specify an abstract requirement on a particular flavour of pytorch. I believe this is the same if you use Poetry, or any tool for releasing packages as wheels. It may be possible to work around it with setuptools and source-only releases, because setup.py can run arbitrary code - but you're fighting your tools if you go that way.
There is one other workaround. The standard format for specifying requirements in packages (PEP 508) allows for specifying a package to be installed from a precise URL, `name @ https://...`. But this is the URL of the package itself, not an index page, so it's an absolutely concrete dependency: one version of PyTorch for one version of Python and one platform.
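To make the abstract-vs-concrete point tangible, here's a toy look at such a direct-URL requirement. The split on `" @ "` is a deliberate simplification for illustration, not a real PEP 508 parser:

```python
# A PEP 508 direct-URL requirement pins one exact artifact, not an index page.
req = ("torch @ https://download.pytorch.org/whl/cu111/"
       "torch-1.9.1%2Bcu111-cp38-cp38-win_amd64.whl")

# Naive split just to show the two halves (name vs. concrete URL).
name, url = (part.strip() for part in req.split(" @ ", 1))

# The wheel filename bakes in the version (1.9.1+cu111), the interpreter
# (cp38) and the platform (win_amd64) - which is why the dependency is
# fully concrete rather than abstract.
print(name, url.rsplit("/", 1)[-1])
```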
Thank you for clarifying. This was very informative! Those links were great. So in the context of my issue, options like `--find-links` or `--extra-index-url` are things that need to happen at the user level.

It sounds like PyPI is very aware of the issues related to GPUs. In this case, the best option for me might be to state `conda install` as the preferred method and, later in the documentation, note that there is a `pip install` option with emphasis that it requires a specific flag. All in all, it probably won't matter too much which workaround I use as long as I have a proper conda installation. It's clear to me now that PyPI and PyTorch have some areas where they won't see eye to eye, and I think it's reasonable to expect a bit more from users who follow the alternative installation instructions.
I appreciate you mentioning the other option of a specific URL. While I probably won't implement this right now because of the downsides you mentioned, to clarify, would it look like the following? (example for Windows on Python 3.8, based on the docs examples and after a `flit init`)
```toml
[build-system]
requires = ["flit_core >=3.2,<4"]
build-backend = "flit_core.buildapi"

[project]
name = "astcheck"
authors = [
    {name = "Thomas Kluyver", email = "[email protected]"},
]
readme = "README.rst"
classifiers = [
    "License :: OSI Approved :: MIT License",
]
requires-python = ">=3.5"
dynamic = ['version', 'description']

# modification
dependencies = [
    # link obtained from https://download.pytorch.org/whl/torch_stable.html
    "torch @ https://download.pytorch.org/whl/cu111/torch-1.9.1%2Bcu111-cp38-cp38-win_amd64.whl",
]
```
It looks like there are 8 files (4 Python versions and 2 platforms) just for "torch==1.9.1+cu111". I can see why PyTorch gives a user-friendly selection interface.
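The arithmetic behind that count can be sketched as follows; note that the exact tag spellings below are my assumptions modelled on the wheel naming convention, not values read from torch_stable.html:

```python
# 4 CPython versions x 2 platforms = 8 wheel files for one torch release.
pythons = ["cp36", "cp37", "cp38", "cp39"]   # assumed interpreter tags
platforms = ["win_amd64", "linux_x86_64"]    # assumed platform tags

wheels = [
    f"torch-1.9.1%2Bcu111-{py}-{py}-{plat}.whl"
    for py in pythons
    for plat in platforms
]
print(len(wheels))  # 8
```

And that is for a single CUDA flavour - each extra flavour (CPU-only, cu102, cu111, ...) multiplies the file count again, which is what a `name @ URL` dependency cannot express.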
Yup, it's quite conceivable that some of this stuff might be easier with conda. Although looking at the pytorch installation instructions, it seems to be doing something similar - `-c` (for channel) is roughly the conda equivalent of pip's `-f` (find links) option, and I don't think you can specify a dependency on a package in a particular channel either. But maybe conda has more palatable workarounds for this.
I'd guess, without knowing much about pytorch, that the rough idea is that you write code against the pytorch API, and users can run it with different flavours of pytorch depending on their hardware. Of course, that falls down if you know that it only works, or only performs acceptably, with certain flavours.
Another way pytorch could have chosen to make packages is to have different names depending on the flavour, something like `torch-cuda11`. That would make it easy to depend on a specific flavour, but hard to depend on pytorch in general.
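As a purely hypothetical sketch (`torch-cuda11` is not a real PyPI package), flavour-specific names would turn the problem back into ordinary abstract metadata:

```toml
[project]
# Hypothetical: if each flavour were its own PyPI package, an abstract
# requirement on the CUDA 11 build would be a plain dependency again,
# resolvable from PyPI with no --find-links needed.
dependencies = [
    "torch-cuda11 >=1.9.1",  # hypothetical name, for illustration only
]
```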