llm-foundry
Error 'Getting requirements to build wheel': is the Docker image okay?
Hi all,
During the installation step of setup, I get the error 'Getting requirements to build wheel'.
I am using the mosaicml/pytorch:1.13.1_cu117-python3.10-ubuntu20.04 docker image as recommended. It appears that it can't get the requirements?
To replicate:
- vast.ai 4xA6000
- docker: mosaicml/pytorch:1.13.1_cu117-python3.10-ubuntu20.04
- Ubuntu 20.04.6 LTS (GNU/Linux 5.15.0-71-generic x86_64)
- sudo apt update -y
- git clone repo
- cd llm-foundry
- pip install -e ".[gpu]"
Running setup.py directly with python from the command line gives the same error. Any advice?
Full output:
`Obtaining file:///root/llm-foundry
  Installing build dependencies ... done
  Checking if build backend supports build_editable ... done
  Getting requirements to build editable ... done
  Installing backend dependencies ... done
  Preparing editable metadata (pyproject.toml) ... done
Collecting xentropy-cuda-lib@ git+https://github.com/HazyResearch/flash-attention.git@v0.2.8#subdirectory=csrc/xentropy (from llm-foundry==0.1.0)
  Cloning https://github.com/HazyResearch/flash-attention.git (to revision v0.2.8) to /tmp/pip-install-g3gxu5nw/xentropy-cuda-lib_4b7f2cb3f8074e3c9cedcc0980ea12ea
  Running command git clone --filter=blob:none --quiet https://github.com/HazyResearch/flash-attention.git /tmp/pip-install-g3gxu5nw/xentropy-cuda-lib_4b7f2cb3f8074e3c9cedcc0980ea12ea
  Running command git checkout -q 33e0860c9c5667fded5af674882e731909096a7f
  Resolved https://github.com/HazyResearch/flash-attention.git to commit 33e0860c9c5667fded5af674882e731909096a7f
  Running command git submodule update --init --recursive -q
  Installing build dependencies ... done
  Getting requirements to build wheel ... error
  error: subprocess-exited-with-error
  × Getting requirements to build wheel did not run successfully.
  │ exit code: 1
  ╰─> [17 lines of output]
      Traceback (most recent call last):
        File "/root/llm-foundry/llmfoundry-venv/lib/python3.10/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 353, in
  note: This error originates from a subprocess, and is likely not a problem with pip.
error: subprocess-exited-with-error
× Getting requirements to build wheel did not run successfully.
│ exit code: 1
╰─> See above for output.
note: This error originates from a subprocess, and is likely not a problem with pip.`
If you are doing this inside a virtual env, make sure that when you create the venv you carry over all the Python packages from the system Python. Torch is installed in the system Python but probably not in your new virtualenv.
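One way to test this suggestion is to ask the interpreter itself whether it is running from a venv and whether it can see torch. The snippet below is a minimal diagnostic sketch, not part of llm-foundry; the helper names (`in_virtualenv`, `is_importable`, `describe_environment`) are made up for illustration.

```python
import importlib.util
import sys


def in_virtualenv() -> bool:
    """Return True when the running interpreter belongs to a venv."""
    # Inside a venv, sys.prefix points at the venv directory while
    # sys.base_prefix still points at the interpreter it was created from.
    return sys.prefix != sys.base_prefix


def is_importable(package: str) -> bool:
    """Return True if a top-level package can be located on sys.path."""
    return importlib.util.find_spec(package) is not None


def describe_environment(package: str = "torch") -> str:
    """Summarize whether `package` is visible to this interpreter."""
    location = "virtualenv" if in_virtualenv() else "system interpreter"
    status = "found" if is_importable(package) else "NOT found"
    return f"{package} {status} in {location} ({sys.prefix})"


if __name__ == "__main__":
    print(describe_environment())
```

Run it with the same python that runs pip. If it reports that torch is not found inside the venv, recreating the venv with `python -m venv --system-site-packages <dir>` is one way to make the system packages visible.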
Thank you for the reply. I tested it both in the venv and on root, with the same result. I also checked my -v for the packages and it seems I have torch==1.13 in the venv as well.
Could you please share the Dockerfile?
Following up here @jewbot: if it helps, this is the Dockerfile we use to build the mosaicml/pytorch:1.13.1_cu117-python3.10-ubuntu20.04 Docker image.
https://github.com/mosaicml/composer/blob/v0.14.1/docker/Dockerfile
Our internal workflow is exactly:
- pull Docker image
- run pip install -e .[gpu]
- run composer train/train.py ...
and we run this on many different clouds (AWS, Oracle, GCP, on-premise) so if there is something going wrong with the process, I would suspect something about the system environment on vast.ai, or the CUDA drivers, or the way Docker is being run.
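To narrow down whether the host environment differs from a known-good one, it can help to collect a few facts about the CUDA tooling that the container sees. The following is a rough sketch under the assumption that the relevant signals are the presence of `nvidia-smi` and `nvcc` on the PATH and what torch reports about CUDA; the function name `gpu_environment_report` is hypothetical.

```python
import platform
import shutil


def gpu_environment_report() -> dict:
    """Collect basic facts about the Python and CUDA environment."""
    report = {
        "platform": platform.platform(),
        "python": platform.python_version(),
        # Driver utility; usually injected by the NVIDIA container runtime.
        "nvidia_smi_on_path": shutil.which("nvidia-smi") is not None,
        # CUDA compiler; needed when pip builds CUDA extensions from source.
        "nvcc_on_path": shutil.which("nvcc") is not None,
        "torch_version": None,
        "torch_cuda_build": None,
        "cuda_available": None,
    }
    try:
        import torch
    except (ImportError, OSError):
        # torch is missing or broken in this interpreter; leave fields as None.
        return report
    report["torch_version"] = torch.__version__
    report["torch_cuda_build"] = torch.version.cuda
    report["cuda_available"] = torch.cuda.is_available()
    return report


if __name__ == "__main__":
    for key, value in gpu_environment_report().items():
        print(f"{key}: {value}")
```

Comparing this output between the failing machine and a working one may show whether the difference lies in the drivers, the CUDA toolkit, or the Python packages.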
For me the problem was loading the venv as described in these installation instructions.
Hi, I needed to update setuptools to version 67.8.0, after which the setup script completed.
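For anyone wanting to check this before reinstalling, here is a small sketch that compares the installed setuptools against 67.8.0. That number is only the version reported in this thread, not a documented minimum, and the helper names are hypothetical. The version parsing is deliberately simple and only looks at the leading numeric components.

```python
import re
from importlib import metadata
from typing import Optional, Tuple


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn '67.8.0' into (67, 8, 0), stopping at the first non-numeric part."""
    parts = []
    for piece in text.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def installed_version(package: str) -> Optional[str]:
    """Return the installed version string, or None if not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def needs_upgrade(installed: Optional[str], target: str = "67.8.0") -> bool:
    """Return True if the package is missing or older than `target`."""
    if installed is None:
        return True
    return parse_version(installed) < parse_version(target)


if __name__ == "__main__":
    current = installed_version("setuptools")
    print(f"setuptools installed: {current}")
    print(f"older than 67.8.0: {needs_upgrade(current)}")
```

If it reports an older version, `pip install --upgrade setuptools` (or pinning with `pip install "setuptools==67.8.0"`) inside the active environment is the step this reply describes.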
@abhi-mosaic thank you!