DeepSpeed
[BUG] manual build isn't installing requirements
Describe the bug
I made a fresh conda env and tried to build DeepSpeed manually, and it failed:
$ DS_BUILD_CPU_ADAM=1 DS_BUILD_UTILS=1 pip install -e . --global-option="build_ext" --global-option="-j8" --no-cache -v --disable-pip-version-check 2>&1 | tee build.log
Using pip 21.2.4 from /gpfswork/rech/six/commun/conda/cutting-edge/lib/python3.8/site-packages/pip (python 3.8)
Obtaining file:///gpfsssd/worksf/projects/rech/six/commun/code/tr8b-104B/DeepSpeed
/gpfswork/rech/six/commun/conda/cutting-edge/lib/python3.8/site-packages/pip/_internal/commands/install.py:229: UserWarning: Disabling all use of wheels due to the use of --build-option / --global-option / --install-option.
cmdoptions.check_install_build_global(options)
Running command python setup.py egg_info
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/gpfsssd/worksf/projects/rech/six/commun/code/tr8b-104B/DeepSpeed/setup.py", line 35, in <module>
from op_builder import ALL_OPS, get_default_compute_capabilities
File "/gpfsssd/worksf/projects/rech/six/commun/code/tr8b-104B/DeepSpeed/op_builder/__init__.py", line 8, in <module>
from .sparse_attn import SparseAttnBuilder
File "/gpfsssd/worksf/projects/rech/six/commun/code/tr8b-104B/DeepSpeed/op_builder/sparse_attn.py", line 6, in <module>
from packaging import version as pkg_version
ModuleNotFoundError: No module named 'packaging'
No CUDA runtime is found, using CUDA_HOME='/gpfslocalsys/cuda/11.4.3'
WARNING: Discarding file:///gpfsssd/worksf/projects/rech/six/commun/code/tr8b-104B/DeepSpeed. Command errored out with exit status 1: python setup.py egg_info Check the logs for full command output.
ERROR: Command errored out with exit status 1: python setup.py egg_info Check the logs for full command output.
The package is listed in the requirements file: https://github.com/microsoft/DeepSpeed/blob/master/requirements/requirements.txt
but the build isn't installing the requirements.
This resolves the problem:
pip install packaging
but it'd be good to automate this at some point.
Absolutely not urgent.
@jeffra
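As a rough idea of what automating the check above could look like, here is a minimal sketch of a pre-flight script that reports modules the build needs but cannot find. The name missing_modules and the default list are hypothetical; only packaging comes from the traceback above, and none of this is part of DeepSpeed itself.

```python
import importlib.util

# Modules to look for before building. Only "packaging" is taken from the
# traceback in this report; extend the list as needed (assumption).
BUILD_TIME_MODULES = ["packaging"]


def missing_modules(names):
    """Return the top-level module names that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(BUILD_TIME_MODULES)
    if missing:
        print("Missing build-time modules:", ", ".join(missing))
        print("Try: pip install " + " ".join(missing))
    else:
        print("All build-time modules found.")
```

Running it before the manual build would at least turn the ModuleNotFoundError into a readable hint.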
Actually, I then ran into a bunch of other problems related to build_ext, as we have discussed elsewhere: the dependencies ninja and tqdm won't build, even though binary wheels are available without any problems. Even installing those directly doesn't help!
So the solution is simple:
On a fresh conda environment:
- First install deepspeed normally (pip install deepspeed), which takes care of all the dependencies correctly and swiftly!
- Only then proceed to pre-build manually and then everything works.
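The two steps above can be sketched as a small Python driver, assuming it is run from the root of the DeepSpeed checkout. The commands and environment variables are copied from this thread; the function names are hypothetical, and by default the script only prints the commands rather than running them.

```python
import os
import shlex
import subprocess

# Step 1: regular install, which pulls in the dependencies.
INSTALL_CMD = ["pip", "install", "deepspeed"]

# Step 2: the manual pre-build command from the original report.
BUILD_ENV = {"DS_BUILD_CPU_ADAM": "1", "DS_BUILD_UTILS": "1"}
BUILD_CMD = [
    "pip", "install", "-e", ".",
    "--global-option=build_ext", "--global-option=-j8",
    "--no-cache", "-v", "--disable-pip-version-check",
]


def workaround_steps():
    """Return (extra_env, argv) pairs in the order they should run."""
    return [({}, INSTALL_CMD), (BUILD_ENV, BUILD_CMD)]


def run_steps(execute=False):
    """Print each step; run it too only when execute is True."""
    for extra_env, cmd in workaround_steps():
        prefix = " ".join(f"{key}={value}" for key, value in extra_env.items())
        print((prefix + " " + shlex.join(cmd)).strip())
        if execute:
            # check=True stops at the first failing step.
            subprocess.run(cmd, env={**os.environ, **extra_env}, check=True)


if __name__ == "__main__":
    run_steps(execute=False)
```

Passing execute=True would actually run both installs in the current environment, so it is worth trying the dry run first.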
Perhaps it might be simpler to document this on the advanced install page, rather than trying to fix?
I was stuck on "Installing build dependencies: still running". You saved my life!
@stas00 - clearing through some old issues. This is somewhat stale, and we do have a conda env yml now as well. Closing this, but if others find this and have issues, please open a new issue and link this one for context.