[BUG] manual build isn't installing requirements

Open stas00 opened this issue 3 years ago • 2 comments

Describe the bug

I made a fresh conda env and tried to build DeepSpeed manually, and it failed:

$ DS_BUILD_CPU_ADAM=1 DS_BUILD_UTILS=1 pip install -e . --global-option="build_ext" --global-option="-j8" --no-cache -v --disable-pip-version-check 2>&1 | tee build.log
Using pip 21.2.4 from /gpfswork/rech/six/commun/conda/cutting-edge/lib/python3.8/site-packages/pip (python 3.8)
Obtaining file:///gpfsssd/worksf/projects/rech/six/commun/code/tr8b-104B/DeepSpeed
/gpfswork/rech/six/commun/conda/cutting-edge/lib/python3.8/site-packages/pip/_internal/commands/install.py:229: UserWarning: Disabling all use of wheels due to the use of --build-option / --global-option / --install-option.
  cmdoptions.check_install_build_global(options)
    Running command python setup.py egg_info
    Traceback (most recent call last):
      File "<string>", line 1, in <module>
      File "/gpfsssd/worksf/projects/rech/six/commun/code/tr8b-104B/DeepSpeed/setup.py", line 35, in <module>
        from op_builder import ALL_OPS, get_default_compute_capabilities
      File "/gpfsssd/worksf/projects/rech/six/commun/code/tr8b-104B/DeepSpeed/op_builder/__init__.py", line 8, in <module>
        from .sparse_attn import SparseAttnBuilder
      File "/gpfsssd/worksf/projects/rech/six/commun/code/tr8b-104B/DeepSpeed/op_builder/sparse_attn.py", line 6, in <module>
        from packaging import version as pkg_version
    ModuleNotFoundError: No module named 'packaging'
    No CUDA runtime is found, using CUDA_HOME='/gpfslocalsys/cuda/11.4.3'
WARNING: Discarding file:///gpfsssd/worksf/projects/rech/six/commun/code/tr8b-104B/DeepSpeed. Command errored out with exit status 1: python setup.py egg_info Check the logs for full command output.
ERROR: Command errored out with exit status 1: python setup.py egg_info Check the logs for full command output.


The package is listed in the requirements file: https://github.com/microsoft/DeepSpeed/blob/master/requirements/requirements.txt

but the build isn't installing the requirements.

This resolves the problem:

pip install packaging

but it'd be good to automate this at some point.
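
As a rough idea of what automating this could look like, here is a minimal pre-flight check to run before the manual build. It is only a sketch, not something DeepSpeed ships: `missing_modules` and `BUILD_TIME_MODULES` are hypothetical names, and the list contains only `packaging` because that is the only module the traceback above confirms is needed at `setup.py` import time.

```python
import importlib.util


def missing_modules(names):
    """Return the module names that cannot be found on the current import path."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Assumption: only `packaging` is confirmed by the traceback in this report;
# extend the list if other imports fail during `python setup.py egg_info`.
BUILD_TIME_MODULES = ["packaging"]

missing = missing_modules(BUILD_TIME_MODULES)
if missing:
    # The import name and the pip package name happen to match for `packaging`;
    # that is not guaranteed for every module.
    print("Install these before building: pip install " + " ".join(missing))
else:
    print("All build-time modules found.")
```

Running it in the fresh env before the `pip install -e .` command would flag the missing module up front instead of failing inside `egg_info`.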

Absolutely not urgent.

@jeffra

stas00 • Feb 10 '22 01:02

Actually, I then ran into a bunch of other problems related to build_ext, as we have discussed elsewhere: the dependencies ninja and tqdm won't build, even though binary wheels are available without any problems. Even installing those directly doesn't help!

So the solution is simple:

On a fresh conda environment:

  1. First install deepspeed normally:
pip install deepspeed

which takes care of all the dependencies correctly and swiftly!

  2. Only then proceed to pre-build manually, and everything works.
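
The two steps above could be scripted roughly like this. It is a sketch under assumptions: `workaround_steps` and `run_steps` are hypothetical helpers, the pre-build flags are copied from the command in the original report, and step 2 has to be run from the DeepSpeed source checkout. By default it only prints the commands (dry run).

```python
import os
import shlex
import subprocess
import sys


def workaround_steps():
    """Return the two steps, in order, as (extra_env, argv) pairs."""
    pip = [sys.executable, "-m", "pip"]
    # Step 1: a normal install, which pulls in the dependencies.
    install = ({}, pip + ["install", "deepspeed"])
    # Step 2: the manual pre-build, with the flags from the original report.
    # Must be run from the DeepSpeed source checkout.
    prebuild = (
        {"DS_BUILD_CPU_ADAM": "1", "DS_BUILD_UTILS": "1"},
        pip + ["install", "-e", ".",
               "--global-option=build_ext", "--global-option=-j8",
               "--no-cache", "-v", "--disable-pip-version-check"],
    )
    return [install, prebuild]


def run_steps(steps, dry_run=True):
    """Print each command; execute it too when dry_run is False."""
    for extra_env, argv in steps:
        prefix = " ".join(f"{key}={value}" for key, value in extra_env.items())
        print((prefix + " " + shlex.join(argv)).strip())
        if not dry_run:
            # check=True stops at the first failing step.
            subprocess.run(argv, env={**os.environ, **extra_env}, check=True)


run_steps(workaround_steps())  # dry run: prints the two commands only
```

Calling `run_steps(workaround_steps(), dry_run=False)` would actually execute both installs in order.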

Perhaps it might be simpler to document this on the advanced install page, rather than trying to fix it?

stas00 • Feb 10 '22 01:02

I was stuck on "Installing build dependencies: still running". You saved my life.

631068264 • Jun 09 '23 08:06

@stas00 - clearing through some old issues. This is somewhat stale, and we do have a conda env yml now as well. Closing this, but if others find this and have issues, please open a new issue and link this one for context.

loadams • Aug 14 '23 20:08