DeepSpeed [BUG] DeepSpeed build issues on Windows

I encountered an error while installing DeepSpeed with the command given in the documentation.

Using: pip install deepspeed>=0.9.0

error: subprocess-exited-with-error

× python setup.py egg_info did not run successfully.
│ exit code: 1
╰─> [16 lines of output]
    Traceback (most recent call last):
      File "<string>", line 2, in <module>
      File "<pip-setuptools-caller>", line 34, in <module>
      File "C:\Users\FZG\AppData\Local\Temp\pip-install-1mws4aau\deepspeed_2c8d7d0d390249b982dc5bb7cc184ec0\setup.py", line 81, in <module>
        cuda_major_ver, cuda_minor_ver = installed_cuda_version()
      File "C:\Users\FZG\AppData\Local\Temp\pip-install-1mws4aau\deepspeed_2c8d7d0d390249b982dc5bb7cc184ec0\op_builder\builder.py", line 43, in installed_cuda_version
        output = subprocess.check_output([cuda_home + "/bin/nvcc", "-V"], universal_newlines=True)
      File "e:\install\python3.9.5\lib\subprocess.py", line 424, in check_output
        return run(*popenargs, stdout=PIPE, timeout=timeout, check=True,
      File "e:\install\python3.9.5\lib\subprocess.py", line 505, in run
        with Popen(*popenargs, **kwargs) as process:
      File "e:\install\python3.9.5\lib\subprocess.py", line 951, in __init__
        self._execute_child(args, executable, preexec_fn, close_fds,
      File "e:\install\python3.9.5\lib\subprocess.py", line 1420, in _execute_child
        hp, ht, pid, tid = _winapi.CreateProcess(executable, args,
    FileNotFoundError: [WinError 2]
    [end of output]

note: This error originates from a subprocess, and is likely not a problem with pip.
error: metadata-generation-failed

× Encountered error while generating package metadata.
╰─> See above for output.

My environment: Windows 10, CUDA version 12.1, Python 3.9.5

Apr 21 '23 14:04 OUYANGSIR

Hi @OUYANGSIR - are you running in the Windows Command Prompt or in WSL? Have you followed the Windows install directions here?

Apr 21 '23 20:04 loadams

I ran it directly from the Command Prompt, and this problem occurred.

Apr 24 '23 19:04 OUYANGSIR

Please just use pip install deepspeed. The ">=0.9" is not part of the command; it just indicates the recommended version.

May 25 '23 03:05 GalaxyHe2023
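
One likely reason the version specifier causes trouble: in the Windows Command Prompt, as in most shells, an unquoted > is treated as output redirection, so the >=0.9.0 part does not reach pip as a version constraint. Quoting the requirement (pip install "deepspeed>=0.9.0") avoids that. The sketch below shows the same idea from Python, where the requirement is passed as a single argument and no shell parses it. The helper name pip_install_args is hypothetical and not part of pip or DeepSpeed.

```python
import sys
from typing import List


def pip_install_args(requirement: str) -> List[str]:
    # Hypothetical helper (not part of pip or DeepSpeed).
    # The requirement is passed as a single argv element, so no shell
    # gets a chance to interpret ">" as output redirection.
    return [sys.executable, "-m", "pip", "install", requirement]


args = pip_install_args("deepspeed>=0.9.0")
print(args[1:])  # args[0] is the interpreter path and varies by machine
# To run the install for real: subprocess.run(args, check=True)
```

Note that this only addresses how the version constraint is passed; it does not by itself fix the nvcc lookup failure shown in the traceback.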

That's correct. @OUYANGSIR - are you seeing this with pip install deepspeed as well? (That will give you the latest version, which is >0.9.0 anyway.)

If you are still hitting this, please comment or re-open the issue. But otherwise we will assume this is resolved.

Jun 06 '23 17:06 loadams

Hi everyone, I happen to have this exact same issue running "pip install deepspeed" inside an activated venv on Windows 10 (using cmd and Python 3.10.9).

(venv) (base) B:\AI\img\kohya_ss4\kohya_ss\venv\Scripts>pip install deepspeed
Collecting deepspeed
  Using cached deepspeed-0.11.1.tar.gz (1.1 MB)
  Preparing metadata (setup.py) ... error
  error: subprocess-exited-with-error

× python setup.py egg_info did not run successfully.
│ exit code: 1
╰─> [16 lines of output]
    Traceback (most recent call last):
      File "<string>", line 2, in <module>
      File "<pip-setuptools-caller>", line 34, in <module>
      File "C:\Users\Nicolas\AppData\Local\Temp\pip-install-89a6y6l5\deepspeed_3b89da14f7b44dd5adb7b5a0f78e6b29\setup.py", line 100, in <module>
        cuda_major_ver, cuda_minor_ver = installed_cuda_version()
      File "C:\Users\Nicolas\AppData\Local\Temp\pip-install-89a6y6l5\deepspeed_3b89da14f7b44dd5adb7b5a0f78e6b29\op_builder\builder.py", line 43, in installed_cuda_version
        output = subprocess.check_output([cuda_home + "/bin/nvcc", "-V"], universal_newlines=True)
      File "C:\Python310\lib\subprocess.py", line 421, in check_output
        return run(*popenargs, stdout=PIPE, timeout=timeout, check=True,
      File "C:\Python310\lib\subprocess.py", line 503, in run
        with Popen(*popenargs, **kwargs) as process:
      File "C:\Python310\lib\subprocess.py", line 971, in __init__
        self._execute_child(args, executable, preexec_fn, close_fds,
      File "C:\Python310\lib\subprocess.py", line 1440, in _execute_child
        hp, ht, pid, tid = _winapi.CreateProcess(executable, args,
    FileNotFoundError: [WinError 2] The system cannot find the file specified
    [end of output]

note: This error originates from a subprocess, and is likely not a problem with pip.
error: metadata-generation-failed

× Encountered error while generating package metadata.
╰─> See above for output.

If anyone knows how to explain this issue and how to solve it, let me know. Thanks!

Oct 10 '23 23:10 LeF0URBE
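
For readers trying to interpret the traceback: the FileNotFoundError is raised by subprocess when the executable it is asked to launch does not exist, which here is the nvcc path built from cuda_home. A minimal sketch that reproduces the same failure mode with the call shape shown in the traceback follows. The function name probe_nvcc and the example directory are hypothetical, and how DeepSpeed actually resolves cuda_home is not shown in this thread.

```python
import subprocess


def probe_nvcc(cuda_home: str) -> str:
    # Hypothetical helper; cuda_home stands in for whatever directory the
    # build script resolved as the CUDA root.
    nvcc = cuda_home + "/bin/nvcc"  # same concatenation as in the traceback
    try:
        # Same call shape as builder.py line 43 in the traceback.
        return subprocess.check_output([nvcc, "-V"], universal_newlines=True)
    except FileNotFoundError as exc:
        # Raised when the operating system cannot find the executable.
        return "nvcc not found at {!r}: {}".format(nvcc, exc)


# A directory that does not exist, so the lookup fails the same way.
print(probe_nvcc("C:/no/such/cuda/root"))
```

If the printed message reports that nvcc was not found, the problem is the missing compiler or a wrong CUDA root rather than pip itself.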

Hi @LeF0URBE - there is a set of special Windows install directions on the repo, have you tried those?

If not, we know there are some issues with this that we are working on, and Windows support is not yet complete. However, we do know that WSL works well on Windows, and if you're able to use it, that is currently the recommended route, since all features of DeepSpeed are available there too. Does that help? If not, can you open a new issue?

Oct 11 '23 17:10 loadams

@loadams I think I can tell you why so many people are having issues with Windows setup.

The instructions are pretty unclear, and I would suggest that someone at MS try a fresh setup on a non-development system to test them out.

The main issue is this line from the instructions (https://github.com/microsoft/DeepSpeed#windows):

Install pytorch, such as pytorch 1.8 + cuda 11.1

If someone on Windows is running CUDA apps with the standard Nvidia drivers and runs "pip show torch", it will quite happily tell them they have both PyTorch AND CUDA (as below). All their other CUDA apps and Python scripts that use CUDA work fine on Windows, so it must be something wrong with DeepSpeed... right?

Well, it's a bit of both, but mainly the instructions are lacking in depth.

If I just run through the instructions, it still fails! You can see in the image below that it's failing on the CUDA_HOME environment variable and therefore failing to run NVCC.

First off, I don't think NVCC is included with the standard Nvidia Windows driver suite that you use for most CUDA apps, so you need to install the Nvidia CUDA Toolkit. (EDIT - I have just uninstalled my CUDA Toolkit and gone back to the Nvidia driver, and can confirm you DON'T get NVCC with the standard Nvidia driver on Windows.)

That will get you NVCC installed on your system. However, there is another issue. By default, the install routine for the Nvidia CUDA Toolkit does not appear to create the CUDA_HOME environment variable either (even after a reboot). To check your environment variables, open a command prompt and type set, which lists them all. CUDA_HOME should hold the same path as the CUDA_PATH environment variable, which the installer DOES create.
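
As a rough alternative to reading through the output of set, the check described above can be sketched in Python. This is only an illustration: cuda_env_report is a hypothetical helper name, and the assumption that nvcc lives in the bin subfolder of the CUDA root is taken from the "/bin/nvcc" path in the traceback earlier in this thread.

```python
import os
import shutil


def cuda_env_report(env=None):
    # Hypothetical diagnostic helper, not part of DeepSpeed or PyTorch.
    env = os.environ if env is None else env
    report = {}
    for name in ("CUDA_HOME", "CUDA_PATH"):
        root = env.get(name)
        nvcc = None
        if root:
            # nvcc.exe on Windows, plain nvcc elsewhere.
            for exe in ("nvcc.exe", "nvcc"):
                candidate = os.path.join(root, "bin", exe)
                if os.path.isfile(candidate):
                    nvcc = candidate
                    break
        report[name] = {"value": root, "nvcc": nvcc}
    # Independent of the two variables: is nvcc reachable via the real PATH?
    report["nvcc_on_PATH"] = shutil.which("nvcc")
    return report


for key, value in cuda_env_report().items():
    print(key, "->", value)
```

A CUDA_PATH entry with an nvcc path next to an empty CUDA_HOME entry would match the situation described here: the toolkit is installed, but the variable the DeepSpeed install looks for is missing.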

(Yes, I know CUDA 12.3 is not supported, as I show below; I was just in that Python environment when I took the screenshot.)

As I have installed the CUDA Toolkit 12.3, I can set CUDA_HOME to the same value as my CUDA_PATH environment variable with the following command at the command prompt. The Nvidia installer on Windows only creates CUDA_PATH, but the DeepSpeed install wants the CUDA_HOME environment variable (both hold the same path):

set CUDA_HOME=C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.3

Once you have done that, the install will continue on, though I still personally have other issues yet to look at.
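
Keep in mind that a variable set with set only lasts for the current command prompt session. For anyone driving the install from a script, the same workaround can be sketched in Python by copying CUDA_PATH into CUDA_HOME for the child process only. The helper name env_with_cuda_home is hypothetical, and whether this is sufficient for a complete build is not something this thread confirms.

```python
import os


def env_with_cuda_home(env=None):
    # Hypothetical helper: returns a copy of the environment in which
    # CUDA_HOME mirrors CUDA_PATH when only CUDA_PATH is set.
    result = dict(os.environ if env is None else env)
    if not result.get("CUDA_HOME") and result.get("CUDA_PATH"):
        result["CUDA_HOME"] = result["CUDA_PATH"]
    return result


child_env = env_with_cuda_home()
print("CUDA_HOME for child processes:", child_env.get("CUDA_HOME"))
# Possible use (not run here):
# subprocess.run([sys.executable, "-m", "pip", "install", "deepspeed"],
#                env=child_env, check=True)
```

An existing CUDA_HOME is left untouched, so the sketch only fills the gap and never overrides a value that was set deliberately.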

The instructions on the front page for Windows should at least be this https://github.com/microsoft/DeepSpeed/issues/4729

Thanks

Nov 26 '23 00:11 erew123

Hi @erew123 - thanks for your comment on that, I will try to grab a fresh Windows machine and test the steps, and then we can get that PR reviewed/merged. For now I'll point another user to this comment.

Nov 28 '23 16:11 loadams
