Windows 11 build error, cl.exe exited with error code 2

Open anto18671 opened this issue 2 years ago • 6 comments

Describe the bug

csrc/transformer/inference/csrc/pt_binding.cpp(1977): note: see reference to function template instantiation 'std::vector<at::Tensor,std::allocator<at::Tensor>> ds_rms_mlp_gemm(at::Tensor &,at::Tensor &,at::Tensor &,at::Tensor &,at::Tensor &,const float,at::Tensor &,at::Tensor &,bool,int,bool)' being compiled
error: command 'C:\Program Files (x86)\Microsoft Visual Studio\2019\Community\VC\Tools\MSVC\14.29.30133\bin\HostX86\x64\cl.exe' failed with exit code 2

To Reproduce

Steps to reproduce the behavior:

  1. set DS_BUILD_AIO=0
  2. set DS_BUILD_SPARSE_ATTN=0
  3. python setup.py bdist_wheel
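The steps above can also be scripted. The following is a minimal sketch, assuming it is run from the root of the repository; the helper names `build_env`, `build_command`, and `run_build` are hypothetical and not part of DeepSpeed. Only the two `DS_BUILD_*` variables and the `setup.py bdist_wheel` command come from the steps themselves.

```python
import os
import subprocess
import sys

# Flags taken from the reproduce steps: both ops are disabled for the build.
DISABLED_OPS = {"DS_BUILD_AIO": "0", "DS_BUILD_SPARSE_ATTN": "0"}


def build_env(base=None):
    """Return a copy of the environment with the DS_BUILD_* flags applied."""
    env = dict(os.environ if base is None else base)
    env.update(DISABLED_OPS)
    return env


def build_command(python=sys.executable):
    """Return the command equivalent to `python setup.py bdist_wheel`."""
    return [python, "setup.py", "bdist_wheel"]


def run_build(repo_root="."):
    """Run the wheel build in repo_root and return the process exit code."""
    result = subprocess.run(build_command(), cwd=repo_root, env=build_env())
    return result.returncode
```

Calling `run_build()` from the repository root should behave like typing the three steps by hand, with the advantage that the flags cannot leak into later commands in the same window.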

Expected behavior

DeepSpeed builds and installs.

ds_report output

The ds_report command is not recognized.

System info

  • OS: Windows 11
  • GPU: 1x RTX 3070 Mobile
  • Python 3.10.11
  • Cuda compilation tools, release 11.8, V11.8.89

anto18671, May 04 '23 14:05

Hi @anto18671 -

Can you please share more of the error log output? I see cl.exe failed with exit code 2.

Also, can you confirm that you're running in an administrator command window? You can build from there by running the build_win.bat script at the root of the repo.

loadams, May 04 '23 17:05

csrc/transformer/inference/includes\inference_context.h(169): note: consider using '%zu' in the format string
csrc/transformer/inference/includes\inference_context.h(169): warning C4477: 'printf' : format string '%lu' requires an argument of type 'unsigned long', but variadic argument 2 has type 'size_t'
csrc/transformer/inference/includes\inference_context.h(169): note: consider using '%zu' in the format string
csrc/transformer/inference/includes\inference_context.h(169): warning C4477: 'printf' : format string '%lu' requires an argument of type 'unsigned long', but variadic argument 3 has type 'size_t'
csrc/transformer/inference/includes\inference_context.h(169): note: consider using '%zu' in the format string
csrc/transformer/inference/csrc/pt_binding.cpp(536): error C2398: Element '1': conversion from 'size_t' to '_Ty' requires a narrowing conversion with [ _Ty=int64_t ]
csrc/transformer/inference/csrc/pt_binding.cpp(1977): note: see reference to function template instantiation 'std::vector<at::Tensor,std::allocator<at::Tensor>> ds_softmax_context(at::Tensor &,at::Tensor &,int,bool,bool,int,float,bool,bool,int,bool,unsigned int,unsigned int,at::Tensor &)' being compiled
csrc/transformer/inference/csrc/pt_binding.cpp(537): error C2398: Element '2': conversion from 'size_t' to '_Ty' requires a narrowing conversion with [ _Ty=int64_t ]
csrc/transformer/inference/csrc/pt_binding.cpp(545): error C2398: Element '1': conversion from 'size_t' to '_Ty' requires a narrowing conversion with [ _Ty=int64_t ]
csrc/transformer/inference/csrc/pt_binding.cpp(546): error C2398: Element '2': conversion from 'size_t' to '_Ty' requires a narrowing conversion with [ _Ty=int64_t ]
csrc/transformer/inference/csrc/pt_binding.cpp(1570): error C2398: Element '3': conversion from 'const size_t' to '_Ty' requires a narrowing conversion with [ _Ty=int64_t ]
csrc/transformer/inference/csrc/pt_binding.cpp(1977): note: see reference to function template instantiation 'std::vector<at::Tensor,std::allocator<at::Tensor>> ds_rms_mlp_gemm(at::Tensor &,at::Tensor &,at::Tensor &,at::Tensor &,at::Tensor &,const float,at::Tensor &,at::Tensor &,bool,int,bool)' being compiled
error: command 'C:\Program Files (x86)\Microsoft Visual Studio\2022\BuildTools\VC\Tools\MSVC\14.35.32215\bin\HostX86\x64\cl.exe' failed with exit code 2

I tried running build_win.bat in an administrator window, but I got the same error.

anto18671, May 04 '23 17:05
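A log like the one above mixes notes, warnings, and hard errors, which makes it easy to miss what actually stopped the build. The sketch below is one possible way to separate them, assuming the usual MSVC `file(line): error|warning Cxxxx: message` layout seen in this log; the function names `parse_diagnostics` and `count_by_code` are hypothetical helpers, not part of any DeepSpeed or MSVC tooling.

```python
import re
from collections import Counter

# Matches MSVC diagnostics such as:
#   path\file.cpp(536): error C2398: message
# Lines tagged "note:" are intentionally not matched.
DIAGNOSTIC = re.compile(
    r"^(?P<file>.+?)\((?P<line>\d+)\): "
    r"(?P<kind>error|warning) (?P<code>C\d+): (?P<message>.*)$"
)


def parse_diagnostics(log_text):
    """Return one dict per error or warning line found in log_text."""
    diagnostics = []
    for raw in log_text.splitlines():
        match = DIAGNOSTIC.match(raw.strip())
        if match:
            entry = match.groupdict()
            entry["line"] = int(entry["line"])
            diagnostics.append(entry)
    return diagnostics


def count_by_code(diagnostics):
    """Count diagnostics grouped by (kind, code), e.g. ('error', 'C2398')."""
    return Counter((d["kind"], d["code"]) for d in diagnostics)
```

Because the `error` and `warning` keywords and the `Cxxxx` codes stay in English even when the compiler messages are localized, this kind of filter should work on the log regardless of the display language.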

Thanks, @anto18671 - since exit code 2 may mean that the system cannot find the path to cl.exe, do you know whether you installed Python or Visual Studio first? Could you try reinstalling Python after installing the Visual Studio runtime? That way the path should be found.

loadams, May 08 '23 19:05
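The path hypothesis above can be checked directly before reinstalling anything. The log earlier in the thread already contains compiler diagnostics, which suggests cl.exe was launched, so this is worth verifying rather than assuming. The sketch below uses the standard library's `shutil.which`; the wrapper names `find_compiler` and `describe_compiler` are hypothetical. It only reports what the current command window sees on PATH, while the build itself may locate the compiler another way, as the full cl.exe path in the error message suggests.

```python
import shutil


def find_compiler(name="cl", path=None):
    """Return the full path of the compiler if it is on PATH, else None."""
    return shutil.which(name, path=path)


def describe_compiler(name="cl", path=None):
    """Return a one-line, human-readable summary of the lookup."""
    location = find_compiler(name, path)
    if location is None:
        return f"{name} not found on PATH"
    return f"{name} found at {location}"
```

Running `print(describe_compiler())` in the same command window used for the build shows whether that window can resolve `cl` at all.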

Also, anecdotally, I've seen that this tends to work on Windows with Python 3.8.10 but not always with Python 3.10. Could you try that as well? If it works, we can debug further.

loadams, May 08 '23 19:05

I confirmed on Windows 11 that Python 3.11 fails with this error but 3.8 works fine for me, so that is a workaround we can use.

loadams, May 09 '23 18:05
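To make the version observations in this thread easy to check before starting a long build, here is a small sketch. The two sets record only what was reported here (3.8 working; 3.10 and 3.11 failing) and are not an official support matrix; `thread_status` is a hypothetical helper name.

```python
import sys

# Observations reported in this thread, not an official support matrix.
REPORTED_WORKING = {(3, 8)}
REPORTED_FAILING = {(3, 10), (3, 11)}


def thread_status(version_info=sys.version_info):
    """Classify a Python version by what this thread reported for it."""
    key = (version_info[0], version_info[1])
    if key in REPORTED_WORKING:
        return "reported working"
    if key in REPORTED_FAILING:
        return "reported failing"
    return "not reported"
```

Calling `thread_status()` with no arguments classifies the interpreter that is currently running.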

This seems to be a similar issue. It also recommends either Python 3.8 or an additional set of steps to fix the libs: https://stackoverflow.com/questions/71242919/pip-install-results-in-this-error-cl-exe-failed-with-exit-code-2

loadams, May 10 '23 18:05

Hi @anto18671 - I'm going to close this for now since I can't do any other confirmation on this, though if you do have time, please check the above solution and let me know if that fixes it. If so I can update our installation docs for Windows.

loadams, May 16 '23 18:05