DeepSpeed
DeepSpeed copied to clipboard
subprocess.CalledProcessError: Command '['which', 'c++']' returned non-zero exit status 1.
I am trying to train a masked language model from huggingface Trainer API using deepspeed. The training is succesfull while I train on ubuntu operating system while I try to train on Centos I faced this issue.
What is the ouput of running which c++
in your terminal?
What is the ouput of running
which c++
in your terminal?
@tjruwase @pudasainishushant
Sorry, I met the same problem. Have you ever found the solution?
And then I run the command which c++
, it outputs the nothing.
The ds_report
shows strange with lots of no.
--------------------------------------------------
DeepSpeed C++/CUDA extension op report
--------------------------------------------------
NOTE: Ops not installed will be just-in-time (JIT) compiled at
runtime if needed. Op compatibility means that your system
meet the required dependencies to JIT install the op.
--------------------------------------------------
JIT compiled ops requires ninja
ninja .................. [OKAY]
--------------------------------------------------op name ................ installed .. compatible
--------------------------------------------------
[WARNING] async_io requires the dev libaio .so object and headers but these were not found.
[WARNING] async_io: please install the libaio-dev package with apt
[WARNING] If libaio is already installed (perhaps from source), try setting the CFLAGS and LDFLAGS environment variables to where it can be found.
async_io ............... [NO] ....... [NO]
cpu_adagrad ............ [NO] ....... [OKAY]
cpu_adam ............... [NO] ....... [OKAY]
fused_adam ............. [NO] ....... [OKAY]
fused_lamb ............. [NO] ....... [OKAY]
quantizer .............. [NO] ....... [OKAY]
random_ltd ............. [NO] ....... [OKAY]
[WARNING] please install triton==1.0.0 if you want to use sparse attention
sparse_attn ............ [NO] ....... [NO]
spatial_inference ...... [NO] ....... [OKAY]
transformer ............ [NO] ....... [OKAY]
stochastic_transformer . [NO] ....... [OKAY]
transformer_inference .. [NO] ....... [OKAY]
utils .................. [NO] ....... [OKAY]
--------------------------------------------------
DeepSpeed general environment info:
torch install path ............... ['/opt/conda/lib/python3.10/site-packages/torch']
torch version .................... 1.13.1
deepspeed install path ........... ['/opt/conda/lib/python3.10/site-packages/deepspeed']
deepspeed info ................... 0.8.3, unknown, unknown
torch cuda version ............... 11.6
torch hip version ................ None
nvcc version ..................... 11.6
deepspeed wheel compiled w. ...... torch 1.13, cuda 11.6
Thanks for your reply.
What is the ouput of running
which c++
in your terminal?@tjruwase @pudasainishushant Sorry, I met the same problem. Have you ever found the solution? And then I run the command
which c++
, it outputs the nothing. Theds_report
shows strange with lots of no.-------------------------------------------------- DeepSpeed C++/CUDA extension op report -------------------------------------------------- NOTE: Ops not installed will be just-in-time (JIT) compiled at runtime if needed. Op compatibility means that your system meet the required dependencies to JIT install the op. -------------------------------------------------- JIT compiled ops requires ninja ninja .................. [OKAY] --------------------------------------------------op name ................ installed .. compatible -------------------------------------------------- [WARNING] async_io requires the dev libaio .so object and headers but these were not found. [WARNING] async_io: please install the libaio-dev package with apt [WARNING] If libaio is already installed (perhaps from source), try setting the CFLAGS and LDFLAGS environment variables to where it can be found. async_io ............... [NO] ....... [NO] cpu_adagrad ............ [NO] ....... [OKAY] cpu_adam ............... [NO] ....... [OKAY] fused_adam ............. [NO] ....... [OKAY] fused_lamb ............. [NO] ....... [OKAY] quantizer .............. [NO] ....... [OKAY] random_ltd ............. [NO] ....... [OKAY] [WARNING] please install triton==1.0.0 if you want to use sparse attention sparse_attn ............ [NO] ....... [NO] spatial_inference ...... [NO] ....... [OKAY] transformer ............ [NO] ....... [OKAY] stochastic_transformer . [NO] ....... [OKAY] transformer_inference .. [NO] ....... [OKAY] utils .................. [NO] ....... [OKAY] -------------------------------------------------- DeepSpeed general environment info: torch install path ............... ['/opt/conda/lib/python3.10/site-packages/torch'] torch version .................... 1.13.1 deepspeed install path ........... ['/opt/conda/lib/python3.10/site-packages/deepspeed'] deepspeed info ................... 0.8.3, unknown, unknown torch cuda version ............... 11.6 torch hip version ................ None nvcc version ..................... 11.6 deepspeed wheel compiled w. ...... torch 1.13, cuda 11.6
Thanks for your reply.
please install the C++ pkgs on your server e.g. "sudo yum install gcc-c++"
@zx12671 I do 'sudo apt-get install g++',but I had the error.Do you have solutions to solve it ?
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
Some packages could not be installed. This may mean that you have
requested an impossible situation or if you are using the unstable
distribution that some required packages have not yet been created
or been moved out of Incoming.
The following information may help to resolve the situation:
The following packages have unmet dependencies:
libc6-dev : Depends: libc6 (= 2.31-0ubuntu9.10) but 2.35-0ubuntu3.1 is to be installed
Depends: libc-dev-bin (= 2.31-0ubuntu9.10) but it is not going to be installed
E: Unable to correct problems, you have held broken packages.
Hi @summer-silence - you will need to follow the directions from that error and also install one of the packages that is listed there, libc6 for example.