Llama3.1-Finetuning
Llama3.1-Finetuning copied to clipboard
RuntimeError: Error building extension 'fused_adam'
一直报错fused_adam的错误,我的设备是双卡2080ti,版本和requirement一样
FAILED: multi_tensor_adam.cuda.o /usr/bin/nvcc -DTORCH_EXTENSION_NAME=fused_adam -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE="gcc" -DPYBIND11_STDLIB="libstdcpp" -DPYBIND11_BUILD_ABI="cxxabi1011" -I/home/patrickstar/miniconda3/envs/pytorch/lib/python3.10/site-packages/deepspeed/ops/csrc/includes -I/home/patrickstar/miniconda3/envs/pytorch/lib/python3.10/site-packages/deepspeed/ops/csrc/adam -isystem /home/patrickstar/miniconda3/envs/pytorch/lib/python3.10/site-packages/torch/include -isystem /home/patrickstar/miniconda3/envs/pytorch/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -isystem /home/patrickstar/miniconda3/envs/pytorch/lib/python3.10/site-packages/torch/include/TH -isystem /home/patrickstar/miniconda3/envs/pytorch/lib/python3.10/site-packages/torch/include/THC -isystem /home/patrickstar/miniconda3/envs/pytorch/include/python3.10 -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS -D__CUDA_NO_HALF_CONVERSIONS_ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_75,code=compute_75 -gencode=arch=compute_75,code=sm_75 --compiler-options '-fPIC' -O3 -DVERSION_GE_1_1 -DVERSION_GE_1_3 -DVERSION_GE_1_5 -lineinfo --use_fast_math -gencode=arch=compute_75,code=sm_75 -gencode=arch=compute_75,code=compute_75 -std=c++17 -c /home/patrickstar/miniconda3/envs/pytorch/lib/python3.10/site-packages/deepspeed/ops/csrc/adam/multi_tensor_adam.cu -o multi_tensor_adam.cuda.o gcc: fatal error: cannot execute 'cc1plus': execvp: No such file or directory compilation terminated. nvcc fatal : Failed to preprocess host compiler properties. [2/3] c++ -MMD -MF fused_adam_frontend.o.d -DTORCH_EXTENSION_NAME=fused_adam -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE="_gcc" -DPYBIND11_STDLIB="_libstdcpp" -DPYBIND11_BUILD_ABI="_cxxabi1011" -I/home/patrickstar/miniconda3/envs/pytorch/lib/python3.10/site-packages/deepspeed/ops/csrc/includes -I/home/patrickstar/miniconda3/envs/pytorch/lib/python3.10/site-packages/deepspeed/ops/csrc/adam -isystem /home/patrickstar/miniconda3/envs/pytorch/lib/python3.10/site-packages/torch/include -isystem /home/patrickstar/miniconda3/envs/pytorch/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -isystem /home/patrickstar/miniconda3/envs/pytorch/lib/python3.10/site-packages/torch/include/TH -isystem /home/patrickstar/miniconda3/envs/pytorch/lib/python3.10/site-packages/torch/include/THC -isystem /home/patrickstar/miniconda3/envs/pytorch/include/python3.10 -D_GLIBCXX_USE_CXX11_ABI=0 -fPIC -std=c++17 -O3 -std=c++17 -g -Wno-reorder -DVERSION_GE_1_1 -DVERSION_GE_1_3 -DVERSION_GE_1_5 -c /home/patrickstar/miniconda3/envs/pytorch/lib/python3.10/site-packages/deepspeed/ops/csrc/adam/fused_adam_frontend.cpp -o fused_adam_frontend.o ninja: build stopped: subcommand failed. Traceback (most recent call last): File "/home/patrickstar/miniconda3/envs/pytorch/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 2100, in _run_ninja_build subprocess.run( File "/home/patrickstar/miniconda3/envs/pytorch/lib/python3.10/subprocess.py", line 526, in run raise CalledProcessError(retcode, process.args, subprocess.CalledProcessError: Command '['ninja', '-v']' returned non-zero exit status 1.
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/home/patrickstar/llm_learn/Llama3-Finetuning/script/../finetune_llama3.py", line 448, in