
Training fails at this step with a build.ninja error...

Open hopeforus opened this issue 1 year ago • 4 comments

Emitting ninja build file /home/hope/.cache/torch_extensions/py310_cu117/wkv_1024/build.ninja...
Building extension module wkv_1024...
Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
[1/2] /usr/bin/nvcc -DTORCH_EXTENSION_NAME=wkv_1024 -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE="gcc" -DPYBIND11_STDLIB="libstdcpp" -DPYBIND11_BUILD_ABI="cxxabi1011" -isystem /home/hope/miniconda3/envs/rwkv/lib/python3.10/site-packages/torch/include -isystem /home/hope/miniconda3/envs/rwkv/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -isystem /home/hope/miniconda3/envs/rwkv/lib/python3.10/site-packages/torch/include/TH -isystem /home/hope/miniconda3/envs/rwkv/lib/python3.10/site-packages/torch/include/THC -isystem /home/hope/miniconda3/envs/rwkv/include/python3.10 -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_86,code=compute_86 -gencode=arch=compute_86,code=sm_86 --compiler-options '-fPIC' -res-usage --maxrregcount 60 --use_fast_math -O3 -Xptxas -O3 --extra-device-vectorization -DTmax=1024 -std=c++14 -c /home/hope/work/RWKV-LM/RWKV-v4neo/cuda/wkv_cuda.cu -o wkv_cuda.cuda.o
FAILED: wkv_cuda.cuda.o
/usr/bin/nvcc -DTORCH_EXTENSION_NAME=wkv_1024 -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE="gcc" -DPYBIND11_STDLIB="libstdcpp" -DPYBIND11_BUILD_ABI="cxxabi1011" -isystem /home/hope/miniconda3/envs/rwkv/lib/python3.10/site-packages/torch/include -isystem /home/hope/miniconda3/envs/rwkv/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -isystem /home/hope/miniconda3/envs/rwkv/lib/python3.10/site-packages/torch/include/TH -isystem /home/hope/miniconda3/envs/rwkv/lib/python3.10/site-packages/torch/include/THC -isystem /home/hope/miniconda3/envs/rwkv/include/python3.10 -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_86,code=compute_86 -gencode=arch=compute_86,code=sm_86 --compiler-options '-fPIC' -res-usage --maxrregcount 60 --use_fast_math -O3 -Xptxas -O3 --extra-device-vectorization -DTmax=1024 -std=c++14 -c /home/hope/work/RWKV-LM/RWKV-v4neo/cuda/wkv_cuda.cu -o wkv_cuda.cuda.o
In file included from /usr/include/cuda_runtime.h:83,
                 from <command-line>:
/usr/include/crt/host_config.h:138:2: error: #error -- unsupported GNU version! gcc versions later than 8 are not supported!
  138 | #error -- unsupported GNU version! gcc versions later than 8 are not supported!
      |  ^~~~~
ninja: build stopped: subcommand failed.
Traceback (most recent call last):
  File "/home/hope/miniconda3/envs/rwkv/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 1900, in _run_ninja_build
    subprocess.run(
  File "/home/hope/miniconda3/envs/rwkv/lib/python3.10/subprocess.py", line 526, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['ninja', '-v']' returned non-zero exit status 1.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/hope/work/RWKV-LM/RWKV-v4neo/train.py", line 307, in <module>
    from src.model import RWKV
  File "/home/hope/work/RWKV-LM/RWKV-v4neo/src/model.py", line 80, in <module>
    wkv_cuda = load(name=f"wkv_{T_MAX}", sources=["cuda/wkv_op.cpp", "cuda/wkv_cuda.cu"], verbose=True, extra_cuda_cflags=["-res-usage", "--maxrregcount 60", "--use_fast_math", "-O3", "-Xptxas -O3", "--extra-device-vectorization", f"-DTmax={T_MAX}"])
  File "/home/hope/miniconda3/envs/rwkv/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 1284, in load
    return _jit_compile(
  File "/home/hope/miniconda3/envs/rwkv/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 1508, in _jit_compile
    _write_ninja_file_and_build_library(
  File "/home/hope/miniconda3/envs/rwkv/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 1623, in _write_ninja_file_and_build_library
    _run_ninja_build(
  File "/home/hope/miniconda3/envs/rwkv/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 1916, in _run_ninja_build
    raise RuntimeError(message) from e
RuntimeError: Error building extension 'wkv_1024'

hopeforus, Jun 20 '23 02:06

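This is an environment problem. If you can't resolve it yourself, I suggest downloading a Docker image with the environment already set up and testing with that: https://zhuanlan.zhihu.com/p/616986651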

gg22mm, Jun 21 '23 01:06
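
The compiler error in the original post comes from nvcc's host-compiler check in `/usr/include/crt/host_config.h`: the installed gcc is newer than that CUDA toolkit accepts. A minimal sketch of a pre-flight check follows. It assumes `gcc` on `PATH` is the host compiler nvcc picks up, and it takes the limit of 8 from the error text above; other CUDA versions enforce different limits. The helper names `gcc_major` and `host_gcc_ok` are hypothetical, not part of RWKV-LM or PyTorch.

```python
import shutil
import subprocess

# Limit taken from the error text in this issue ("gcc versions later than 8
# are not supported"). This is an assumption for this toolkit only.
MAX_GCC_MAJOR = 8


def gcc_major(version_text: str) -> int:
    """Return the major version from `gcc -dumpversion` output such as '9.4.0'."""
    return int(version_text.strip().split(".")[0])


def host_gcc_ok(version_text: str, max_major: int = MAX_GCC_MAJOR) -> bool:
    """True when the host gcc is not newer than the limit nvcc enforces."""
    return gcc_major(version_text) <= max_major


if __name__ == "__main__":
    gcc = shutil.which("gcc")
    if gcc is None:
        print("gcc not found on PATH")
    else:
        out = subprocess.run(
            [gcc, "-dumpversion"], capture_output=True, text=True
        ).stdout
        try:
            status = "ok" if host_gcc_ok(out) else "too new for this nvcc"
        except ValueError:
            status = "unrecognized version string"
        print(f"gcc {out.strip()}: {status}")
```

If the check reports a gcc that is too new, the usual options are an environment with a matching compiler (such as the Docker image suggested above) or, where an older gcc is installed alongside, pointing nvcc at it with its `-ccbin` option. Whether that suits a given setup is something to verify locally.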

Thanks a lot!

hopeforus, Jun 22 '23 07:06

I ran into the same problem. Did you manage to solve it by configuring the environment?

HuXinjing, Nov 23 '23 08:11

I removed "-Xptxas -O3" from wkv6_cuda and that solved the problem.

Lixuanhe, May 20 '24 12:05
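
The last comment refers to `wkv6_cuda`, a different kernel from the `wkv_cuda.cu` in the original traceback, and its exact flag list is not shown in this thread. As a rough sketch of what removing `"-Xptxas -O3"` looks like, the example below uses the `extra_cuda_cflags` list from the `load(...)` call in the traceback as a stand-in. The helper `without_flag` is hypothetical. Dropping this flag does not change which gcc version nvcc accepts, so it may address a different failure from the one in the original post.

```python
# Flag list copied from the load(...) call shown in the traceback of this
# issue; the wkv6 kernel's list is assumed to be similar but is not shown here.
T_MAX = 1024
extra_cuda_cflags = [
    "-res-usage",
    "--maxrregcount 60",
    "--use_fast_math",
    "-O3",
    "-Xptxas -O3",
    "--extra-device-vectorization",
    f"-DTmax={T_MAX}",
]


def without_flag(flags, unwanted):
    """Return a copy of `flags` with every exact match of `unwanted` removed."""
    return [flag for flag in flags if flag != unwanted]


trimmed = without_flag(extra_cuda_cflags, "-Xptxas -O3")
print(trimmed)
```

The resulting list would then be passed as `extra_cuda_cflags=trimmed` in place of the original list.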