🐛 Describe the bug
Traceback (most recent call last):
File "/mnt/afs/zhangaqiang/conda_envs/cloud-ai-lab/lib/python3.8/site-packages/colossalai/kernel/op_builder/builder.py", line 161, in load
op_module = self.import_op()
File "/mnt/afs/zhangaqiang/conda_envs/cloud-ai-lab/lib/python3.8/site-packages/colossalai/kernel/op_builder/builder.py", line 109, in import_op
return importlib.import_module(self.prebuilt_import_path)
File "/mnt/afs/zhangaqiang/conda_envs/cloud-ai-lab/lib/python3.8/importlib/__init__.py", line 127, in import_module
return _bootstrap._gcd_import(name[level:], package, level)
File "<frozen importlib._bootstrap>", line 1014, in _gcd_import
File "<frozen importlib._bootstrap>", line 991, in _find_and_load
File "<frozen importlib._bootstrap>", line 973, in _find_and_load_unlocked
ModuleNotFoundError: No module named 'colossalai._C.fused_optim'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/mnt/afs/zhangaqiang/conda_envs/cloud-ai-lab/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1900, in _run_ninja_build
subprocess.run(
File "/mnt/afs/zhangaqiang/conda_envs/cloud-ai-lab/lib/python3.8/subprocess.py", line 516, in run
raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['ninja', '-v']' returned non-zero exit status 1.
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "train_sft.py", line 221, in <module>
train(args)
File "train_sft.py", line 89, in train
optim = HybridAdam(model.parameters(), lr=args.lr, clipping_norm=1.0)
File "/mnt/afs/zhangaqiang/conda_envs/cloud-ai-lab/lib/python3.8/site-packages/colossalai/nn/optimizer/hybrid_adam.py", line 87, in __init__
fused_optim = FusedOptimBuilder().load()
File "/mnt/afs/zhangaqiang/conda_envs/cloud-ai-lab/lib/python3.8/site-packages/colossalai/kernel/op_builder/builder.py", line 189, in load
op_module = load(
File "/mnt/afs/zhangaqiang/conda_envs/cloud-ai-lab/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1284, in load
return _jit_compile(
File "/mnt/afs/zhangaqiang/conda_envs/cloud-ai-lab/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1508, in _jit_compile
_write_ninja_file_and_build_library(
File "/mnt/afs/zhangaqiang/conda_envs/cloud-ai-lab/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1623, in _write_ninja_file_and_build_library
_run_ninja_build(
File "/mnt/afs/zhangaqiang/conda_envs/cloud-ai-lab/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1916, in _run_ninja_build
Traceback (most recent call last):
File "/mnt/afs/zhangaqiang/conda_envs/cloud-ai-lab/lib/python3.8/site-packages/colossalai/kernel/op_builder/builder.py", line 161, in load
op_module = self.import_op()
File "/mnt/afs/zhangaqiang/conda_envs/cloud-ai-lab/lib/python3.8/site-packages/colossalai/kernel/op_builder/builder.py", line 109, in import_op
return importlib.import_module(self.prebuilt_import_path)
File "/mnt/afs/zhangaqiang/conda_envs/cloud-ai-lab/lib/python3.8/importlib/__init__.py", line 127, in import_module
return _bootstrap._gcd_import(name[level:], package, level)
File "<frozen importlib._bootstrap>", line 1014, in _gcd_import
File "<frozen importlib._bootstrap>", line 991, in _find_and_load
File "<frozen importlib._bootstrap>", line 973, in _find_and_load_unlocked
ModuleNotFoundError: No module named 'colossalai._C.fused_optim'
Environment
torch 1.13
colossalai==0.3.3
coati==1.0.0
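Reading the traceback above, the `ModuleNotFoundError` looks like the expected first step rather than the real failure: `load` in `builder.py` tries to import a prebuilt `colossalai._C.fused_optim`, and when that import fails it goes on to `torch.utils.cpp_extension.load`, which runs `ninja`. Below is a minimal sketch of that try-then-build pattern. It is not ColossalAI's actual code; `load_op` and `build_jit` are hypothetical names used only for illustration.

```python
import importlib


def load_op(prebuilt_import_path, build_jit):
    """Import a prebuilt extension module, or build it on demand.

    `build_jit` is a hypothetical zero-argument callable standing in for
    the JIT build step (torch.utils.cpp_extension.load in the traceback).
    """
    try:
        return importlib.import_module(prebuilt_import_path)
    except ImportError:
        # ModuleNotFoundError is a subclass of ImportError. Reaching this
        # branch only means no prebuilt kernel is installed; the real
        # failure, if any, is whatever build_jit raises next.
        return build_jit()
```

Because the build runs inside the `except` block, a build failure is reported as "During handling of the above exception, another exception occurred", so the error to chase is the `ninja` one, not the missing module.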
Thank you! I uninstalled and reinstalled colossalai, but it still gives the same error.
Here is the complete error log:
WARNING:torch.distributed.run:
Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed.
/data/jd_data/miniconda3/envs/opendit/lib/python3.10/site-packages/colossalai/shardformer/layer/normalization.py:45: UserWarning: Please install apex from source (https://github.com/NVIDIA/apex) to use the fused layernorm kernel
warnings.warn("Please install apex from source (https://github.com/NVIDIA/apex) to use the fused layernorm kernel")
/data/jd_data/miniconda3/envs/opendit/lib/python3.10/site-packages/colossalai/shardformer/layer/normalization.py:45: UserWarning: Please install apex from source (https://github.com/NVIDIA/apex) to use the fused layernorm kernel
warnings.warn("Please install apex from source (https://github.com/NVIDIA/apex) to use the fused layernorm kernel")
/data/jd_data/miniconda3/envs/opendit/lib/python3.10/site-packages/colossalai/initialize.py:48: UserWarning: config is deprecated and will be removed soon.
warnings.warn("config is deprecated and will be removed soon.")
[03/18/24 15:27:39] INFO colossalai - colossalai - INFO: /data/jd_data/miniconda3/envs/opendit/lib/python3.10/site-packages/colossalai/initialize.py:67 launch
INFO colossalai - colossalai - INFO: Distributed environment is initialized, world size: 2
[2024-03-18 15:27:39] Experiment directory created at ./outputs/019-DiT-XL-2
[2024-03-18 15:27:39] Added key: store_based_barrier_key:2 to store for rank: 0
[2024-03-18 15:27:39] Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes.
[2024-03-18 15:27:39] Added key: store_based_barrier_key:3 to store for rank: 0
[2024-03-18 15:27:39] Rank 0: Completed store-based barrier for key:store_based_barrier_key:3 with 2 nodes.
[2024-03-18 15:27:39] Added key: store_based_barrier_key:4 to store for rank: 0
[2024-03-18 15:27:39] Rank 0: Completed store-based barrier for key:store_based_barrier_key:4 with 2 nodes.
[2024-03-18 15:27:47] Model params: 642.76 M
[extension] Compiling the JIT cpu_adam_x86 kernel during runtime now
[extension] Compiling the JIT cpu_adam_x86 kernel during runtime now
[extension] Time taken to compile cpu_adam_x86 op: 22.748290538787842 seconds
[extension] Compiling the JIT fused_optim_cuda kernel during runtime now
Traceback (most recent call last):
File "/data/jd_data/miniconda3/envs/opendit/lib/python3.10/site-packages/colossalai/kernel/extensions/cpp_extension.py", line 128, in load
op_kernel = self.import_op()
File "/data/jd_data/miniconda3/envs/opendit/lib/python3.10/site-packages/colossalai/kernel/extensions/cpp_extension.py", line 58, in import_op
return importlib.import_module(self.prebuilt_import_path)
File "/data/jd_data/miniconda3/envs/opendit/lib/python3.10/importlib/__init__.py", line 126, in import_module
return _bootstrap._gcd_import(name[level:], package, level)
File "<frozen importlib._bootstrap>", line 1050, in _gcd_import
File "<frozen importlib._bootstrap>", line 1027, in _find_and_load
File "<frozen importlib._bootstrap>", line 1004, in _find_and_load_unlocked
ModuleNotFoundError: No module named 'colossalai._C.fused_optim_cuda'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/data/jd_data/miniconda3/envs/opendit/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 1900, in _run_ninja_build
subprocess.run(
File "/data/jd_data/miniconda3/envs/opendit/lib/python3.10/subprocess.py", line 526, in run
raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['ninja', '-v']' returned non-zero exit status 1.
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/data/jd_data/ColossalAI/OpenDiT/train.py", line 411, in <module>
main(args)
File "/data/jd_data/ColossalAI/OpenDiT/train.py", line 206, in main
optimizer = HybridAdam(
File "/data/jd_data/miniconda3/envs/opendit/lib/python3.10/site-packages/colossalai/nn/optimizer/hybrid_adam.py", line 88, in __init__
fused_optim = FusedOptimizerLoader().load()
File "/data/jd_data/miniconda3/envs/opendit/lib/python3.10/site-packages/colossalai/kernel/kernel_loader.py", line 81, in load
return usable_exts[0].load()
File "/data/jd_data/miniconda3/envs/opendit/lib/python3.10/site-packages/colossalai/kernel/extensions/cpp_extension.py", line 132, in load
op_kernel = self.build_jit()
File "/data/jd_data/miniconda3/envs/opendit/lib/python3.10/site-packages/colossalai/kernel/extensions/cuda_extension.py", line 79, in build_jit
op_kernel = load(
File "/data/jd_data/miniconda3/envs/opendit/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 1284, in load
return _jit_compile(
File "/data/jd_data/miniconda3/envs/opendit/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 1508, in _jit_compile
_write_ninja_file_and_build_library(
File "/data/jd_data/miniconda3/envs/opendit/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 1623, in _write_ninja_file_and_build_library
_run_ninja_build(
File "/data/jd_data/miniconda3/envs/opendit/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 1916, in _run_ninja_build
raise RuntimeError(message) from e
RuntimeError: Error building extension 'fused_optim_cuda': [1/7] /data/jd_data/miniconda3/envs/opendit/bin/nvcc -DTORCH_EXTENSION_NAME=fused_optim_cuda -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE="gcc" -DPYBIND11_STDLIB="libstdcpp" -DPYBIND11_BUILD_ABI="cxxabi1011" -I/data/jd_data/miniconda3/envs/opendit/include -isystem /data/jd_data/miniconda3/envs/opendit/lib/python3.10/site-packages/torch/include -isystem /data/jd_data/miniconda3/envs/opendit/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -isystem /data/jd_data/miniconda3/envs/opendit/lib/python3.10/site-packages/torch/include/TH -isystem /data/jd_data/miniconda3/envs/opendit/lib/python3.10/site-packages/torch/include/THC -isystem /data/jd_data/miniconda3/envs/opendit/include -isystem /data/jd_data/miniconda3/envs/opendit/include/python3.10 -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS -D__CUDA_NO_HALF_CONVERSIONS -D__CUDA_NO_BFLOAT16_CONVERSIONS -D__CUDA_NO_HALF2_OPERATORS --expt-relaxed-constexpr -gencode=arch=compute_86,code=compute_86 -gencode=arch=compute_86,code=sm_86 --compiler-options '-fPIC' -O3 --use_fast_math -lineinfo -gencode arch=compute_60,code=sm_60 -gencode arch=compute_61,code=sm_61 -gencode arch=compute_70,code=sm_70 -gencode arch=compute_75,code=sm_75 -gencode arch=compute_80,code=sm_80 -gencode arch=compute_86,code=sm_86 -std=c++14 -c /data/jd_data/miniconda3/envs/opendit/lib/python3.10/site-packages/colossalai/kernel/extensions/csrc/cuda/multi_tensor_sgd_kernel.cu -o multi_tensor_sgd_kernel.cuda.o
FAILED: multi_tensor_sgd_kernel.cuda.o
/data/jd_data/miniconda3/envs/opendit/bin/nvcc -DTORCH_EXTENSION_NAME=fused_optim_cuda -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE="gcc" -DPYBIND11_STDLIB="libstdcpp" -DPYBIND11_BUILD_ABI="cxxabi1011" -I/data/jd_data/miniconda3/envs/opendit/include -isystem /data/jd_data/miniconda3/envs/opendit/lib/python3.10/site-packages/torch/include -isystem /data/jd_data/miniconda3/envs/opendit/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -isystem /data/jd_data/miniconda3/envs/opendit/lib/python3.10/site-packages/torch/include/TH -isystem /data/jd_data/miniconda3/envs/opendit/lib/python3.10/site-packages/torch/include/THC -isystem /data/jd_data/miniconda3/envs/opendit/include -isystem /data/jd_data/miniconda3/envs/opendit/include/python3.10 -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS -D__CUDA_NO_HALF_CONVERSIONS -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_86,code=compute_86 -gencode=arch=compute_86,code=sm_86 --compiler-options '-fPIC' -O3 --use_fast_math -lineinfo -gencode arch=compute_60,code=sm_60 -gencode arch=compute_61,code=sm_61 -gencode arch=compute_70,code=sm_70 -gencode arch=compute_75,code=sm_75 -gencode arch=compute_80,code=sm_80 -gencode arch=compute_86,code=sm_86 -std=c++14 -c /data/jd_data/miniconda3/envs/opendit/lib/python3.10/site-packages/colossalai/kernel/extensions/csrc/cuda/multi_tensor_sgd_kernel.cu -o multi_tensor_sgd_kernel.cuda.o
In file included from /usr/include/cuda_runtime.h:83,
from <command-line>:
/usr/include/crt/host_config.h:138:2: error: #error -- unsupported GNU version! gcc versions later than 8 are not supported!
138 | #error -- unsupported GNU version! gcc versions later than 8 are not supported!
| ^~~~~
[2/7] /data/jd_data/miniconda3/envs/opendit/bin/nvcc -DTORCH_EXTENSION_NAME=fused_optim_cuda -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE="gcc" -DPYBIND11_STDLIB="libstdcpp" -DPYBIND11_BUILD_ABI="cxxabi1011" -I/data/jd_data/miniconda3/envs/opendit/include -isystem /data/jd_data/miniconda3/envs/opendit/lib/python3.10/site-packages/torch/include -isystem /data/jd_data/miniconda3/envs/opendit/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -isystem /data/jd_data/miniconda3/envs/opendit/lib/python3.10/site-packages/torch/include/TH -isystem /data/jd_data/miniconda3/envs/opendit/lib/python3.10/site-packages/torch/include/THC -isystem /data/jd_data/miniconda3/envs/opendit/include -isystem /data/jd_data/miniconda3/envs/opendit/include/python3.10 -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS -D__CUDA_NO_HALF_CONVERSIONS_ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_86,code=compute_86 -gencode=arch=compute_86,code=sm_86 --compiler-options '-fPIC' -O3 --use_fast_math -lineinfo -gencode arch=compute_60,code=sm_60 -gencode arch=compute_61,code=sm_61 -gencode arch=compute_70,code=sm_70 -gencode arch=compute_75,code=sm_75 -gencode arch=compute_80,code=sm_80 -gencode arch=compute_86,code=sm_86 -std=c++14 -c /data/jd_data/miniconda3/envs/opendit/lib/python3.10/site-packages/colossalai/kernel/extensions/csrc/cuda/multi_tensor_adam.cu -o multi_tensor_adam.cuda.o
FAILED: multi_tensor_adam.cuda.o
/data/jd_data/miniconda3/envs/opendit/bin/nvcc -DTORCH_EXTENSION_NAME=fused_optim_cuda -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE="gcc" -DPYBIND11_STDLIB="libstdcpp" -DPYBIND11_BUILD_ABI="cxxabi1011" -I/data/jd_data/miniconda3/envs/opendit/include -isystem /data/jd_data/miniconda3/envs/opendit/lib/python3.10/site-packages/torch/include -isystem /data/jd_data/miniconda3/envs/opendit/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -isystem /data/jd_data/miniconda3/envs/opendit/lib/python3.10/site-packages/torch/include/TH -isystem /data/jd_data/miniconda3/envs/opendit/lib/python3.10/site-packages/torch/include/THC -isystem /data/jd_data/miniconda3/envs/opendit/include -isystem /data/jd_data/miniconda3/envs/opendit/include/python3.10 -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS -D__CUDA_NO_HALF_CONVERSIONS_ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_86,code=compute_86 -gencode=arch=compute_86,code=sm_86 --compiler-options '-fPIC' -O3 --use_fast_math -lineinfo -gencode arch=compute_60,code=sm_60 -gencode arch=compute_61,code=sm_61 -gencode arch=compute_70,code=sm_70 -gencode arch=compute_75,code=sm_75 -gencode arch=compute_80,code=sm_80 -gencode arch=compute_86,code=sm_86 -std=c++14 -c /data/jd_data/miniconda3/envs/opendit/lib/python3.10/site-packages/colossalai/kernel/extensions/csrc/cuda/multi_tensor_adam.cu -o multi_tensor_adam.cuda.o
In file included from /usr/include/cuda_runtime.h:83,
from <command-line>:
/usr/include/crt/host_config.h:138:2: error: #error -- unsupported GNU version! gcc versions later than 8 are not supported!
138 | #error -- unsupported GNU version! gcc versions later than 8 are not supported!
| ^~~~~
[3/7] /data/jd_data/miniconda3/envs/opendit/bin/nvcc -DTORCH_EXTENSION_NAME=fused_optim_cuda -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE="gcc" -DPYBIND11_STDLIB="libstdcpp" -DPYBIND11_BUILD_ABI="cxxabi1011" -I/data/jd_data/miniconda3/envs/opendit/include -isystem /data/jd_data/miniconda3/envs/opendit/lib/python3.10/site-packages/torch/include -isystem /data/jd_data/miniconda3/envs/opendit/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -isystem /data/jd_data/miniconda3/envs/opendit/lib/python3.10/site-packages/torch/include/TH -isystem /data/jd_data/miniconda3/envs/opendit/lib/python3.10/site-packages/torch/include/THC -isystem /data/jd_data/miniconda3/envs/opendit/include -isystem /data/jd_data/miniconda3/envs/opendit/include/python3.10 -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS -D__CUDA_NO_HALF_CONVERSIONS_ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_86,code=compute_86 -gencode=arch=compute_86,code=sm_86 --compiler-options '-fPIC' -O3 --use_fast_math -lineinfo -gencode arch=compute_60,code=sm_60 -gencode arch=compute_61,code=sm_61 -gencode arch=compute_70,code=sm_70 -gencode arch=compute_75,code=sm_75 -gencode arch=compute_80,code=sm_80 -gencode arch=compute_86,code=sm_86 -std=c++14 -c /data/jd_data/miniconda3/envs/opendit/lib/python3.10/site-packages/colossalai/kernel/extensions/csrc/cuda/multi_tensor_scale_kernel.cu -o multi_tensor_scale_kernel.cuda.o
FAILED: multi_tensor_scale_kernel.cuda.o
/data/jd_data/miniconda3/envs/opendit/bin/nvcc -DTORCH_EXTENSION_NAME=fused_optim_cuda -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE="gcc" -DPYBIND11_STDLIB="libstdcpp" -DPYBIND11_BUILD_ABI="cxxabi1011" -I/data/jd_data/miniconda3/envs/opendit/include -isystem /data/jd_data/miniconda3/envs/opendit/lib/python3.10/site-packages/torch/include -isystem /data/jd_data/miniconda3/envs/opendit/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -isystem /data/jd_data/miniconda3/envs/opendit/lib/python3.10/site-packages/torch/include/TH -isystem /data/jd_data/miniconda3/envs/opendit/lib/python3.10/site-packages/torch/include/THC -isystem /data/jd_data/miniconda3/envs/opendit/include -isystem /data/jd_data/miniconda3/envs/opendit/include/python3.10 -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS -D__CUDA_NO_HALF_CONVERSIONS_ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_86,code=compute_86 -gencode=arch=compute_86,code=sm_86 --compiler-options '-fPIC' -O3 --use_fast_math -lineinfo -gencode arch=compute_60,code=sm_60 -gencode arch=compute_61,code=sm_61 -gencode arch=compute_70,code=sm_70 -gencode arch=compute_75,code=sm_75 -gencode arch=compute_80,code=sm_80 -gencode arch=compute_86,code=sm_86 -std=c++14 -c /data/jd_data/miniconda3/envs/opendit/lib/python3.10/site-packages/colossalai/kernel/extensions/csrc/cuda/multi_tensor_scale_kernel.cu -o multi_tensor_scale_kernel.cuda.o
In file included from /usr/include/cuda_runtime.h:83,
from <command-line>:
/usr/include/crt/host_config.h:138:2: error: #error -- unsupported GNU version! gcc versions later than 8 are not supported!
138 | #error -- unsupported GNU version! gcc versions later than 8 are not supported!
| ^~~~~
[4/7] /data/jd_data/miniconda3/envs/opendit/bin/nvcc -DTORCH_EXTENSION_NAME=fused_optim_cuda -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE="gcc" -DPYBIND11_STDLIB="libstdcpp" -DPYBIND11_BUILD_ABI="cxxabi1011" -I/data/jd_data/miniconda3/envs/opendit/include -isystem /data/jd_data/miniconda3/envs/opendit/lib/python3.10/site-packages/torch/include -isystem /data/jd_data/miniconda3/envs/opendit/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -isystem /data/jd_data/miniconda3/envs/opendit/lib/python3.10/site-packages/torch/include/TH -isystem /data/jd_data/miniconda3/envs/opendit/lib/python3.10/site-packages/torch/include/THC -isystem /data/jd_data/miniconda3/envs/opendit/include -isystem /data/jd_data/miniconda3/envs/opendit/include/python3.10 -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS -D__CUDA_NO_HALF_CONVERSIONS_ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_86,code=compute_86 -gencode=arch=compute_86,code=sm_86 --compiler-options '-fPIC' -O3 --use_fast_math -lineinfo -gencode arch=compute_60,code=sm_60 -gencode arch=compute_61,code=sm_61 -gencode arch=compute_70,code=sm_70 -gencode arch=compute_75,code=sm_75 -gencode arch=compute_80,code=sm_80 -gencode arch=compute_86,code=sm_86 -std=c++14 -c /data/jd_data/miniconda3/envs/opendit/lib/python3.10/site-packages/colossalai/kernel/extensions/csrc/cuda/multi_tensor_l2norm_kernel.cu -o multi_tensor_l2norm_kernel.cuda.o
FAILED: multi_tensor_l2norm_kernel.cuda.o
/data/jd_data/miniconda3/envs/opendit/bin/nvcc -DTORCH_EXTENSION_NAME=fused_optim_cuda -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE="gcc" -DPYBIND11_STDLIB="libstdcpp" -DPYBIND11_BUILD_ABI="cxxabi1011" -I/data/jd_data/miniconda3/envs/opendit/include -isystem /data/jd_data/miniconda3/envs/opendit/lib/python3.10/site-packages/torch/include -isystem /data/jd_data/miniconda3/envs/opendit/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -isystem /data/jd_data/miniconda3/envs/opendit/lib/python3.10/site-packages/torch/include/TH -isystem /data/jd_data/miniconda3/envs/opendit/lib/python3.10/site-packages/torch/include/THC -isystem /data/jd_data/miniconda3/envs/opendit/include -isystem /data/jd_data/miniconda3/envs/opendit/include/python3.10 -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS -D__CUDA_NO_HALF_CONVERSIONS_ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_86,code=compute_86 -gencode=arch=compute_86,code=sm_86 --compiler-options '-fPIC' -O3 --use_fast_math -lineinfo -gencode arch=compute_60,code=sm_60 -gencode arch=compute_61,code=sm_61 -gencode arch=compute_70,code=sm_70 -gencode arch=compute_75,code=sm_75 -gencode arch=compute_80,code=sm_80 -gencode arch=compute_86,code=sm_86 -std=c++14 -c /data/jd_data/miniconda3/envs/opendit/lib/python3.10/site-packages/colossalai/kernel/extensions/csrc/cuda/multi_tensor_l2norm_kernel.cu -o multi_tensor_l2norm_kernel.cuda.o
In file included from /usr/include/cuda_runtime.h:83,
from <command-line>:
/usr/include/crt/host_config.h:138:2: error: #error -- unsupported GNU version! gcc versions later than 8 are not supported!
138 | #error -- unsupported GNU version! gcc versions later than 8 are not supported!
| ^~~~~
[5/7] /data/jd_data/miniconda3/envs/opendit/bin/nvcc -DTORCH_EXTENSION_NAME=fused_optim_cuda -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE="gcc" -DPYBIND11_STDLIB="libstdcpp" -DPYBIND11_BUILD_ABI="cxxabi1011" -I/data/jd_data/miniconda3/envs/opendit/include -isystem /data/jd_data/miniconda3/envs/opendit/lib/python3.10/site-packages/torch/include -isystem /data/jd_data/miniconda3/envs/opendit/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -isystem /data/jd_data/miniconda3/envs/opendit/lib/python3.10/site-packages/torch/include/TH -isystem /data/jd_data/miniconda3/envs/opendit/lib/python3.10/site-packages/torch/include/THC -isystem /data/jd_data/miniconda3/envs/opendit/include -isystem /data/jd_data/miniconda3/envs/opendit/include/python3.10 -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS -D__CUDA_NO_HALF_CONVERSIONS_ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_86,code=compute_86 -gencode=arch=compute_86,code=sm_86 --compiler-options '-fPIC' -O3 --use_fast_math -lineinfo -gencode arch=compute_60,code=sm_60 -gencode arch=compute_61,code=sm_61 -gencode arch=compute_70,code=sm_70 -gencode arch=compute_75,code=sm_75 -gencode arch=compute_80,code=sm_80 -gencode arch=compute_86,code=sm_86 -std=c++14 -c /data/jd_data/miniconda3/envs/opendit/lib/python3.10/site-packages/colossalai/kernel/extensions/csrc/cuda/multi_tensor_lamb.cu -o multi_tensor_lamb.cuda.o
FAILED: multi_tensor_lamb.cuda.o
/data/jd_data/miniconda3/envs/opendit/bin/nvcc -DTORCH_EXTENSION_NAME=fused_optim_cuda -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE="gcc" -DPYBIND11_STDLIB="libstdcpp" -DPYBIND11_BUILD_ABI="cxxabi1011" -I/data/jd_data/miniconda3/envs/opendit/include -isystem /data/jd_data/miniconda3/envs/opendit/lib/python3.10/site-packages/torch/include -isystem /data/jd_data/miniconda3/envs/opendit/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -isystem /data/jd_data/miniconda3/envs/opendit/lib/python3.10/site-packages/torch/include/TH -isystem /data/jd_data/miniconda3/envs/opendit/lib/python3.10/site-packages/torch/include/THC -isystem /data/jd_data/miniconda3/envs/opendit/include -isystem /data/jd_data/miniconda3/envs/opendit/include/python3.10 -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS -D__CUDA_NO_HALF_CONVERSIONS_ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_86,code=compute_86 -gencode=arch=compute_86,code=sm_86 --compiler-options '-fPIC' -O3 --use_fast_math -lineinfo -gencode arch=compute_60,code=sm_60 -gencode arch=compute_61,code=sm_61 -gencode arch=compute_70,code=sm_70 -gencode arch=compute_75,code=sm_75 -gencode arch=compute_80,code=sm_80 -gencode arch=compute_86,code=sm_86 -std=c++14 -c /data/jd_data/miniconda3/envs/opendit/lib/python3.10/site-packages/colossalai/kernel/extensions/csrc/cuda/multi_tensor_lamb.cu -o multi_tensor_lamb.cuda.o
In file included from /usr/include/cuda_runtime.h:83,
from <command-line>:
/usr/include/crt/host_config.h:138:2: error: #error -- unsupported GNU version! gcc versions later than 8 are not supported!
138 | #error -- unsupported GNU version! gcc versions later than 8 are not supported!
| ^~~~~
[6/7] c++ -MMD -MF colossal_C_frontend.o.d -DTORCH_EXTENSION_NAME=fused_optim_cuda -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE="_gcc" -DPYBIND11_STDLIB="_libstdcpp" -DPYBIND11_BUILD_ABI="_cxxabi1011" -I/data/jd_data/miniconda3/envs/opendit/include -isystem /data/jd_data/miniconda3/envs/opendit/lib/python3.10/site-packages/torch/include -isystem /data/jd_data/miniconda3/envs/opendit/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -isystem /data/jd_data/miniconda3/envs/opendit/lib/python3.10/site-packages/torch/include/TH -isystem /data/jd_data/miniconda3/envs/opendit/lib/python3.10/site-packages/torch/include/THC -isystem /data/jd_data/miniconda3/envs/opendit/include -isystem /data/jd_data/miniconda3/envs/opendit/include/python3.10 -D_GLIBCXX_USE_CXX11_ABI=0 -fPIC -std=c++14 -O3 -DVERSION_GE_1_1 -DVERSION_GE_1_3 -DVERSION_GE_1_5 -c /data/jd_data/miniconda3/envs/opendit/lib/python3.10/site-packages/colossalai/kernel/extensions/csrc/cuda/colossal_C_frontend.cpp -o colossal_C_frontend.o
ninja: build stopped: subcommand failed.
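The line that actually stops the build above is the `#error` from `/usr/include/crt/host_config.h`: the host gcc is newer than the CUDA headers in use accept. Note also that `nvcc` is invoked from the conda environment while the header is picked up from `/usr/include`. Here is a small sketch for checking this from a saved log, assuming the gcc version string comes from `gcc -dumpversion`; the helper names are made up for illustration.

```python
import re

# Matches the message emitted by CUDA's host_config.h in the log above.
_LIMIT_RE = re.compile(r"gcc versions later than (\d+) are not supported")


def max_supported_gcc(build_log):
    """Return the highest gcc major version the log says is accepted, or None."""
    match = _LIMIT_RE.search(build_log)
    return int(match.group(1)) if match else None


def gcc_major(dumpversion_output):
    """Parse the output of `gcc -dumpversion`, e.g. "9.4.0" or "11"."""
    return int(dumpversion_output.strip().split(".")[0])


def host_compiler_too_new(build_log, dumpversion_output):
    """True when the host gcc is newer than the limit reported in the log."""
    limit = max_supported_gcc(build_log)
    return limit is not None and gcc_major(dumpversion_output) > limit
```

For example, `host_compiler_too_new(log_text, "9.4.0")` returns `True` for the log above, since the log reports a limit of 8 (the `9.4.0` is only an example value, not taken from this thread).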
[extension] Time taken to compile cpu_adam_x86 op: 38.66856265068054 seconds
[extension] Compiling the JIT fused_optim_cuda kernel during runtime now
Traceback (most recent call last):
File "/data/jd_data/miniconda3/envs/opendit/lib/python3.10/site-packages/colossalai/kernel/extensions/cpp_extension.py", line 128, in load
op_kernel = self.import_op()
File "/data/jd_data/miniconda3/envs/opendit/lib/python3.10/site-packages/colossalai/kernel/extensions/cpp_extension.py", line 58, in import_op
return importlib.import_module(self.prebuilt_import_path)
File "/data/jd_data/miniconda3/envs/opendit/lib/python3.10/importlib/__init__.py", line 126, in import_module
return _bootstrap._gcd_import(name[level:], package, level)
File "<frozen importlib._bootstrap>", line 1050, in _gcd_import
File "<frozen importlib._bootstrap>", line 1027, in _find_and_load
File "<frozen importlib._bootstrap>", line 1004, in _find_and_load_unlocked
ModuleNotFoundError: No module named 'colossalai._C.fused_optim_cuda'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/data/jd_data/miniconda3/envs/opendit/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 1900, in _run_ninja_build
subprocess.run(
File "/data/jd_data/miniconda3/envs/opendit/lib/python3.10/subprocess.py", line 526, in run
raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['ninja', '-v']' returned non-zero exit status 1.
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/data/jd_data/ColossalAI/OpenDiT/train.py", line 411, in <module>
main(args)
File "/data/jd_data/ColossalAI/OpenDiT/train.py", line 206, in main
optimizer = HybridAdam(
File "/data/jd_data/miniconda3/envs/opendit/lib/python3.10/site-packages/colossalai/nn/optimizer/hybrid_adam.py", line 88, in __init__
fused_optim = FusedOptimizerLoader().load()
File "/data/jd_data/miniconda3/envs/opendit/lib/python3.10/site-packages/colossalai/kernel/kernel_loader.py", line 81, in load
return usable_exts[0].load()
File "/data/jd_data/miniconda3/envs/opendit/lib/python3.10/site-packages/colossalai/kernel/extensions/cpp_extension.py", line 132, in load
op_kernel = self.build_jit()
File "/data/jd_data/miniconda3/envs/opendit/lib/python3.10/site-packages/colossalai/kernel/extensions/cuda_extension.py", line 79, in build_jit
op_kernel = load(
File "/data/jd_data/miniconda3/envs/opendit/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 1284, in load
return _jit_compile(
File "/data/jd_data/miniconda3/envs/opendit/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 1508, in _jit_compile
_write_ninja_file_and_build_library(
File "/data/jd_data/miniconda3/envs/opendit/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 1623, in _write_ninja_file_and_build_library
_run_ninja_build(
File "/data/jd_data/miniconda3/envs/opendit/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 1916, in _run_ninja_build
raise RuntimeError(message) from e
RuntimeError: Error building extension 'fused_optim_cuda': [1/6] /data/jd_data/miniconda3/envs/opendit/bin/nvcc -DTORCH_EXTENSION_NAME=fused_optim_cuda -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE="gcc" -DPYBIND11_STDLIB="libstdcpp" -DPYBIND11_BUILD_ABI="cxxabi1011" -I/data/jd_data/miniconda3/envs/opendit/include -isystem /data/jd_data/miniconda3/envs/opendit/lib/python3.10/site-packages/torch/include -isystem /data/jd_data/miniconda3/envs/opendit/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -isystem /data/jd_data/miniconda3/envs/opendit/lib/python3.10/site-packages/torch/include/TH -isystem /data/jd_data/miniconda3/envs/opendit/lib/python3.10/site-packages/torch/include/THC -isystem /data/jd_data/miniconda3/envs/opendit/include -isystem /data/jd_data/miniconda3/envs/opendit/include/python3.10 -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS -D__CUDA_NO_HALF_CONVERSIONS -D__CUDA_NO_BFLOAT16_CONVERSIONS -D__CUDA_NO_HALF2_OPERATORS --expt-relaxed-constexpr -gencode=arch=compute_86,code=compute_86 -gencode=arch=compute_86,code=sm_86 --compiler-options '-fPIC' -O3 --use_fast_math -lineinfo -gencode arch=compute_60,code=sm_60 -gencode arch=compute_61,code=sm_61 -gencode arch=compute_70,code=sm_70 -gencode arch=compute_75,code=sm_75 -gencode arch=compute_80,code=sm_80 -gencode arch=compute_86,code=sm_86 -std=c++14 -c /data/jd_data/miniconda3/envs/opendit/lib/python3.10/site-packages/colossalai/kernel/extensions/csrc/cuda/multi_tensor_lamb.cu -o multi_tensor_lamb.cuda.o
FAILED: multi_tensor_lamb.cuda.o
/data/jd_data/miniconda3/envs/opendit/bin/nvcc -DTORCH_EXTENSION_NAME=fused_optim_cuda -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE="gcc" -DPYBIND11_STDLIB="libstdcpp" -DPYBIND11_BUILD_ABI="cxxabi1011" -I/data/jd_data/miniconda3/envs/opendit/include -isystem /data/jd_data/miniconda3/envs/opendit/lib/python3.10/site-packages/torch/include -isystem /data/jd_data/miniconda3/envs/opendit/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -isystem /data/jd_data/miniconda3/envs/opendit/lib/python3.10/site-packages/torch/include/TH -isystem /data/jd_data/miniconda3/envs/opendit/lib/python3.10/site-packages/torch/include/THC -isystem /data/jd_data/miniconda3/envs/opendit/include -isystem /data/jd_data/miniconda3/envs/opendit/include/python3.10 -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS -D__CUDA_NO_HALF_CONVERSIONS -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_86,code=compute_86 -gencode=arch=compute_86,code=sm_86 --compiler-options '-fPIC' -O3 --use_fast_math -lineinfo -gencode arch=compute_60,code=sm_60 -gencode arch=compute_61,code=sm_61 -gencode arch=compute_70,code=sm_70 -gencode arch=compute_75,code=sm_75 -gencode arch=compute_80,code=sm_80 -gencode arch=compute_86,code=sm_86 -std=c++14 -c /data/jd_data/miniconda3/envs/opendit/lib/python3.10/site-packages/colossalai/kernel/extensions/csrc/cuda/multi_tensor_lamb.cu -o multi_tensor_lamb.cuda.o
In file included from /usr/include/cuda_runtime.h:83,
from <command-line>:
/usr/include/crt/host_config.h:138:2: error: #error -- unsupported GNU version! gcc versions later than 8 are not supported!
138 | #error -- unsupported GNU version! gcc versions later than 8 are not supported!
| ^~~~~
[2/6] /data/jd_data/miniconda3/envs/opendit/bin/nvcc -DTORCH_EXTENSION_NAME=fused_optim_cuda -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE="gcc" -DPYBIND11_STDLIB="libstdcpp" -DPYBIND11_BUILD_ABI="cxxabi1011" -I/data/jd_data/miniconda3/envs/opendit/include -isystem /data/jd_data/miniconda3/envs/opendit/lib/python3.10/site-packages/torch/include -isystem /data/jd_data/miniconda3/envs/opendit/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -isystem /data/jd_data/miniconda3/envs/opendit/lib/python3.10/site-packages/torch/include/TH -isystem /data/jd_data/miniconda3/envs/opendit/lib/python3.10/site-packages/torch/include/THC -isystem /data/jd_data/miniconda3/envs/opendit/include -isystem /data/jd_data/miniconda3/envs/opendit/include/python3.10 -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS -D__CUDA_NO_HALF_CONVERSIONS_ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_86,code=compute_86 -gencode=arch=compute_86,code=sm_86 --compiler-options '-fPIC' -O3 --use_fast_math -lineinfo -gencode arch=compute_60,code=sm_60 -gencode arch=compute_61,code=sm_61 -gencode arch=compute_70,code=sm_70 -gencode arch=compute_75,code=sm_75 -gencode arch=compute_80,code=sm_80 -gencode arch=compute_86,code=sm_86 -std=c++14 -c /data/jd_data/miniconda3/envs/opendit/lib/python3.10/site-packages/colossalai/kernel/extensions/csrc/cuda/multi_tensor_l2norm_kernel.cu -o multi_tensor_l2norm_kernel.cuda.o
FAILED: multi_tensor_l2norm_kernel.cuda.o
/data/jd_data/miniconda3/envs/opendit/bin/nvcc -DTORCH_EXTENSION_NAME=fused_optim_cuda -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE="gcc" -DPYBIND11_STDLIB="libstdcpp" -DPYBIND11_BUILD_ABI="cxxabi1011" -I/data/jd_data/miniconda3/envs/opendit/include -isystem /data/jd_data/miniconda3/envs/opendit/lib/python3.10/site-packages/torch/include -isystem /data/jd_data/miniconda3/envs/opendit/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -isystem /data/jd_data/miniconda3/envs/opendit/lib/python3.10/site-packages/torch/include/TH -isystem /data/jd_data/miniconda3/envs/opendit/lib/python3.10/site-packages/torch/include/THC -isystem /data/jd_data/miniconda3/envs/opendit/include -isystem /data/jd_data/miniconda3/envs/opendit/include/python3.10 -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS -D__CUDA_NO_HALF_CONVERSIONS_ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_86,code=compute_86 -gencode=arch=compute_86,code=sm_86 --compiler-options '-fPIC' -O3 --use_fast_math -lineinfo -gencode arch=compute_60,code=sm_60 -gencode arch=compute_61,code=sm_61 -gencode arch=compute_70,code=sm_70 -gencode arch=compute_75,code=sm_75 -gencode arch=compute_80,code=sm_80 -gencode arch=compute_86,code=sm_86 -std=c++14 -c /data/jd_data/miniconda3/envs/opendit/lib/python3.10/site-packages/colossalai/kernel/extensions/csrc/cuda/multi_tensor_l2norm_kernel.cu -o multi_tensor_l2norm_kernel.cuda.o
In file included from /usr/include/cuda_runtime.h:83,
from <command-line>:
/usr/include/crt/host_config.h:138:2: error: #error -- unsupported GNU version! gcc versions later than 8 are not supported!
138 | #error -- unsupported GNU version! gcc versions later than 8 are not supported!
| ^~~~~
[3/6] /data/jd_data/miniconda3/envs/opendit/bin/nvcc -DTORCH_EXTENSION_NAME=fused_optim_cuda -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE="gcc" -DPYBIND11_STDLIB="libstdcpp" -DPYBIND11_BUILD_ABI="cxxabi1011" -I/data/jd_data/miniconda3/envs/opendit/include -isystem /data/jd_data/miniconda3/envs/opendit/lib/python3.10/site-packages/torch/include -isystem /data/jd_data/miniconda3/envs/opendit/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -isystem /data/jd_data/miniconda3/envs/opendit/lib/python3.10/site-packages/torch/include/TH -isystem /data/jd_data/miniconda3/envs/opendit/lib/python3.10/site-packages/torch/include/THC -isystem /data/jd_data/miniconda3/envs/opendit/include -isystem /data/jd_data/miniconda3/envs/opendit/include/python3.10 -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS -D__CUDA_NO_HALF_CONVERSIONS_ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_86,code=compute_86 -gencode=arch=compute_86,code=sm_86 --compiler-options '-fPIC' -O3 --use_fast_math -lineinfo -gencode arch=compute_60,code=sm_60 -gencode arch=compute_61,code=sm_61 -gencode arch=compute_70,code=sm_70 -gencode arch=compute_75,code=sm_75 -gencode arch=compute_80,code=sm_80 -gencode arch=compute_86,code=sm_86 -std=c++14 -c /data/jd_data/miniconda3/envs/opendit/lib/python3.10/site-packages/colossalai/kernel/extensions/csrc/cuda/multi_tensor_sgd_kernel.cu -o multi_tensor_sgd_kernel.cuda.o
FAILED: multi_tensor_sgd_kernel.cuda.o
/data/jd_data/miniconda3/envs/opendit/bin/nvcc -DTORCH_EXTENSION_NAME=fused_optim_cuda -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE="gcc" -DPYBIND11_STDLIB="libstdcpp" -DPYBIND11_BUILD_ABI="cxxabi1011" -I/data/jd_data/miniconda3/envs/opendit/include -isystem /data/jd_data/miniconda3/envs/opendit/lib/python3.10/site-packages/torch/include -isystem /data/jd_data/miniconda3/envs/opendit/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -isystem /data/jd_data/miniconda3/envs/opendit/lib/python3.10/site-packages/torch/include/TH -isystem /data/jd_data/miniconda3/envs/opendit/lib/python3.10/site-packages/torch/include/THC -isystem /data/jd_data/miniconda3/envs/opendit/include -isystem /data/jd_data/miniconda3/envs/opendit/include/python3.10 -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS -D__CUDA_NO_HALF_CONVERSIONS_ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_86,code=compute_86 -gencode=arch=compute_86,code=sm_86 --compiler-options '-fPIC' -O3 --use_fast_math -lineinfo -gencode arch=compute_60,code=sm_60 -gencode arch=compute_61,code=sm_61 -gencode arch=compute_70,code=sm_70 -gencode arch=compute_75,code=sm_75 -gencode arch=compute_80,code=sm_80 -gencode arch=compute_86,code=sm_86 -std=c++14 -c /data/jd_data/miniconda3/envs/opendit/lib/python3.10/site-packages/colossalai/kernel/extensions/csrc/cuda/multi_tensor_sgd_kernel.cu -o multi_tensor_sgd_kernel.cuda.o
In file included from /usr/include/cuda_runtime.h:83,
from <command-line>:
/usr/include/crt/host_config.h:138:2: error: #error -- unsupported GNU version! gcc versions later than 8 are not supported!
138 | #error -- unsupported GNU version! gcc versions later than 8 are not supported!
| ^~~~~
[4/6] /data/jd_data/miniconda3/envs/opendit/bin/nvcc -DTORCH_EXTENSION_NAME=fused_optim_cuda -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE="gcc" -DPYBIND11_STDLIB="libstdcpp" -DPYBIND11_BUILD_ABI="cxxabi1011" -I/data/jd_data/miniconda3/envs/opendit/include -isystem /data/jd_data/miniconda3/envs/opendit/lib/python3.10/site-packages/torch/include -isystem /data/jd_data/miniconda3/envs/opendit/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -isystem /data/jd_data/miniconda3/envs/opendit/lib/python3.10/site-packages/torch/include/TH -isystem /data/jd_data/miniconda3/envs/opendit/lib/python3.10/site-packages/torch/include/THC -isystem /data/jd_data/miniconda3/envs/opendit/include -isystem /data/jd_data/miniconda3/envs/opendit/include/python3.10 -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS -D__CUDA_NO_HALF_CONVERSIONS_ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_86,code=compute_86 -gencode=arch=compute_86,code=sm_86 --compiler-options '-fPIC' -O3 --use_fast_math -lineinfo -gencode arch=compute_60,code=sm_60 -gencode arch=compute_61,code=sm_61 -gencode arch=compute_70,code=sm_70 -gencode arch=compute_75,code=sm_75 -gencode arch=compute_80,code=sm_80 -gencode arch=compute_86,code=sm_86 -std=c++14 -c /data/jd_data/miniconda3/envs/opendit/lib/python3.10/site-packages/colossalai/kernel/extensions/csrc/cuda/multi_tensor_adam.cu -o multi_tensor_adam.cuda.o
FAILED: multi_tensor_adam.cuda.o
/data/jd_data/miniconda3/envs/opendit/bin/nvcc -DTORCH_EXTENSION_NAME=fused_optim_cuda -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE="gcc" -DPYBIND11_STDLIB="libstdcpp" -DPYBIND11_BUILD_ABI="cxxabi1011" -I/data/jd_data/miniconda3/envs/opendit/include -isystem /data/jd_data/miniconda3/envs/opendit/lib/python3.10/site-packages/torch/include -isystem /data/jd_data/miniconda3/envs/opendit/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -isystem /data/jd_data/miniconda3/envs/opendit/lib/python3.10/site-packages/torch/include/TH -isystem /data/jd_data/miniconda3/envs/opendit/lib/python3.10/site-packages/torch/include/THC -isystem /data/jd_data/miniconda3/envs/opendit/include -isystem /data/jd_data/miniconda3/envs/opendit/include/python3.10 -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS -D__CUDA_NO_HALF_CONVERSIONS_ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_86,code=compute_86 -gencode=arch=compute_86,code=sm_86 --compiler-options '-fPIC' -O3 --use_fast_math -lineinfo -gencode arch=compute_60,code=sm_60 -gencode arch=compute_61,code=sm_61 -gencode arch=compute_70,code=sm_70 -gencode arch=compute_75,code=sm_75 -gencode arch=compute_80,code=sm_80 -gencode arch=compute_86,code=sm_86 -std=c++14 -c /data/jd_data/miniconda3/envs/opendit/lib/python3.10/site-packages/colossalai/kernel/extensions/csrc/cuda/multi_tensor_adam.cu -o multi_tensor_adam.cuda.o
In file included from /usr/include/cuda_runtime.h:83,
from <command-line>:
/usr/include/crt/host_config.h:138:2: error: #error -- unsupported GNU version! gcc versions later than 8 are not supported!
138 | #error -- unsupported GNU version! gcc versions later than 8 are not supported!
| ^~~~~
[5/6] /data/jd_data/miniconda3/envs/opendit/bin/nvcc -DTORCH_EXTENSION_NAME=fused_optim_cuda -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE="gcc" -DPYBIND11_STDLIB="libstdcpp" -DPYBIND11_BUILD_ABI="cxxabi1011" -I/data/jd_data/miniconda3/envs/opendit/include -isystem /data/jd_data/miniconda3/envs/opendit/lib/python3.10/site-packages/torch/include -isystem /data/jd_data/miniconda3/envs/opendit/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -isystem /data/jd_data/miniconda3/envs/opendit/lib/python3.10/site-packages/torch/include/TH -isystem /data/jd_data/miniconda3/envs/opendit/lib/python3.10/site-packages/torch/include/THC -isystem /data/jd_data/miniconda3/envs/opendit/include -isystem /data/jd_data/miniconda3/envs/opendit/include/python3.10 -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS -D__CUDA_NO_HALF_CONVERSIONS_ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_86,code=compute_86 -gencode=arch=compute_86,code=sm_86 --compiler-options '-fPIC' -O3 --use_fast_math -lineinfo -gencode arch=compute_60,code=sm_60 -gencode arch=compute_61,code=sm_61 -gencode arch=compute_70,code=sm_70 -gencode arch=compute_75,code=sm_75 -gencode arch=compute_80,code=sm_80 -gencode arch=compute_86,code=sm_86 -std=c++14 -c /data/jd_data/miniconda3/envs/opendit/lib/python3.10/site-packages/colossalai/kernel/extensions/csrc/cuda/multi_tensor_scale_kernel.cu -o multi_tensor_scale_kernel.cuda.o
FAILED: multi_tensor_scale_kernel.cuda.o
/data/jd_data/miniconda3/envs/opendit/bin/nvcc -DTORCH_EXTENSION_NAME=fused_optim_cuda -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE="gcc" -DPYBIND11_STDLIB="libstdcpp" -DPYBIND11_BUILD_ABI="cxxabi1011" -I/data/jd_data/miniconda3/envs/opendit/include -isystem /data/jd_data/miniconda3/envs/opendit/lib/python3.10/site-packages/torch/include -isystem /data/jd_data/miniconda3/envs/opendit/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -isystem /data/jd_data/miniconda3/envs/opendit/lib/python3.10/site-packages/torch/include/TH -isystem /data/jd_data/miniconda3/envs/opendit/lib/python3.10/site-packages/torch/include/THC -isystem /data/jd_data/miniconda3/envs/opendit/include -isystem /data/jd_data/miniconda3/envs/opendit/include/python3.10 -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS -D__CUDA_NO_HALF_CONVERSIONS_ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_86,code=compute_86 -gencode=arch=compute_86,code=sm_86 --compiler-options '-fPIC' -O3 --use_fast_math -lineinfo -gencode arch=compute_60,code=sm_60 -gencode arch=compute_61,code=sm_61 -gencode arch=compute_70,code=sm_70 -gencode arch=compute_75,code=sm_75 -gencode arch=compute_80,code=sm_80 -gencode arch=compute_86,code=sm_86 -std=c++14 -c /data/jd_data/miniconda3/envs/opendit/lib/python3.10/site-packages/colossalai/kernel/extensions/csrc/cuda/multi_tensor_scale_kernel.cu -o multi_tensor_scale_kernel.cuda.o
In file included from /usr/include/cuda_runtime.h:83,
from <command-line>:
/usr/include/crt/host_config.h:138:2: error: #error -- unsupported GNU version! gcc versions later than 8 are not supported!
138 | #error -- unsupported GNU version! gcc versions later than 8 are not supported!
| ^~~~~
ninja: build stopped: subcommand failed.
ERROR:torch.distributed.elastic.multiprocessing.api:failed (exitcode: 1) local_rank: 0 (pid: 1162234) of binary: /data/jd_data/miniconda3/envs/opendit/bin/python
Traceback (most recent call last):
File "/data/jd_data/miniconda3/envs/opendit/bin/torchrun", line 33, in <module>
sys.exit(load_entry_point('torch==1.13.1', 'console_scripts', 'torchrun')())
File "/data/jd_data/miniconda3/envs/opendit/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/errors/__init__.py", line 346, in wrapper
return f(*args, **kwargs)
File "/data/jd_data/miniconda3/envs/opendit/lib/python3.10/site-packages/torch/distributed/run.py", line 762, in main
run(args)
File "/data/jd_data/miniconda3/envs/opendit/lib/python3.10/site-packages/torch/distributed/run.py", line 753, in run
elastic_launch(
File "/data/jd_data/miniconda3/envs/opendit/lib/python3.10/site-packages/torch/distributed/launcher/api.py", line 132, in __call__
return launch_agent(self._config, self._entrypoint, list(args))
File "/data/jd_data/miniconda3/envs/opendit/lib/python3.10/site-packages/torch/distributed/launcher/api.py", line 246, in launch_agent
raise ChildFailedError(
torch.distributed.elastic.multiprocessing.errors.ChildFailedError:
train.py FAILED
Failures:
[1]:
time : 2024-03-18_15:28:37
host : zju-ESC8000A-E11
rank : 1 (local_rank: 1)
exitcode : 1 (pid: 1162235)
error_file: <N/A>
traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html
Root Cause (first observed failure):
[0]:
time : 2024-03-18_15:28:37
host : zju-ESC8000A-E11
rank : 0 (local_rank: 0)
exitcode : 1 (pid: 1162234)
error_file: <N/A>
traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html
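
Every failed nvcc step in the log above stops at the same check in /usr/include/crt/host_config.h, which rejects gcc versions later than 8. Note that this header sits under /usr/include while nvcc itself runs from the conda environment. Below is a minimal sketch for checking the host compiler before the extension build is triggered. It assumes the host compiler is whatever `gcc` resolves to on PATH; `MAX_GCC_MAJOR` is read off the error message, and the helper names are hypothetical, not part of ColossalAI or PyTorch.

```python
import re
import shutil
import subprocess

# Limit taken from the error text in the log:
# "gcc versions later than 8 are not supported!"
MAX_GCC_MAJOR = 8


def parse_gcc_major(dumpversion_output: str) -> int:
    """Extract the major version from `gcc -dumpversion` output such as '9.4.0' or '11'."""
    match = re.match(r"\s*(\d+)", dumpversion_output)
    if match is None:
        raise ValueError(f"unrecognized gcc version string: {dumpversion_output!r}")
    return int(match.group(1))


def host_compiler_ok(major: int, max_major: int = MAX_GCC_MAJOR) -> bool:
    """Return True if the gcc major version is within the assumed limit."""
    return major <= max_major


def find_gcc_major(compiler: str = "gcc"):
    """Return the major version of `compiler`, or None if it is not on PATH."""
    path = shutil.which(compiler)
    if path is None:
        return None
    result = subprocess.run(
        [path, "-dumpversion"], capture_output=True, text=True, check=True
    )
    return parse_gcc_major(result.stdout)


if __name__ == "__main__":
    major = find_gcc_major()
    if major is None:
        print("gcc not found on PATH")
    elif host_compiler_ok(major):
        print(f"gcc {major} is within the limit of {MAX_GCC_MAJOR}")
    else:
        print(f"gcc {major} is newer than {MAX_GCC_MAJOR}; the CUDA headers in the log would reject it")
```

If the check reports a newer gcc, the build would presumably need either a gcc version that these CUDA headers accept or CUDA headers that match the nvcc being invoked. Which of the two applies on this machine is not something the log alone settles.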