[BUG]: running CUDA_EXT=1 pip install . fails with error: [Errno 2] No such file or directory: 'which': 'which'
Describe the bug
`Building wheels for collected packages: colossalai
Building wheel for colossalai (setup.py) ... error
error: subprocess-exited-with-error
× python setup.py bdist_wheel did not run successfully.
│ exit code: 1
╰─> [30 lines of output]
torch.__version__ = 1.13.1+cu117
Compiling cuda extensions with
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2021 NVIDIA Corporation
Built on Mon_May__3_19:15:13_PDT_2021
Cuda compilation tools, release 11.3, V11.3.109
Build cuda_11.3.r11.3/compiler.29920130_0
from /usr/local/cuda/bin
Warning: Cuda extensions are being compiled with a version of Cuda that does not match the version used to compile Pytorch binaries. Pytorch binaries were compiled with Cuda 11.7.
In some cases, a minor-version mismatch will not cause later errors: https://github.com/NVIDIA/apex/pull/323#discussion_r287021798.
===== Building Extension cpu_adam =====
===== Building Extension fused_optim =====
===== Building Extension moe =====
===== Building Extension multi_head_attn =====
===== Building Extension scaled_masked_softmax =====
===== Building Extension scaled_upper_triangle_masked_softmax =====
===== Building Extension layernorm =====
running bdist_wheel
running build
running build_py
copying colossalai/version.py -> build/lib.linux-x86_64-3.7/colossalai
running build_ext
error: [Errno 2] No such file or directory: 'which': 'which'
[end of output]
note: This error originates from a subprocess, and is likely not a problem with pip.
ERROR: Failed building wheel for colossalai
Running setup.py clean for colossalai
Failed to build colossalai
Installing collected packages: colossalai
Running setup.py install for colossalai ... error
error: subprocess-exited-with-error
× Running setup.py install for colossalai did not run successfully.
│ exit code: 1
╰─> [780 lines of output]
torch.__version__ = 1.13.1+cu117
Compiling cuda extensions with
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2021 NVIDIA Corporation
Built on Mon_May__3_19:15:13_PDT_2021
Cuda compilation tools, release 11.3, V11.3.109
Build cuda_11.3.r11.3/compiler.29920130_0
from /usr/local/cuda/bin
Warning: Cuda extensions are being compiled with a version of Cuda that does not match the version used to compile Pytorch binaries. Pytorch binaries were compiled with Cuda 11.7.
In some cases, a minor-version mismatch will not cause later errors: https://github.com/NVIDIA/apex/pull/323#discussion_r287021798.
===== Building Extension cpu_adam =====
===== Building Extension fused_optim =====
===== Building Extension moe =====
===== Building Extension multi_head_attn =====
===== Building Extension scaled_masked_softmax =====
===== Building Extension scaled_upper_triangle_masked_softmax =====
===== Building Extension layernorm =====
running install
running build
running build_py
creating build/lib.linux-x86_64-3.7
creating build/lib.linux-x86_64-3.7/colossalai
copying colossalai/__init__.py -> build/lib.linux-x86_64-3.7/colossalai
copying colossalai/constants.py -> build/lib.linux-x86_64-3.7/colossalai
copying colossalai/core.py -> build/lib.linux-x86_64-3.7/colossalai
copying colossalai/global_variables.py -> build/lib.linux-x86_64-3.7/colossalai
copying colossalai/initialize.py -> build/lib.linux-x86_64-3.7/colossalai
copying colossalai/version.py -> build/lib.linux-x86_64-3.7/colossalai
creating build/lib.linux-x86_64-3.7/op_builder
copying op_builder/__init__.py -> build/lib.linux-x86_64-3.7/op_builder
copying op_builder/builder.py -> build/lib.linux-x86_64-3.7/op_builder
copying op_builder/cpu_adam.py -> build/lib.linux-x86_64-3.7/op_builder
copying op_builder/fused_optim.py -> build/lib.linux-x86_64-3.7/op_builder
copying op_builder/layernorm.py -> build/lib.linux-x86_64-3.7/op_builder
copying op_builder/moe.py -> build/lib.linux-x86_64-3.7/op_builder
copying op_builder/multi_head_attn.py -> build/lib.linux-x86_64-3.7/op_builder
copying op_builder/scaled_masked_softmax.py -> build/lib.linux-x86_64-3.7/op_builder
copying op_builder/scaled_upper_triangle_masked_softmax.py -> build/lib.linux-x86_64-3.7/op_builder
copying op_builder/utils.py -> build/lib.linux-x86_64-3.7/op_builder
creating build/lib.linux-x86_64-3.7/colossalai/_C
copying colossalai/_C/__init__.py -> build/lib.linux-x86_64-3.7/colossalai/_C
creating build/lib.linux-x86_64-3.7/colossalai/amp
copying colossalai/amp/__init__.py -> build/lib.linux-x86_64-3.7/colossalai/amp
copying colossalai/amp/amp_type.py -> build/lib.linux-x86_64-3.7/colossalai/amp
creating build/lib.linux-x86_64-3.7/colossalai/auto_parallel
copying colossalai/auto_parallel/__init__.py -> build/lib.linux-x86_64-3.7/colossalai/auto_parallel
copying tests/test_auto_parallel/test_tensor_shard/test_node_handler/test_view_handler.py -> build/lib.linux-x86_64-3.7/tests/test_auto_parallel/test_tensor_shard/test_node_handler
.......
copying tests/test_auto_parallel/test_tensor_shard/test_node_handler/test_where_handler.py -> build/lib.linux-x86_64-3.7/tests/test_auto_parallel/test_tensor_shard/test_node_handler
copying tests/test_auto_parallel/test_tensor_shard/test_node_handler/utils.py -> build/lib.linux-x86_64-3.7/tests/test_auto_parallel/test_tensor_shard/test_node_handler
creating build/lib.linux-x86_64-3.7/colossalai/kernel/cuda_native/csrc
copying colossalai/kernel/cuda_native/csrc/colossal_C_frontend.cpp -> build/lib.linux-x86_64-3.7/colossalai/kernel/cuda_native/csrc
copying colossalai/kernel/cuda_native/csrc/compat.h -> build/lib.linux-x86_64-3.7/colossalai/kernel/cuda_native/csrc
copying colossalai/kernel/cuda_native/csrc/cpu_adam.cpp -> build/lib.linux-x86_64-3.7/colossalai/kernel/cuda_native/csrc
copying colossalai/kernel/cuda_native/csrc/cpu_adam.h -> build/lib.linux-x86_64-3.7/colossalai/kernel/cuda_native/csrc
copying colossalai/kernel/cuda_native/csrc/layer_norm_cuda.cpp -> build/lib.linux-x86_64-3.7/colossalai/kernel/cuda_native/csrc
copying colossalai/kernel/cuda_native/csrc/layer_norm_cuda_kernel.cu -> build/lib.linux-x86_64-3.7/colossalai/kernel/cuda_native/csrc
copying colossalai/kernel/cuda_native/csrc/moe_cuda.cpp -> build/lib.linux-x86_64-3.7/colossalai/kernel/cuda_native/csrc
copying colossalai/kernel/cuda_native/csrc/moe_cuda_kernel.cu -> build/lib.linux-x86_64-3.7/colossalai/kernel/cuda_native/csrc
copying colossalai/kernel/cuda_native/csrc/multi_tensor_adam.cu -> build/lib.linux-x86_64-3.7/colossalai/kernel/cuda_native/csrc
copying colossalai/kernel/cuda_native/csrc/multi_tensor_apply.cuh -> build/lib.linux-x86_64-3.7/colossalai/kernel/cuda_native/csrc
copying colossalai/kernel/cuda_native/csrc/multi_tensor_l2norm_kernel.cu -> build/lib.linux-x86_64-3.7/colossalai/kernel/cuda_native/csrc
copying colossalai/kernel/cuda_native/csrc/multi_tensor_lamb.cu -> build/lib.linux-x86_64-3.7/colossalai/kernel/cuda_native/csrc
copying colossalai/kernel/cuda_native/csrc/multi_tensor_scale_kernel.cu -> build/lib.linux-x86_64-3.7/colossalai/kernel/cuda_native/csrc
copying colossalai/kernel/cuda_native/csrc/multi_tensor_sgd_kernel.cu -> build/lib.linux-x86_64-3.7/colossalai/kernel/cuda_native/csrc
copying colossalai/kernel/cuda_native/csrc/multihead_attention_1d.cpp -> build/lib.linux-x86_64-3.7/colossalai/kernel/cuda_native/csrc
copying colossalai/kernel/cuda_native/csrc/multihead_attention_1d.h -> build/lib.linux-x86_64-3.7/colossalai/kernel/cuda_native/csrc
copying colossalai/kernel/cuda_native/csrc/scaled_masked_softmax.cpp -> build/lib.linux-x86_64-3.7/colossalai/kernel/cuda_native/csrc
copying colossalai/kernel/cuda_native/csrc/scaled_masked_softmax.h -> build/lib.linux-x86_64-3.7/colossalai/kernel/cuda_native/csrc
copying colossalai/kernel/cuda_native/csrc/scaled_masked_softmax_cuda.cu -> build/lib.linux-x86_64-3.7/colossalai/kernel/cuda_native/csrc
copying colossalai/kernel/cuda_native/csrc/scaled_upper_triang_masked_softmax.cpp -> build/lib.linux-x86_64-3.7/colossalai/kernel/cuda_native/csrc
copying colossalai/kernel/cuda_native/csrc/scaled_upper_triang_masked_softmax.h -> build/lib.linux-x86_64-3.7/colossalai/kernel/cuda_native/csrc
copying colossalai/kernel/cuda_native/csrc/scaled_upper_triang_masked_softmax_cuda.cu -> build/lib.linux-x86_64-3.7/colossalai/kernel/cuda_native/csrc
copying colossalai/kernel/cuda_native/csrc/type_shim.h -> build/lib.linux-x86_64-3.7/colossalai/kernel/cuda_native/csrc
creating build/lib.linux-x86_64-3.7/colossalai/kernel/cuda_native/csrc/kernels
creating build/lib.linux-x86_64-3.7/colossalai/kernel/cuda_native/csrc/kernels/include
copying colossalai/kernel/cuda_native/csrc/kernels/include/block_reduce.h -> build/lib.linux-x86_64-3.7/colossalai/kernel/cuda_native/csrc/kernels/include
copying colossalai/kernel/cuda_native/csrc/kernels/include/context.h -> build/lib.linux-x86_64-3.7/colossalai/kernel/cuda_native/csrc/kernels/include
copying colossalai/kernel/cuda_native/csrc/kernels/include/cross_entropy_layer.h -> build/lib.linux-x86_64-3.7/colossalai/kernel/cuda_native/csrc/kernels/include
copying colossalai/kernel/cuda_native/csrc/kernels/include/cublas_wrappers.h -> build/lib.linux-x86_64-3.7/colossalai/kernel/cuda_native/csrc/kernels/include
copying colossalai/kernel/cuda_native/csrc/kernels/include/cuda_util.h -> build/lib.linux-x86_64-3.7/colossalai/kernel/cuda_native/csrc/kernels/include
copying colossalai/kernel/cuda_native/csrc/kernels/include/dropout.h -> build/lib.linux-x86_64-3.7/colossalai/kernel/cuda_native/csrc/kernels/include
copying colossalai/kernel/cuda_native/csrc/kernels/include/feed_forward.h -> build/lib.linux-x86_64-3.7/colossalai/kernel/cuda_native/csrc/kernels/include
copying colossalai/kernel/cuda_native/csrc/kernels/include/kernels.h -> build/lib.linux-x86_64-3.7/colossalai/kernel/cuda_native/csrc/kernels/include
copying colossalai/kernel/cuda_native/csrc/kernels/include/ls_cub.cuh -> build/lib.linux-x86_64-3.7/colossalai/kernel/cuda_native/csrc/kernels/include
copying colossalai/kernel/cuda_native/csrc/kernels/include/normalize_layer.h -> build/lib.linux-x86_64-3.7/colossalai/kernel/cuda_native/csrc/kernels/include
copying colossalai/kernel/cuda_native/csrc/kernels/include/softmax.h -> build/lib.linux-x86_64-3.7/colossalai/kernel/cuda_native/csrc/kernels/include
copying colossalai/kernel/cuda_native/csrc/kernels/include/strided_batch_gemm.h -> build/lib.linux-x86_64-3.7/colossalai/kernel/cuda_native/csrc/kernels/include
running build_ext
error: [Errno 2] No such file or directory: 'which': 'which'
[end of output]
note: This error originates from a subprocess, and is likely not a problem with pip.
error: legacy-install-failure
× Encountered error while trying to install package.
╰─> colossalai
note: This is an issue with the package mentioned above, not pip.
hint: See above for output from the failure.`
When I run "CUDA_EXT=1 pip install ." I get the above error message.
Environment
NVIDIA-SMI 515.86.01 Driver Version: 515.86.01 CUDA Version: 11.7
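The message `[Errno 2] No such file or directory: 'which': 'which'` has the shape of a Python `FileNotFoundError` raised when a subprocess tries to launch an executable named `which` that is not on `PATH` (minimal container images often lack it). That reading is an assumption; nothing in this thread confirms it. A minimal sketch for checking it from the same Python environment follows. The helper names and the default tool list (`nvcc`, `g++`, `ninja`) are illustrative guesses, not part of ColossalAI:

```python
import shutil
import subprocess


def find_missing_tools(tools=("which", "nvcc", "g++", "ninja")):
    """Return the tools from `tools` that cannot be found on PATH.

    The default list is a guess at what a CUDA extension build may call.
    """
    return [tool for tool in tools if shutil.which(tool) is None]


def launch_error(executable):
    """Try to launch `executable`; return the error text if it is missing.

    Returns None when the executable could be started.
    """
    try:
        subprocess.run(
            [executable, "--version"],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    except FileNotFoundError as exc:
        return str(exc)
    return None


if __name__ == "__main__":
    print("missing tools:", find_missing_tools())
```

If `which` shows up in the missing list, installing it through the distribution's package manager and rerunning the build would be the first thing to try.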
Hi, PyTorch might not be compatible with your CUDA 11.7.
Can you please downgrade it or switch to another environment?
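For reference, the warning in the log compares two specific versions: `nvcc` reports release 11.3, while the PyTorch binaries were compiled with CUDA 11.7. The "CUDA Version: 11.7" shown by NVIDIA-SMI is the highest version the driver supports, not the installed toolkit. A small sketch of the comparison behind that warning is below; the function names are hypothetical and the classification into "minor" and "major" mismatch is a simplification, not PyTorch's or ColossalAI's actual check:

```python
import re


def parse_nvcc_release(nvcc_output):
    """Extract (major, minor) from the 'release X.Y' part of `nvcc --version` output."""
    match = re.search(r"release (\d+)\.(\d+)", nvcc_output)
    if match is None:
        raise ValueError("no 'release X.Y' found in nvcc output")
    return int(match.group(1)), int(match.group(2))


def compare_cuda_versions(nvcc_output, torch_cuda):
    """Classify the toolkit version against the CUDA version PyTorch was built with.

    `torch_cuda` is a string such as "11.7" (the value of torch.version.cuda).
    """
    nvcc_major, nvcc_minor = parse_nvcc_release(nvcc_output)
    torch_major, torch_minor = (int(part) for part in torch_cuda.split(".")[:2])
    if (nvcc_major, nvcc_minor) == (torch_major, torch_minor):
        return "match"
    if nvcc_major == torch_major:
        return "minor mismatch"
    return "major mismatch"
```

With the values from the log, `compare_cuda_versions("Cuda compilation tools, release 11.3, V11.3.109", "11.7")` returns `"minor mismatch"`, which is the case the warning says may not cause later errors.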
Hi, it seems that Python cannot infer your home directory correctly. Can you provide the output of this:
import os
print(os.path.expanduser('~'))
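A slightly fuller version of that check may be useful when reporting back. On Unix, `os.path.expanduser` uses the `HOME` environment variable if it is set and otherwise looks the user up in the password database; if expansion fails, it returns the path unchanged. The sketch below collects those facts in one place. The function name and the returned keys are made up for illustration:

```python
import os


def describe_home():
    """Collect what Python believes about the current user's home directory."""
    home = os.path.expanduser("~")
    return {
        "HOME": os.environ.get("HOME"),        # None if the variable is unset
        "expanduser": home,                    # stays "~" if expansion failed
        "resolved": home != "~",
        "is_dir": os.path.isdir(home),
        "writable": os.access(home, os.W_OK),
    }


if __name__ == "__main__":
    for key, value in describe_home().items():
        print(f"{key}: {value}")
```

An unresolved, missing, or non-writable home directory would support the theory above; a normal result would point the investigation elsewhere.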
We have updated the project a lot since this was reported. This issue was closed due to inactivity. Thanks.