[BUG]: running CUDA_EXT=1 pip install . error: [Errno 2] No such file or directory: 'which': 'which'

Open scarydemon2 opened this issue 2 years ago • 3 comments

🐛 Describe the bug

`Building wheels for collected packages: colossalai
Building wheel for colossalai (setup.py) ... error
error: subprocess-exited-with-error

× python setup.py bdist_wheel did not run successfully.
│ exit code: 1
╰─> [30 lines of output]

  torch.__version__  = 1.13.1+cu117



  Compiling cuda extensions with
  nvcc: NVIDIA (R) Cuda compiler driver
  Copyright (c) 2005-2021 NVIDIA Corporation
  Built on Mon_May__3_19:15:13_PDT_2021
  Cuda compilation tools, release 11.3, V11.3.109
  Build cuda_11.3.r11.3/compiler.29920130_0
  from /usr/local/cuda/bin


  Warning: Cuda extensions are being compiled with a version of Cuda that does not match the version used to compile Pytorch binaries.  Pytorch binaries were compiled with Cuda 11.7.
  In some cases, a minor-version mismatch will not cause later errors:  https://github.com/NVIDIA/apex/pull/323#discussion_r287021798.
  ===== Building Extension cpu_adam =====
  ===== Building Extension fused_optim =====
  ===== Building Extension moe =====
  ===== Building Extension multi_head_attn =====
  ===== Building Extension scaled_masked_softmax =====
  ===== Building Extension scaled_upper_triangle_masked_softmax =====
  ===== Building Extension layernorm =====
  running bdist_wheel
  running build
  running build_py
  copying colossalai/version.py -> build/lib.linux-x86_64-3.7/colossalai
  running build_ext
  error: [Errno 2] No such file or directory: 'which': 'which'
  [end of output]

note: This error originates from a subprocess, and is likely not a problem with pip.
ERROR: Failed building wheel for colossalai
Running setup.py clean for colossalai
Failed to build colossalai
Installing collected packages: colossalai
Running setup.py install for colossalai ... error
error: subprocess-exited-with-error

× Running setup.py install for colossalai did not run successfully.
│ exit code: 1
╰─> [780 lines of output]

  torch.__version__  = 1.13.1+cu117



  Compiling cuda extensions with
  nvcc: NVIDIA (R) Cuda compiler driver
  Copyright (c) 2005-2021 NVIDIA Corporation
  Built on Mon_May__3_19:15:13_PDT_2021
  Cuda compilation tools, release 11.3, V11.3.109
  Build cuda_11.3.r11.3/compiler.29920130_0
  from /usr/local/cuda/bin


  Warning: Cuda extensions are being compiled with a version of Cuda that does not match the version used to compile Pytorch binaries.  Pytorch binaries were compiled with Cuda 11.7.
  In some cases, a minor-version mismatch will not cause later errors:  https://github.com/NVIDIA/apex/pull/323#discussion_r287021798.
  ===== Building Extension cpu_adam =====
  ===== Building Extension fused_optim =====
  ===== Building Extension moe =====
  ===== Building Extension multi_head_attn =====
  ===== Building Extension scaled_masked_softmax =====
  ===== Building Extension scaled_upper_triangle_masked_softmax =====
  ===== Building Extension layernorm =====
  running install
  running build
  running build_py
  creating build/lib.linux-x86_64-3.7
  creating build/lib.linux-x86_64-3.7/colossalai
  copying colossalai/__init__.py -> build/lib.linux-x86_64-3.7/colossalai
  copying colossalai/constants.py -> build/lib.linux-x86_64-3.7/colossalai
  copying colossalai/core.py -> build/lib.linux-x86_64-3.7/colossalai
  copying colossalai/global_variables.py -> build/lib.linux-x86_64-3.7/colossalai
  copying colossalai/initialize.py -> build/lib.linux-x86_64-3.7/colossalai
  copying colossalai/version.py -> build/lib.linux-x86_64-3.7/colossalai
  creating build/lib.linux-x86_64-3.7/op_builder
  copying op_builder/__init__.py -> build/lib.linux-x86_64-3.7/op_builder
  copying op_builder/builder.py -> build/lib.linux-x86_64-3.7/op_builder
  copying op_builder/cpu_adam.py -> build/lib.linux-x86_64-3.7/op_builder
  copying op_builder/fused_optim.py -> build/lib.linux-x86_64-3.7/op_builder
  copying op_builder/layernorm.py -> build/lib.linux-x86_64-3.7/op_builder
  copying op_builder/moe.py -> build/lib.linux-x86_64-3.7/op_builder
  copying op_builder/multi_head_attn.py -> build/lib.linux-x86_64-3.7/op_builder
  copying op_builder/scaled_masked_softmax.py -> build/lib.linux-x86_64-3.7/op_builder
  copying op_builder/scaled_upper_triangle_masked_softmax.py -> build/lib.linux-x86_64-3.7/op_builder
  copying op_builder/utils.py -> build/lib.linux-x86_64-3.7/op_builder
  creating build/lib.linux-x86_64-3.7/colossalai/_C
  copying colossalai/_C/__init__.py -> build/lib.linux-x86_64-3.7/colossalai/_C
  creating build/lib.linux-x86_64-3.7/colossalai/amp
  copying colossalai/amp/__init__.py -> build/lib.linux-x86_64-3.7/colossalai/amp
  copying colossalai/amp/amp_type.py -> build/lib.linux-x86_64-3.7/colossalai/amp
  creating build/lib.linux-x86_64-3.7/colossalai/auto_parallel
  copying colossalai/auto_parallel/__init__.py -> build/lib.linux-x86_64-3.7/colossalai/auto_parallel
  copying tests/test_auto_parallel/test_tensor_shard/test_node_handler/test_view_handler.py -> build/lib.linux-x86_64-3.7/tests/test_auto_parallel/test_tensor_shard/test_node_handler

  .......
  copying tests/test_auto_parallel/test_tensor_shard/test_node_handler/test_where_handler.py -> build/lib.linux-x86_64-3.7/tests/test_auto_parallel/test_tensor_shard/test_node_handler
  copying tests/test_auto_parallel/test_tensor_shard/test_node_handler/utils.py -> build/lib.linux-x86_64-3.7/tests/test_auto_parallel/test_tensor_shard/test_node_handler
  creating build/lib.linux-x86_64-3.7/colossalai/kernel/cuda_native/csrc
  copying colossalai/kernel/cuda_native/csrc/colossal_C_frontend.cpp -> build/lib.linux-x86_64-3.7/colossalai/kernel/cuda_native/csrc
  copying colossalai/kernel/cuda_native/csrc/compat.h -> build/lib.linux-x86_64-3.7/colossalai/kernel/cuda_native/csrc
  copying colossalai/kernel/cuda_native/csrc/cpu_adam.cpp -> build/lib.linux-x86_64-3.7/colossalai/kernel/cuda_native/csrc
  copying colossalai/kernel/cuda_native/csrc/cpu_adam.h -> build/lib.linux-x86_64-3.7/colossalai/kernel/cuda_native/csrc
  copying colossalai/kernel/cuda_native/csrc/layer_norm_cuda.cpp -> build/lib.linux-x86_64-3.7/colossalai/kernel/cuda_native/csrc
  copying colossalai/kernel/cuda_native/csrc/layer_norm_cuda_kernel.cu -> build/lib.linux-x86_64-3.7/colossalai/kernel/cuda_native/csrc
  copying colossalai/kernel/cuda_native/csrc/moe_cuda.cpp -> build/lib.linux-x86_64-3.7/colossalai/kernel/cuda_native/csrc
  copying colossalai/kernel/cuda_native/csrc/moe_cuda_kernel.cu -> build/lib.linux-x86_64-3.7/colossalai/kernel/cuda_native/csrc
  copying colossalai/kernel/cuda_native/csrc/multi_tensor_adam.cu -> build/lib.linux-x86_64-3.7/colossalai/kernel/cuda_native/csrc
  copying colossalai/kernel/cuda_native/csrc/multi_tensor_apply.cuh -> build/lib.linux-x86_64-3.7/colossalai/kernel/cuda_native/csrc
  copying colossalai/kernel/cuda_native/csrc/multi_tensor_l2norm_kernel.cu -> build/lib.linux-x86_64-3.7/colossalai/kernel/cuda_native/csrc
  copying colossalai/kernel/cuda_native/csrc/multi_tensor_lamb.cu -> build/lib.linux-x86_64-3.7/colossalai/kernel/cuda_native/csrc
  copying colossalai/kernel/cuda_native/csrc/multi_tensor_scale_kernel.cu -> build/lib.linux-x86_64-3.7/colossalai/kernel/cuda_native/csrc
  copying colossalai/kernel/cuda_native/csrc/multi_tensor_sgd_kernel.cu -> build/lib.linux-x86_64-3.7/colossalai/kernel/cuda_native/csrc
  copying colossalai/kernel/cuda_native/csrc/multihead_attention_1d.cpp -> build/lib.linux-x86_64-3.7/colossalai/kernel/cuda_native/csrc
  copying colossalai/kernel/cuda_native/csrc/multihead_attention_1d.h -> build/lib.linux-x86_64-3.7/colossalai/kernel/cuda_native/csrc
  copying colossalai/kernel/cuda_native/csrc/scaled_masked_softmax.cpp -> build/lib.linux-x86_64-3.7/colossalai/kernel/cuda_native/csrc
  copying colossalai/kernel/cuda_native/csrc/scaled_masked_softmax.h -> build/lib.linux-x86_64-3.7/colossalai/kernel/cuda_native/csrc
  copying colossalai/kernel/cuda_native/csrc/scaled_masked_softmax_cuda.cu -> build/lib.linux-x86_64-3.7/colossalai/kernel/cuda_native/csrc
  copying colossalai/kernel/cuda_native/csrc/scaled_upper_triang_masked_softmax.cpp -> build/lib.linux-x86_64-3.7/colossalai/kernel/cuda_native/csrc
  copying colossalai/kernel/cuda_native/csrc/scaled_upper_triang_masked_softmax.h -> build/lib.linux-x86_64-3.7/colossalai/kernel/cuda_native/csrc
  copying colossalai/kernel/cuda_native/csrc/scaled_upper_triang_masked_softmax_cuda.cu -> build/lib.linux-x86_64-3.7/colossalai/kernel/cuda_native/csrc
  copying colossalai/kernel/cuda_native/csrc/type_shim.h -> build/lib.linux-x86_64-3.7/colossalai/kernel/cuda_native/csrc
  creating build/lib.linux-x86_64-3.7/colossalai/kernel/cuda_native/csrc/kernels
  creating build/lib.linux-x86_64-3.7/colossalai/kernel/cuda_native/csrc/kernels/include
  copying colossalai/kernel/cuda_native/csrc/kernels/include/block_reduce.h -> build/lib.linux-x86_64-3.7/colossalai/kernel/cuda_native/csrc/kernels/include
  copying colossalai/kernel/cuda_native/csrc/kernels/include/context.h -> build/lib.linux-x86_64-3.7/colossalai/kernel/cuda_native/csrc/kernels/include
  copying colossalai/kernel/cuda_native/csrc/kernels/include/cross_entropy_layer.h -> build/lib.linux-x86_64-3.7/colossalai/kernel/cuda_native/csrc/kernels/include
  copying colossalai/kernel/cuda_native/csrc/kernels/include/cublas_wrappers.h -> build/lib.linux-x86_64-3.7/colossalai/kernel/cuda_native/csrc/kernels/include
  copying colossalai/kernel/cuda_native/csrc/kernels/include/cuda_util.h -> build/lib.linux-x86_64-3.7/colossalai/kernel/cuda_native/csrc/kernels/include
  copying colossalai/kernel/cuda_native/csrc/kernels/include/dropout.h -> build/lib.linux-x86_64-3.7/colossalai/kernel/cuda_native/csrc/kernels/include
  copying colossalai/kernel/cuda_native/csrc/kernels/include/feed_forward.h -> build/lib.linux-x86_64-3.7/colossalai/kernel/cuda_native/csrc/kernels/include
  copying colossalai/kernel/cuda_native/csrc/kernels/include/kernels.h -> build/lib.linux-x86_64-3.7/colossalai/kernel/cuda_native/csrc/kernels/include
  copying colossalai/kernel/cuda_native/csrc/kernels/include/ls_cub.cuh -> build/lib.linux-x86_64-3.7/colossalai/kernel/cuda_native/csrc/kernels/include
  copying colossalai/kernel/cuda_native/csrc/kernels/include/normalize_layer.h -> build/lib.linux-x86_64-3.7/colossalai/kernel/cuda_native/csrc/kernels/include
  copying colossalai/kernel/cuda_native/csrc/kernels/include/softmax.h -> build/lib.linux-x86_64-3.7/colossalai/kernel/cuda_native/csrc/kernels/include
  copying colossalai/kernel/cuda_native/csrc/kernels/include/strided_batch_gemm.h -> build/lib.linux-x86_64-3.7/colossalai/kernel/cuda_native/csrc/kernels/include
  running build_ext
  error: [Errno 2] No such file or directory: 'which': 'which'
  [end of output]

note: This error originates from a subprocess, and is likely not a problem with pip.
error: legacy-install-failure

× Encountered error while trying to install package.
╰─> colossalai

note: This is an issue with the package mentioned above, not pip.
hint: See above for output from the failure.`

When I run "CUDA_EXT=1 pip install ." I get the error message above.
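The final line of the log, `[Errno 2] No such file or directory: 'which'`, reads as if the build step tried to launch a program named `which` and could not find it on `PATH`. That is only an interpretation of the message, not something confirmed in this thread. A minimal sketch for checking it on the affected machine follows; the helper names `tool_available` and `describe_which` are hypothetical, and the tool list is just an example.

```python
import shutil
import subprocess


def tool_available(name: str) -> bool:
    """Return True if an executable called `name` can be found on PATH."""
    return shutil.which(name) is not None


def describe_which(program: str) -> str:
    """Call the external `which` binary and describe what happened."""
    try:
        out = subprocess.check_output(["which", program], stderr=subprocess.STDOUT)
    except FileNotFoundError as exc:
        # Same failure class as the error at the end of the build log.
        return f"'which' itself is missing: {exc}"
    except subprocess.CalledProcessError:
        return f"'which' ran but did not find {program!r}"
    return out.decode().strip()


if __name__ == "__main__":
    # Example tool list (an assumption); adjust to whatever the build needs.
    for tool in ("which", "nvcc", "g++"):
        status = "found" if tool_available(tool) else "MISSING"
        print(f"{tool}: {status}")
    print(describe_which("nvcc"))
```

If `which` is reported as missing, installing it through the system package manager (the package name varies by distribution) and rerunning the install would be the first thing to try.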

Environment

NVIDIA-SMI 515.86.01 Driver Version: 515.86.01 CUDA Version: 11.7
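The build log also warns that `nvcc` reports release 11.3 while the PyTorch binaries were built with CUDA 11.7. The sketch below compares the two using only strings copied from the log. The function names are hypothetical, and reading the `+cu117` suffix as "last digit is the minor version" is an assumption that fits this log but is not guaranteed for every build tag.

```python
import re
from typing import Tuple


def parse_nvcc_release(nvcc_output: str) -> Tuple[int, int]:
    """Extract (major, minor) from the 'release X.Y' part of `nvcc --version` output."""
    match = re.search(r"release (\d+)\.(\d+)", nvcc_output)
    if match is None:
        raise ValueError("no 'release X.Y' found in nvcc output")
    return int(match.group(1)), int(match.group(2))


def parse_torch_cuda(torch_version: str) -> Tuple[int, int]:
    """Extract (major, minor) from a version string such as '1.13.1+cu117'.

    Assumption: the last digit after 'cu' is the minor version.
    """
    match = re.search(r"\+cu(\d+)(\d)$", torch_version)
    if match is None:
        raise ValueError("no '+cuXYZ' suffix found in torch version")
    return int(match.group(1)), int(match.group(2))


if __name__ == "__main__":
    # Both strings are taken from the log in this issue.
    nvcc = parse_nvcc_release("Cuda compilation tools, release 11.3, V11.3.109")
    torch_cuda = parse_torch_cuda("1.13.1+cu117")
    print("nvcc:", nvcc, "torch:", torch_cuda, "match:", nvcc == torch_cuda)
```

The log itself notes that a minor-version mismatch does not always cause later errors, so a mismatch here is a hint to investigate rather than proof of the cause.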

Feb 24 '23 02:02 scarydemon2

Bot detected the issue body's language is not English, translate it automatically.


Title: [BUG]:

Feb 24 '23 02:02 Issues-translate-bot

Hi, PyTorch might not be compatible with your cuda 11.7. (source)

Can you please downgrade it or change to another environment?

Feb 28 '23 09:02 JThh

Hi, it seems that Python cannot infer your home directory correctly. Can you provide the output of this:

import os
print(os.path.expanduser('~'))  # prints the home directory Python resolves for the current user

Feb 28 '23 09:02 FrankLeeeee

We have updated a lot. This issue was closed due to inactivity. Thanks.

Apr 20 '23 09:04 binmakeswell