
transformer_engine installation fails

Open zhangtianhong-1998 opened this issue 7 months ago • 7 comments

Environment: CUDA 12.4, Python 3.10, torch 2.6.0. I followed two closed issues but the problem is still unresolved. Already run: pip install 'ms-swift' and pip install pybind11.

Note 1: I mirrored the repository on Gitee to work around network restrictions.

SITE_PACKAGES=$(python -c "import site; print(site.getsitepackages()[0])") && echo $SITE_PACKAGES && \
CUDNN_PATH=$SITE_PACKAGES/nvidia/cudnn CPLUS_INCLUDE_PATH=$SITE_PACKAGES/nvidia/cudnn/include \
pip install git+https://gitee.com/zhangtianhonggitee/TransformerEngine.git@stable
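
Before starting the long source build, it may help to confirm that the paths the command above derives actually exist. The following is a minimal sketch, not part of TransformerEngine or ms-swift: `cudnn_env` and `has_cudnn_header` are hypothetical helpers that mirror the `CUDNN_PATH` and `CPLUS_INCLUDE_PATH` values from the command and look for a `cudnn.h` header, assuming cuDNN was installed as a pip wheel under `nvidia/cudnn`.

```python
import site
from pathlib import Path


def cudnn_env(site_packages: str) -> dict:
    """Return the two environment variables used by the install command.

    Hypothetical helper: it only mirrors the paths from the shell command.
    """
    cudnn_root = Path(site_packages) / "nvidia" / "cudnn"
    return {
        "CUDNN_PATH": str(cudnn_root),
        "CPLUS_INCLUDE_PATH": str(cudnn_root / "include"),
    }


def has_cudnn_header(env: dict) -> bool:
    """True if a cudnn.h file exists in the include directory."""
    return (Path(env["CPLUS_INCLUDE_PATH"]) / "cudnn.h").is_file()


if __name__ == "__main__":
    env = cudnn_env(site.getsitepackages()[0])
    for name, value in env.items():
        print(f"{name}={value}")
    print("cudnn.h found:", has_cudnn_header(env))
```

If the header is reported missing, the include path passed to the build is likely wrong for this environment, which would be worth ruling out before looking at the compiler output.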

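Note 2: Both pip install 'ms-swift[all]' -U and the full source install pip install -e '.[all]' fail with an unpacking error (the dependency-resolution failure shown below), so I only ran pip install 'ms-swift'.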

Collecting binpacking (from ms-swift[all])
  Using cached binpacking-1.5.2-py3-none-any.whl
  Using cached https://pypi.tuna.tsinghua.edu.cn/packages/fa/1c/d85aa7b120c09615c6d0f791fe581d42eb1fb062478fdc25a4e95dc88113/binpacking-1.5.1.tar.gz (9.4 kB)
  Preparing metadata (setup.py) ... done
  Using cached https://pypi.tuna.tsinghua.edu.cn/packages/83/08/5fb79fafc4c857d6712a24250b1fdba6aa3821b9492ccc239a05bf6ccfbf/binpacking-1.5.0.tar.gz (9.4 kB)
  Preparing metadata (setup.py) ... done
  Using cached https://pypi.tuna.tsinghua.edu.cn/packages/e8/74/fd61be713a1bfe72a7394bc4fe9cb5fc70d0aaf4a4b49a2e8152eed67a59/binpacking-1.4.5.tar.gz (8.9 kB)
  Preparing metadata (setup.py) ... done
  Using cached https://pypi.tuna.tsinghua.edu.cn/packages/53/b3/2796bc69236c624e46ba02b4e11c3c8d66193ce2124a03c11db190176bfe/binpacking-1.4.3.tar.gz (7.6 kB)
  Preparing metadata (setup.py) ... done
  Using cached https://pypi.tuna.tsinghua.edu.cn/packages/e6/de/5e565925472c7f9a987525cb6b49ac32a228fe203cd76c207d041683d40c/binpacking-1.4.2.tar.gz (7.6 kB)
  Preparing metadata (setup.py) ... done
  Using cached https://pypi.tuna.tsinghua.edu.cn/packages/dc/97/7e632f6dcd46c806160211d1e9a5cda1641cbb1a74fb5967024c5aa52ed5/binpacking-1.4.1.tar.gz (7.6 kB)
  Preparing metadata (setup.py) ... done
  Using cached https://pypi.tuna.tsinghua.edu.cn/packages/7a/9a/c336fe2f0546f17d945e6f9f6bc06b8b306d10750b20ec6e12715c32f7f8/binpacking-1.4.tar.gz (5.8 kB)
  Preparing metadata (setup.py) ... done
  Using cached https://pypi.tuna.tsinghua.edu.cn/packages/c9/fe/56782753922a195d332d419949f889c1d59cab7b1780db2351bd8b99501c/binpacking-1.3.tar.gz (5.6 kB)
  Preparing metadata (setup.py) ... done
  Using cached https://pypi.tuna.tsinghua.edu.cn/packages/9b/e4/a7ee63c0f201c5edb5817e36f964c571112fc00b23e8887bee4b41ac97f4/binpacking-1.2.tar.gz (5.4 kB)
  Preparing metadata (setup.py) ... done
  Using cached https://pypi.tuna.tsinghua.edu.cn/packages/51/d6/a26db6fd38fba493c3bfbd51e91b14a985bcc08dcf2900a9fd850f3b8507/binpacking-1.1.tar.gz (5.4 kB)
  Preparing metadata (setup.py) ... done
  Using cached https://pypi.tuna.tsinghua.edu.cn/packages/d0/eb/7a7e6f4be7376260e97879cf51f1e3b9ff614f31e97355b3e26a587a2535/binpacking-1.0.tar.gz (5.1 kB)
  Preparing metadata (setup.py) ... done
Collecting attrdict (from ms-swift[all])
  Using cached https://pypi.tuna.tsinghua.edu.cn/packages/ef/97/28fe7e68bc7adfce67d4339756e85e9fcf3c6fd7f0c0781695352b70472c/attrdict-2.0.1-py2.py3-none-any.whl (9.9 kB)
error: resolution-too-deep

× Dependency resolution exceeded maximum depth
╰─> Pip cannot resolve the current dependencies as the dependency graph is too complex for pip to solve efficiently.

hint: Try adding lower bounds to constrain your dependencies, for example: 'package>=2.0.0' instead of just 'package'.
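
pip's hint above suggests adding lower bounds. One way to try that without editing the ms-swift requirements is a constraints file passed with pip's `-c` option. The sketch below only builds the file contents and the command. The helper names are hypothetical, and the version bounds are illustrative assumptions taken from the newest versions visible in the log (binpacking 1.5.2, attrdict 2.0.1), not verified requirements of ms-swift.

```python
import sys


def constraints_text(lower_bounds):
    """Return constraints-file text with one 'name>=version' line per package."""
    lines = [f"{name}>={version}" for name, version in sorted(lower_bounds.items())]
    return "\n".join(lines) + "\n"


def pip_install_command(requirement, constraints_path):
    """Build (but do not run) a pip command that applies the constraints file."""
    return [sys.executable, "-m", "pip", "install", requirement, "-c", str(constraints_path)]


if __name__ == "__main__":
    # Assumed bounds, copied from the newest versions seen in the log above.
    bounds = {"binpacking": "1.5.2", "attrdict": "2.0.1"}
    print(constraints_text(bounds), end="")
    print(" ".join(pip_install_command("ms-swift[all]", "constraints.txt")))
```

Saving the printed lines as `constraints.txt` and running the printed command narrows the versions pip has to explore. Whether that is enough to avoid `resolution-too-deep` here is untested.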

stable version

Command used:

SITE_PACKAGES=$(python -c "import site; print(site.getsitepackages()[0])") && echo $SITE_PACKAGES && \
CUDNN_PATH=$SITE_PACKAGES/nvidia/cudnn CPLUS_INCLUDE_PATH=$SITE_PACKAGES/nvidia/cudnn/include \
pip install git+https://gitee.com/zhangtianhonggitee/TransformerEngine.git@stable

              instantiation of "void transformer_engine::gated_kernels::quantize_gated<IS_DGATED,ParamOP,ActOP,DActOP>(const transformer_engine::Tensor &, const transformer_engine::Tensor &, transformer_engine::Tensor *, cudaStream_t) [with IS_DGATED=false, ParamOP=transformer_engine::Empty, ActOP=transformer_engine::relu, DActOP=(float (*)(float, const transformer_engine::Empty &))nullptr]" at line 1073
              instantiation of "void transformer_engine::detail::quantize_gated_helper<IS_DGATED,ParamOP,ActOP,DActOP>(NVTETensor, NVTETensor, NVTETensor, cudaStream_t) [with IS_DGATED=false, ParamOP=transformer_engine::Empty, ActOP=transformer_engine::relu, DActOP=(float (*)(float, const transformer_engine::Empty &))nullptr]" at line 59 of /tmp/pip-req-build-arnq_5jt/transformer_engine/common/activation/./activation_template.h
              instantiation of "void transformer_engine::gated_act_fn<ComputeType,Param,ActOP>(NVTETensor, NVTETensor, cudaStream_t) [with ComputeType=transformer_engine::fp32, Param=transformer_engine::Empty, ActOP=transformer_engine::relu]" at line 26 of /tmp/pip-req-build-arnq_5jt/transformer_engine/common/activation/relu.cu

  [41/43] /usr/local/cuda-12.4/bin/nvcc -forward-unknown-to-host-compiler -DNV_CUDNN_FRONTEND_USE_DYNAMIC_LOADING -Dtransformer_engine_EXPORTS -I/tmp/pip-req-build-arnq_5jt/transformer_engine/common/.. -I/tmp/pip-req-build-arnq_5jt/transformer_engine/common/include -I/usr/local/cuda-12.4/targets/x86_64-linux/include -I/tmp/pip-req-build-arnq_5jt/transformer_engine/common/../../3rdparty/cudnn-frontend/include -I/tmp/pip-req-build-arnq_5jt/build/cmake/string_headers -isystem=/usr/local/cuda-12.4/include -Wl,--version-script=/tmp/pip-req-build-arnq_5jt/transformer_engine/common/libtransformer_engine.version --expt-relaxed-constexpr -O3 --threads 1 -O3 -DNDEBUG --generate-code=arch=compute_70,code=[compute_70,sm_70] --generate-code=arch=compute_80,code=[compute_80,sm_80] --generate-code=arch=compute_89,code=[compute_89,sm_89] --generate-code=arch=compute_90,code=[compute_90,sm_90] -Xcompiler=-fPIC -std=c++17 -MD -MT CMakeFiles/transformer_engine.dir/activation/gelu.cu.o -MF CMakeFiles/transformer_engine.dir/activation/gelu.cu.o.d -x cu -c /tmp/pip-req-build-arnq_5jt/transformer_engine/common/activation/gelu.cu -o CMakeFiles/transformer_engine.dir/activation/gelu.cu.o
  /tmp/pip-req-build-arnq_5jt/transformer_engine/common/activation/./../util/cast_kernels.cuh(930): warning #177-D: variable "input_shape" was declared but never referenced
      const auto &input_shape = input.data.shape;
                  ^
            detected during:
              instantiation of "void transformer_engine::fp8_quantize_arch_ge_100<IS_DBIAS,IS_DACT,IS_ACT,ParamOP,OP>(const transformer_engine::Tensor &, const transformer_engine::Tensor *, const transformer_engine::Tensor *, transformer_engine::Tensor *, transformer_engine::Tensor *, transformer_engine::Tensor *, cudaStream_t) [with IS_DBIAS=false, IS_DACT=false, IS_ACT=true, ParamOP=transformer_engine::Empty, OP=transformer_engine::gelu]" at line 1209
              instantiation of "void transformer_engine::fp8_quantize<IS_DBIAS,IS_DACT,IS_ACT,ParamOP,OP>(const transformer_engine::Tensor &, const transformer_engine::Tensor *, const transformer_engine::Tensor *, transformer_engine::Tensor *, transformer_engine::Tensor *, transformer_engine::Tensor *, cudaStream_t) [with IS_DBIAS=false, IS_DACT=false, IS_ACT=true, ParamOP=transformer_engine::Empty, OP=transformer_engine::gelu]" at line 1255
              instantiation of "void transformer_engine::detail::quantize_helper<IS_DBIAS,IS_DACT,IS_ACT,ParamOP,OP>(NVTETensor, NVTETensor, NVTETensor, NVTETensor, NVTETensor, NVTETensor, cudaStream_t) [with IS_DBIAS=false, IS_DACT=false, IS_ACT=true, ParamOP=transformer_engine::Empty, OP=transformer_engine::gelu]" at line 36 of /tmp/pip-req-build-arnq_5jt/transformer_engine/common/activation/./activation_template.h
              instantiation of "void transformer_engine::act_fn<ComputeType,Param,OP>(NVTETensor, NVTETensor, cudaStream_t) [with ComputeType=transformer_engine::fp32, Param=transformer_engine::Empty, OP=transformer_engine::gelu]" at line 13 of /tmp/pip-req-build-arnq_5jt/transformer_engine/common/activation/gelu.cu

  Remark: The warnings can be suppressed with "-diag-suppress <warning-number>"

  /tmp/pip-req-build-arnq_5jt/transformer_engine/common/activation/./../util/cast_gated_kernels.cuh(829): warning #177-D: variable "amax_ptr" was declared but never referenced
      float *const amax_ptr = reinterpret_cast<float *>(output->amax.dptr);
                   ^
            detected during:
              instantiation of "void transformer_engine::gated_kernels::quantize_gated<IS_DGATED,ParamOP,ActOP,DActOP>(const transformer_engine::Tensor &, const transformer_engine::Tensor &, transformer_engine::Tensor *, cudaStream_t) [with IS_DGATED=false, ParamOP=transformer_engine::Empty, ActOP=transformer_engine::gelu, DActOP=(float (*)(float, const transformer_engine::Empty &))nullptr]" at line 1073
              instantiation of "void transformer_engine::detail::quantize_gated_helper<IS_DGATED,ParamOP,ActOP,DActOP>(NVTETensor, NVTETensor, NVTETensor, cudaStream_t) [with IS_DGATED=false, ParamOP=transformer_engine::Empty, ActOP=transformer_engine::gelu, DActOP=(float (*)(float, const transformer_engine::Empty &))nullptr]" at line 59 of /tmp/pip-req-build-arnq_5jt/transformer_engine/common/activation/./activation_template.h
              instantiation of "void transformer_engine::gated_act_fn<ComputeType,Param,ActOP>(NVTETensor, NVTETensor, cudaStream_t) [with ComputeType=transformer_engine::fp32, Param=transformer_engine::Empty, ActOP=transformer_engine::gelu]" at line 26 of /tmp/pip-req-build-arnq_5jt/transformer_engine/common/activation/gelu.cu

  [42/43] /usr/local/cuda-12.4/bin/nvcc -forward-unknown-to-host-compiler -DNV_CUDNN_FRONTEND_USE_DYNAMIC_LOADING -Dtransformer_engine_EXPORTS -I/tmp/pip-req-build-arnq_5jt/transformer_engine/common/.. -I/tmp/pip-req-build-arnq_5jt/transformer_engine/common/include -I/usr/local/cuda-12.4/targets/x86_64-linux/include -I/tmp/pip-req-build-arnq_5jt/transformer_engine/common/../../3rdparty/cudnn-frontend/include -I/tmp/pip-req-build-arnq_5jt/build/cmake/string_headers -isystem=/usr/local/cuda-12.4/include -Wl,--version-script=/tmp/pip-req-build-arnq_5jt/transformer_engine/common/libtransformer_engine.version --expt-relaxed-constexpr -O3 --threads 1 -O3 -DNDEBUG --generate-code=arch=compute_70,code=[compute_70,sm_70] --generate-code=arch=compute_80,code=[compute_80,sm_80] --generate-code=arch=compute_89,code=[compute_89,sm_89] --generate-code=arch=compute_90,code=[compute_90,sm_90] -Xcompiler=-fPIC -std=c++17 -MD -MT CMakeFiles/transformer_engine.dir/transpose/cast_transpose_fusion.cu.o -MF CMakeFiles/transformer_engine.dir/transpose/cast_transpose_fusion.cu.o.d -x cu -c /tmp/pip-req-build-arnq_5jt/transformer_engine/common/transpose/cast_transpose_fusion.cu -o CMakeFiles/transformer_engine.dir/transpose/cast_transpose_fusion.cu.o
  ninja: build stopped: subcommand failed.
  Traceback (most recent call last):
    File "/tmp/pip-req-build-arnq_5jt/build_tools/build_ext.py", line 89, in _build_cmake
      subprocess.run(command, cwd=build_dir, check=True)
    File "/root/anaconda3/envs/ms/lib/python3.10/subprocess.py", line 526, in run
      raise CalledProcessError(retcode, process.args,
  subprocess.CalledProcessError: Command '['/usr/bin/cmake', '--build', '/tmp/pip-req-build-arnq_5jt/build/cmake', '--verbose', '--parallel']' returned non-zero exit status 1.

  During handling of the above exception, another exception occurred:

  Traceback (most recent call last):
    File "<string>", line 2, in <module>
    File "<pip-setuptools-caller>", line 35, in <module>
    File "/tmp/pip-req-build-arnq_5jt/setup.py", line 179, in <module>
      setuptools.setup(
    File "/root/anaconda3/envs/ms/lib/python3.10/site-packages/setuptools/__init__.py", line 104, in setup
      return distutils.core.setup(**attrs)
    File "/root/anaconda3/envs/ms/lib/python3.10/site-packages/setuptools/_distutils/core.py", line 184, in setup
      return run_commands(dist)
    File "/root/anaconda3/envs/ms/lib/python3.10/site-packages/setuptools/_distutils/core.py", line 200, in run_commands
      dist.run_commands()
    File "/root/anaconda3/envs/ms/lib/python3.10/site-packages/setuptools/_distutils/dist.py", line 969, in run_commands
      self.run_command(cmd)
    File "/root/anaconda3/envs/ms/lib/python3.10/site-packages/setuptools/dist.py", line 967, in run_command
      super().run_command(command)
    File "/root/anaconda3/envs/ms/lib/python3.10/site-packages/setuptools/_distutils/dist.py", line 988, in run_command
      cmd_obj.run()
    File "/tmp/pip-req-build-arnq_5jt/setup.py", line 53, in run
      super().run()
    File "/root/anaconda3/envs/ms/lib/python3.10/site-packages/wheel/_bdist_wheel.py", line 387, in run
      self.run_command("build")
    File "/root/anaconda3/envs/ms/lib/python3.10/site-packages/setuptools/_distutils/cmd.py", line 316, in run_command
      self.distribution.run_command(command)
    File "/root/anaconda3/envs/ms/lib/python3.10/site-packages/setuptools/dist.py", line 967, in run_command
      super().run_command(command)
    File "/root/anaconda3/envs/ms/lib/python3.10/site-packages/setuptools/_distutils/dist.py", line 988, in run_command
      cmd_obj.run()
    File "/root/anaconda3/envs/ms/lib/python3.10/site-packages/setuptools/_distutils/command/build.py", line 132, in run
      self.run_command(cmd_name)
    File "/root/anaconda3/envs/ms/lib/python3.10/site-packages/setuptools/_distutils/cmd.py", line 316, in run_command
      self.distribution.run_command(command)
    File "/root/anaconda3/envs/ms/lib/python3.10/site-packages/setuptools/dist.py", line 967, in run_command
      super().run_command(command)
    File "/root/anaconda3/envs/ms/lib/python3.10/site-packages/setuptools/_distutils/dist.py", line 988, in run_command
      cmd_obj.run()
    File "/tmp/pip-req-build-arnq_5jt/build_tools/build_ext.py", line 119, in run
      ext._build_cmake(
    File "/tmp/pip-req-build-arnq_5jt/build_tools/build_ext.py", line 91, in _build_cmake
      raise RuntimeError(f"Error when running CMake: {e}")
  RuntimeError: Error when running CMake: Command '['/usr/bin/cmake', '--build', '/tmp/pip-req-build-arnq_5jt/build/cmake', '--verbose', '--parallel']' returned non-zero exit status 1.
  [end of output]

note: This error originates from a subprocess, and is likely not a problem with pip.
ERROR: Failed building wheel for transformer_engine
Running setup.py clean for transformer_engine
Failed to build transformer_engine
ERROR: Failed to build installable wheels for some pyproject.toml based projects (transformer_engine)
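
The excerpt above shows only `warning #177-D` diagnostics before `ninja: build stopped`, so the compiler error that actually aborted the build is probably further up in the full output. A minimal sketch for locating it, assuming the complete pip output was saved to a file; the helper `find_error_lines` is hypothetical and the sample error line is made up for illustration, not taken from the real log:

```python
import re

# Match "error" as a whole word, case-insensitively.
ERROR_RE = re.compile(r"\berror\b", re.IGNORECASE)


def find_error_lines(log_text):
    """Return (line_number, line) pairs that mention an error.

    Lines containing "warning #" (nvcc warning diagnostics) are skipped.
    """
    hits = []
    for number, line in enumerate(log_text.splitlines(), start=1):
        if "warning #" in line:
            continue
        if ERROR_RE.search(line):
            hits.append((number, line.strip()))
    return hits


if __name__ == "__main__":
    sample = "\n".join([
        'cast_kernels.cuh(930): warning #177-D: variable "input_shape" was declared but never referenced',
        # Made-up line for illustration only; the real error is not in the excerpt.
        'foo.cu(12): error: identifier "bar" is undefined',
        "ninja: build stopped: subcommand failed.",
    ])
    for number, line in find_error_lines(sample):
        print(number, line)
```

Running the same function over the saved log, for example `find_error_lines(open("build.log").read())` with `build.log` as an assumed file name, should point at the first real compiler error rather than the trailing CMake traceback.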

Latest version

Command used:

SITE_PACKAGES=$(python -c "import site; print(site.getsitepackages()[0])") && echo $SITE_PACKAGES && \
CUDNN_PATH=$SITE_PACKAGES/nvidia/cudnn CPLUS_INCLUDE_PATH=$SITE_PACKAGES/nvidia/cudnn/include \
pip install git+https://gitee.com/zhangtianhonggitee/TransformerEngine.git

  [44/45] /usr/local/cuda-12.4/bin/nvcc -forward-unknown-to-host-compiler -DNV_CUDNN_FRONTEND_USE_DYNAMIC_LOADING -Dtransformer_engine_EXPORTS -I/tmp/pip-req-build-216bf86l/transformer_engine/common/.. -I/tmp/pip-req-build-216bf86l/transformer_engine/common/include -I/usr/local/cuda-12.4/targets/x86_64-linux/include -I/tmp/pip-req-build-216bf86l/transformer_engine/common/../../3rdparty/cudnn-frontend/include -I/tmp/pip-req-build-216bf86l/build/cmake/string_headers -isystem=/usr/local/cuda-12.4/include -Wl,--version-script=/tmp/pip-req-build-216bf86l/transformer_engine/common/libtransformer_engine.version --expt-relaxed-constexpr -O3 --threads 1 -O3 -DNDEBUG --generate-code=arch=compute_70,code=[compute_70,sm_70] --generate-code=arch=compute_80,code=[compute_80,sm_80] --generate-code=arch=compute_89,code=[compute_89,sm_89] --generate-code=arch=compute_90,code=[compute_90,sm_90] -Xcompiler=-fPIC -std=c++17 -MD -MT CMakeFiles/transformer_engine.dir/transpose/cast_transpose_fusion.cu.o -MF CMakeFiles/transformer_engine.dir/transpose/cast_transpose_fusion.cu.o.d -x cu -c /tmp/pip-req-build-216bf86l/transformer_engine/common/transpose/cast_transpose_fusion.cu -o CMakeFiles/transformer_engine.dir/transpose/cast_transpose_fusion.cu.o
  ninja: build stopped: subcommand failed.
  Traceback (most recent call last):
    File "/tmp/pip-req-build-216bf86l/build_tools/build_ext.py", line 88, in _build_cmake
      subprocess.run(command, cwd=build_dir, check=True)
    File "/root/anaconda3/envs/ms/lib/python3.10/subprocess.py", line 526, in run
      raise CalledProcessError(retcode, process.args,
  subprocess.CalledProcessError: Command '['/usr/bin/cmake', '--build', '/tmp/pip-req-build-216bf86l/build/cmake', '--verbose', '--parallel']' returned non-zero exit status 1.

  During handling of the above exception, another exception occurred:

  Traceback (most recent call last):
    File "<string>", line 2, in <module>
    File "<pip-setuptools-caller>", line 35, in <module>
    File "/tmp/pip-req-build-216bf86l/setup.py", line 187, in <module>
      setuptools.setup(
    File "/root/anaconda3/envs/ms/lib/python3.10/site-packages/setuptools/__init__.py", line 104, in setup
      return distutils.core.setup(**attrs)
    File "/root/anaconda3/envs/ms/lib/python3.10/site-packages/setuptools/_distutils/core.py", line 184, in setup
      return run_commands(dist)
    File "/root/anaconda3/envs/ms/lib/python3.10/site-packages/setuptools/_distutils/core.py", line 200, in run_commands
      dist.run_commands()
    File "/root/anaconda3/envs/ms/lib/python3.10/site-packages/setuptools/_distutils/dist.py", line 969, in run_commands
      self.run_command(cmd)
    File "/root/anaconda3/envs/ms/lib/python3.10/site-packages/setuptools/dist.py", line 967, in run_command
      super().run_command(command)
    File "/root/anaconda3/envs/ms/lib/python3.10/site-packages/setuptools/_distutils/dist.py", line 988, in run_command
      cmd_obj.run()
    File "/tmp/pip-req-build-216bf86l/setup.py", line 51, in run
      super().run()
    File "/root/anaconda3/envs/ms/lib/python3.10/site-packages/wheel/_bdist_wheel.py", line 387, in run
      self.run_command("build")
    File "/root/anaconda3/envs/ms/lib/python3.10/site-packages/setuptools/_distutils/cmd.py", line 316, in run_command
      self.distribution.run_command(command)
    File "/root/anaconda3/envs/ms/lib/python3.10/site-packages/setuptools/dist.py", line 967, in run_command
      super().run_command(command)
    File "/root/anaconda3/envs/ms/lib/python3.10/site-packages/setuptools/_distutils/dist.py", line 988, in run_command
      cmd_obj.run()
    File "/root/anaconda3/envs/ms/lib/python3.10/site-packages/setuptools/_distutils/command/build.py", line 132, in run
      self.run_command(cmd_name)
    File "/root/anaconda3/envs/ms/lib/python3.10/site-packages/setuptools/_distutils/cmd.py", line 316, in run_command
      self.distribution.run_command(command)
    File "/root/anaconda3/envs/ms/lib/python3.10/site-packages/setuptools/dist.py", line 967, in run_command
      super().run_command(command)
    File "/root/anaconda3/envs/ms/lib/python3.10/site-packages/setuptools/_distutils/dist.py", line 988, in run_command
      cmd_obj.run()
    File "/tmp/pip-req-build-216bf86l/build_tools/build_ext.py", line 120, in run
      ext._build_cmake(
    File "/tmp/pip-req-build-216bf86l/build_tools/build_ext.py", line 90, in _build_cmake
      raise RuntimeError(f"Error when running CMake: {e}")
  RuntimeError: Error when running CMake: Command '['/usr/bin/cmake', '--build', '/tmp/pip-req-build-216bf86l/build/cmake', '--verbose', '--parallel']' returned non-zero exit status 1.
  [end of output]

note: This error originates from a subprocess, and is likely not a problem with pip.
ERROR: Failed building wheel for transformer_engine
Running setup.py clean for transformer_engine
Building wheel for nvdlfw-inspect (pyproject.toml) ... done
Created wheel for nvdlfw-inspect: filename=nvdlfw_inspect-0.1.0-py3-none-any.whl size=30813 sha256=e151bc54367e558b8ecd48e00b6fe23645dd5a18be9c4bea0af5101809f4ee62
Stored in directory: /tmp/pip-ephem-wheel-cache-9m88qhu1/wheels/6f/b1/55/1a653c8ad54c41e4081205176009cc4cfc7f06ffc781fa6d0a
Successfully built nvdlfw-inspect
Failed to build transformer_engine
ERROR: Failed to build installable wheels for some pyproject.toml based projects (transformer_engine)

zhangtianhong-1998 · Apr 30 '25 06:04

I recommend using the image; this package really is hard to install.

Jintao-Huang · Apr 30 '25 07:04

> I recommend using the image; this package really is hard to install.

That is a bit disheartening. Thanks.

zhangtianhong-1998 · Apr 30 '25 07:04

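> I recommend using the image; this package really is hard to install.

One question: the current image (0.8.3) does not seem to support Qwen3. Should I just upgrade? Also, I do not see any parameter configuration for DeepSpeed. With ZeRO-3, are the parameters and optimizer states offloaded to the CPU by default?

zhangtianhong-1998 · Apr 30 '25 08:04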

Just upgrade swift directly.

Jintao-Huang · May 01 '25 10:05

https://github.com/modelscope/ms-swift/tree/main/swift/llm/ds_config

Jintao-Huang · May 01 '25 10:05
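
On the ZeRO-3 offload question, one way to check is to open the JSON config actually being used and look for the offload sections. In DeepSpeed's configuration schema, CPU offload is requested through `offload_optimizer` and `offload_param` entries under `zero_optimization`. The sketch below uses a hypothetical helper `offload_devices` and an illustrative config that is not copied from the linked directory; it only reports what a given config requests.

```python
import json


def offload_devices(config):
    """Report the offload device requested for optimizer states and parameters.

    Returns "none" for a section (or device key) that is absent from the config.
    """
    zero = config.get("zero_optimization", {})
    return {
        key: zero.get(key, {}).get("device", "none")
        for key in ("offload_optimizer", "offload_param")
    }


if __name__ == "__main__":
    # Illustrative ZeRO-3 config with CPU offload; not taken from ms-swift.
    example = json.loads("""
    {
      "zero_optimization": {
        "stage": 3,
        "offload_optimizer": {"device": "cpu", "pin_memory": true},
        "offload_param": {"device": "cpu", "pin_memory": true}
      }
    }
    """)
    print(offload_devices(example))
    print(offload_devices({"zero_optimization": {"stage": 3}}))
```

Applying the same function to each file in the linked `ds_config` directory would show which of them, if any, request CPU offload.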

@zhangtianhong-1998 My installation failed too. What does "image" refer to here?

zhangansen · May 03 '25 05:05

> @zhangtianhong-1998 My installation failed too. What does "image" refer to here?

The Docker image.

zhangtianhong-1998 · May 03 '25 05:05