Build fails for Grace Hopper system

Open wahabk opened this issue 1 year ago • 6 comments

Hello, I'm using the main branch and trying to install on an Arm Grace Hopper system.

My guess is that the build is trying to compile the AMD backend and Proton on an NVIDIA system.

  1. Do you have any insight into the error?
  2. Is there a way to build only the NVIDIA backend?

I am happy to open a PR if you have an idea of how to fix this, perhaps via pip config options or an environment variable.

Error

[212/228] /usr/bin/c++ -DTRITON_BACKENDS_TUPLE="(nvidia,amd)" -Dtriton_EXPORTS -I/lus/lfs1aip1/home/benchmarking/akawafi1.benchmarking/code/IsambardMLOps/scripts/build_triton/triton/python/build/cmake.linux-aarch64-cpython-3.10 -I/lus/lfs1aip1/home/benchmarking/akawafi1.benchmarking/code/IsambardMLOps/scripts/build_triton/triton -I/lus/lfs1aip1/home/benchmarking/akawafi1.benchmarking/code/IsambardMLOps/scripts/build_triton/triton/include -I/lus/lfs1aip1/home/benchmarking/akawafi1.benchmarking/code/IsambardMLOps/scripts/build_triton/triton/. -I/home/benchmarking/akawafi1.benchmarking/.triton/llvm/llvm-56152fa3-ubuntu-arm64/include -I/lus/lfs1aip1/home/benchmarking/akawafi1.benchmarking/code/IsambardMLOps/scripts/build_triton/triton/python/build/cmake.linux-aarch64-cpython-3.10/include -I/lus/lfs1aip1/home/benchmarking/akawafi1.benchmarking/code/IsambardMLOps/scripts/build_triton/triton/third_party -I/lus/lfs1aip1/home/benchmarking/akawafi1.benchmarking/code/IsambardMLOps/scripts/build_triton/triton/python/build/cmake.linux-aarch64-cpython-3.10/third_party -I/lus/lfs1aip1/home/benchmarking/akawafi1.benchmarking/code/IsambardMLOps/scripts/build_triton/triton/python/src -I/home/benchmarking/akawafi1.benchmarking/miniforge3/envs/triton-env/include/python3.10 -I/local/user/1483800084/pip-build-env-6tulqnyc/overlay/lib/python3.10/site-packages/pybind11/include -I/lus/lfs1aip1/home/benchmarking/akawafi1.benchmarking/code/IsambardMLOps/scripts/build_triton/triton/third_party/amd/lib/TritonAMDGPUTransforms/../../include -I/lus/lfs1aip1/home/benchmarking/akawafi1.benchmarking/code/IsambardMLOps/scripts/build_triton/triton/python/build/cmake.linux-aarch64-cpython-3.10/third_party/amd/lib/TritonAMDGPUTransforms/../../include -D__STDC_FORMAT_MACROS  -fPIC -std=gnu++17 -Werror -Wno-covered-switch-default -fvisibility=hidden -O2 -g -std=gnu++1z -fPIC -MD -MT CMakeFiles/triton.dir/python/src/ir.cc.o -MF CMakeFiles/triton.dir/python/src/ir.cc.o.d -o 
CMakeFiles/triton.dir/python/src/ir.cc.o -c /lus/lfs1aip1/home/benchmarking/akawafi1.benchmarking/code/IsambardMLOps/scripts/build_triton/triton/python/src/ir.cc
  ninja: build stopped: subcommand failed.

  Traceback (most recent call last):
    File "/home/benchmarking/akawafi1.benchmarking/.local/lib/python3.10/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 353, in <module>
      main()
    File "/home/benchmarking/akawafi1.benchmarking/.local/lib/python3.10/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 335, in main
      json_out['return_val'] = hook(**hook_input['kwargs'])
    File "/home/benchmarking/akawafi1.benchmarking/.local/lib/python3.10/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 251, in build_wheel
      return _build_backend().build_wheel(wheel_directory, config_settings,
    File "/local/user/1483800084/pip-build-env-6tulqnyc/overlay/lib/python3.10/site-packages/setuptools/build_meta.py", line 421, in build_wheel
      return self._build_with_temp_dir(
    File "/local/user/1483800084/pip-build-env-6tulqnyc/overlay/lib/python3.10/site-packages/setuptools/build_meta.py", line 403, in _build_with_temp_dir
      self.run_setup()
    File "/local/user/1483800084/pip-build-env-6tulqnyc/overlay/lib/python3.10/site-packages/setuptools/build_meta.py", line 503, in run_setup
      super().run_setup(setup_script=setup_script)
    File "/local/user/1483800084/pip-build-env-6tulqnyc/overlay/lib/python3.10/site-packages/setuptools/build_meta.py", line 318, in run_setup
      exec(code, locals())
    File "<string>", line 624, in <module>
    File "/local/user/1483800084/pip-build-env-6tulqnyc/overlay/lib/python3.10/site-packages/setuptools/__init__.py", line 117, in setup
      return distutils.core.setup(**attrs)
    File "/local/user/1483800084/pip-build-env-6tulqnyc/overlay/lib/python3.10/site-packages/setuptools/_distutils/core.py", line 184, in setup
      return run_commands(dist)
    File "/local/user/1483800084/pip-build-env-6tulqnyc/overlay/lib/python3.10/site-packages/setuptools/_distutils/core.py", line 200, in run_commands
      dist.run_commands()
    File "/local/user/1483800084/pip-build-env-6tulqnyc/overlay/lib/python3.10/site-packages/setuptools/_distutils/dist.py", line 954, in run_commands
      self.run_command(cmd)
    File "/local/user/1483800084/pip-build-env-6tulqnyc/overlay/lib/python3.10/site-packages/setuptools/dist.py", line 950, in run_command
      super().run_command(command)
    File "/local/user/1483800084/pip-build-env-6tulqnyc/overlay/lib/python3.10/site-packages/setuptools/_distutils/dist.py", line 973, in run_command
      cmd_obj.run()
    File "<string>", line 578, in run
    File "/local/user/1483800084/pip-build-env-6tulqnyc/overlay/lib/python3.10/site-packages/wheel/_bdist_wheel.py", line 378, in run
      self.run_command("build")
    File "/local/user/1483800084/pip-build-env-6tulqnyc/overlay/lib/python3.10/site-packages/setuptools/_distutils/cmd.py", line 316, in run_command
      self.distribution.run_command(command)
    File "/local/user/1483800084/pip-build-env-6tulqnyc/overlay/lib/python3.10/site-packages/setuptools/dist.py", line 950, in run_command
      super().run_command(command)
    File "/local/user/1483800084/pip-build-env-6tulqnyc/overlay/lib/python3.10/site-packages/setuptools/_distutils/dist.py", line 973, in run_command
      cmd_obj.run()
    File "/local/user/1483800084/pip-build-env-6tulqnyc/overlay/lib/python3.10/site-packages/setuptools/_distutils/command/build.py", line 135, in run
      self.run_command(cmd_name)
    File "/local/user/1483800084/pip-build-env-6tulqnyc/overlay/lib/python3.10/site-packages/setuptools/_distutils/cmd.py", line 316, in run_command
      self.distribution.run_command(command)
    File "/local/user/1483800084/pip-build-env-6tulqnyc/overlay/lib/python3.10/site-packages/setuptools/dist.py", line 950, in run_command
      super().run_command(command)
    File "/local/user/1483800084/pip-build-env-6tulqnyc/overlay/lib/python3.10/site-packages/setuptools/_distutils/dist.py", line 973, in run_command
      cmd_obj.run()
    File "<string>", line 324, in run
    File "/local/user/1483800084/pip-build-env-6tulqnyc/overlay/lib/python3.10/site-packages/setuptools/_distutils/cmd.py", line 316, in run_command
      self.distribution.run_command(command)
    File "/local/user/1483800084/pip-build-env-6tulqnyc/overlay/lib/python3.10/site-packages/setuptools/dist.py", line 950, in run_command
      super().run_command(command)
    File "/local/user/1483800084/pip-build-env-6tulqnyc/overlay/lib/python3.10/site-packages/setuptools/_distutils/dist.py", line 973, in run_command
      cmd_obj.run()
    File "<string>", line 361, in run
    File "<string>", line 469, in build_extension
    File "/home/benchmarking/akawafi1.benchmarking/miniforge3/envs/triton-env/lib/python3.10/subprocess.py", line 369, in check_call
      raise CalledProcessError(retcode, cmd)
  subprocess.CalledProcessError: Command '['cmake', '--build', '.', '--config', 'TritonRelBuildWithAsserts', '-j576']' returned non-zero exit status 1.
  error: subprocess-exited-with-error
  
  × Building wheel for triton (pyproject.toml) did not run successfully.
  │ exit code: 1
  ╰─> See above for output.
  
  note: This error originates from a subprocess, and is likely not a problem with pip.
  full command: /home/benchmarking/akawafi1.benchmarking/miniforge3/envs/triton-env/bin/python3.10 /home/benchmarking/akawafi1.benchmarking/.local/lib/python3.10/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py build_wheel /local/user/1483800084/tmpfoiq7e3c
  cwd: /lus/lfs1aip1/home/benchmarking/akawafi1.benchmarking/code/IsambardMLOps/scripts/build_triton/triton/python
  Building wheel for triton (pyproject.toml): finished with status 'error'
  ERROR: Failed building wheel for triton
Failed to build triton
ERROR: ERROR: Failed to build installable wheels for some pyproject.toml based projects (triton)

There is also this error that shows up in an earlier step.

[102/228] /usr/bin/c++  -I/lus/lfs1aip1/home/benchmarking/akawafi1.benchmarking/code/IsambardMLOps/scripts/build_triton/triton/python/build/cmake.linux-aarch64-cpython-3.10/third_party/amd/unittest/Conversion -I/lus/lfs1aip1/home/benchmarking/akawafi1.benchmarking/code/IsambardMLOps/scripts/build_triton/triton/third_party/amd/unittest/Conversion -I/lus/lfs1aip1/home/benchmarking/akawafi1.benchmarking/code/IsambardMLOps/scripts/build_triton/triton/include -I/lus/lfs1aip1/home/benchmarking/akawafi1.benchmarking/code/IsambardMLOps/scripts/build_triton/triton/. -I/home/benchmarking/akawafi1.benchmarking/.triton/llvm/llvm-56152fa3-ubuntu-arm64/include -I/lus/lfs1aip1/home/benchmarking/akawafi1.benchmarking/code/IsambardMLOps/scripts/build_triton/triton/python/build/cmake.linux-aarch64-cpython-3.10/include -I/lus/lfs1aip1/home/benchmarking/akawafi1.benchmarking/code/IsambardMLOps/scripts/build_triton/triton/third_party -I/lus/lfs1aip1/home/benchmarking/akawafi1.benchmarking/code/IsambardMLOps/scripts/build_triton/triton/python/build/cmake.linux-aarch64-cpython-3.10/third_party -I/lus/lfs1aip1/home/benchmarking/akawafi1.benchmarking/code/IsambardMLOps/scripts/build_triton/triton/python/src -I/home/benchmarking/akawafi1.benchmarking/miniforge3/envs/triton-env/include/python3.10 -I/local/user/1483800084/pip-build-env-6tulqnyc/overlay/lib/python3.10/site-packages/pybind11/include -I/lus/lfs1aip1/home/benchmarking/akawafi1.benchmarking/code/IsambardMLOps/scripts/build_triton/triton/third_party/amd/include -I/lus/lfs1aip1/home/benchmarking/akawafi1.benchmarking/code/IsambardMLOps/scripts/build_triton/triton/python/build/cmake.linux-aarch64-cpython-3.10/third_party/amd/include -I/lus/lfs1aip1/home/benchmarking/akawafi1.benchmarking/code/IsambardMLOps/scripts/build_triton/triton/third_party/amd/lib/TritonAMDGPUTransforms/../../include 
-I/lus/lfs1aip1/home/benchmarking/akawafi1.benchmarking/code/IsambardMLOps/scripts/build_triton/triton/python/build/cmake.linux-aarch64-cpython-3.10/third_party/amd/lib/TritonAMDGPUTransforms/../../include -isystem /lus/lfs1aip1/home/benchmarking/akawafi1.benchmarking/code/IsambardMLOps/scripts/build_triton/triton/python/build/cmake.linux-aarch64-cpython-3.10/_deps/googletest-src/googletest/include -isystem /lus/lfs1aip1/home/benchmarking/akawafi1.benchmarking/code/IsambardMLOps/scripts/build_triton/triton/python/build/cmake.linux-aarch64-cpython-3.10/_deps/googletest-src/googletest -isystem /lus/lfs1aip1/home/benchmarking/akawafi1.benchmarking/code/IsambardMLOps/scripts/build_triton/triton/python/build/cmake.linux-aarch64-cpython-3.10/_deps/googletest-src/googlemock/include -isystem /lus/lfs1aip1/home/benchmarking/akawafi1.benchmarking/code/IsambardMLOps/scripts/build_triton/triton/python/build/cmake.linux-aarch64-cpython-3.10/_deps/googletest-src/googlemock -D__STDC_FORMAT_MACROS  -fPIC -std=gnu++17 -Werror -Wno-covered-switch-default -fvisibility=hidden -O2 -g -std=gnu++1z -fno-rtti -MD -MT third_party/amd/unittest/Conversion/CMakeFiles/TestOptimizeLDS.dir/OptimizeLDSTest.cpp.o -MF third_party/amd/unittest/Conversion/CMakeFiles/TestOptimizeLDS.dir/OptimizeLDSTest.cpp.o.d -o third_party/amd/unittest/Conversion/CMakeFiles/TestOptimizeLDS.dir/OptimizeLDSTest.cpp.o -c /lus/lfs1aip1/home/benchmarking/akawafi1.benchmarking/code/IsambardMLOps/scripts/build_triton/triton/third_party/amd/unittest/Conversion/OptimizeLDSTest.cpp
  FAILED: third_party/amd/unittest/Conversion/CMakeFiles/TestOptimizeLDS.dir/OptimizeLDSTest.cpp.o
  /usr/bin/c++  -I/lus/lfs1aip1/home/benchmarking/akawafi1.benchmarking/code/IsambardMLOps/scripts/build_triton/triton/python/build/cmake.linux-aarch64-cpython-3.10/third_party/amd/unittest/Conversion -I/lus/lfs1aip1/home/benchmarking/akawafi1.benchmarking/code/IsambardMLOps/scripts/build_triton/triton/third_party/amd/unittest/Conversion -I/lus/lfs1aip1/home/benchmarking/akawafi1.benchmarking/code/IsambardMLOps/scripts/build_triton/triton/include -I/lus/lfs1aip1/home/benchmarking/akawafi1.benchmarking/code/IsambardMLOps/scripts/build_triton/triton/. -I/home/benchmarking/akawafi1.benchmarking/.triton/llvm/llvm-56152fa3-ubuntu-arm64/include -I/lus/lfs1aip1/home/benchmarking/akawafi1.benchmarking/code/IsambardMLOps/scripts/build_triton/triton/python/build/cmake.linux-aarch64-cpython-3.10/include -I/lus/lfs1aip1/home/benchmarking/akawafi1.benchmarking/code/IsambardMLOps/scripts/build_triton/triton/third_party -I/lus/lfs1aip1/home/benchmarking/akawafi1.benchmarking/code/IsambardMLOps/scripts/build_triton/triton/python/build/cmake.linux-aarch64-cpython-3.10/third_party -I/lus/lfs1aip1/home/benchmarking/akawafi1.benchmarking/code/IsambardMLOps/scripts/build_triton/triton/python/src -I/home/benchmarking/akawafi1.benchmarking/miniforge3/envs/triton-env/include/python3.10 -I/local/user/1483800084/pip-build-env-6tulqnyc/overlay/lib/python3.10/site-packages/pybind11/include -I/lus/lfs1aip1/home/benchmarking/akawafi1.benchmarking/code/IsambardMLOps/scripts/build_triton/triton/third_party/amd/include -I/lus/lfs1aip1/home/benchmarking/akawafi1.benchmarking/code/IsambardMLOps/scripts/build_triton/triton/python/build/cmake.linux-aarch64-cpython-3.10/third_party/amd/include -I/lus/lfs1aip1/home/benchmarking/akawafi1.benchmarking/code/IsambardMLOps/scripts/build_triton/triton/third_party/amd/lib/TritonAMDGPUTransforms/../../include 
-I/lus/lfs1aip1/home/benchmarking/akawafi1.benchmarking/code/IsambardMLOps/scripts/build_triton/triton/python/build/cmake.linux-aarch64-cpython-3.10/third_party/amd/lib/TritonAMDGPUTransforms/../../include -isystem /lus/lfs1aip1/home/benchmarking/akawafi1.benchmarking/code/IsambardMLOps/scripts/build_triton/triton/python/build/cmake.linux-aarch64-cpython-3.10/_deps/googletest-src/googletest/include -isystem /lus/lfs1aip1/home/benchmarking/akawafi1.benchmarking/code/IsambardMLOps/scripts/build_triton/triton/python/build/cmake.linux-aarch64-cpython-3.10/_deps/googletest-src/googletest -isystem /lus/lfs1aip1/home/benchmarking/akawafi1.benchmarking/code/IsambardMLOps/scripts/build_triton/triton/python/build/cmake.linux-aarch64-cpython-3.10/_deps/googletest-src/googlemock/include -isystem /lus/lfs1aip1/home/benchmarking/akawafi1.benchmarking/code/IsambardMLOps/scripts/build_triton/triton/python/build/cmake.linux-aarch64-cpython-3.10/_deps/googletest-src/googlemock -D__STDC_FORMAT_MACROS  -fPIC -std=gnu++17 -Werror -Wno-covered-switch-default -fvisibility=hidden -O2 -g -std=gnu++1z -fno-rtti -MD -MT third_party/amd/unittest/Conversion/CMakeFiles/TestOptimizeLDS.dir/OptimizeLDSTest.cpp.o -MF third_party/amd/unittest/Conversion/CMakeFiles/TestOptimizeLDS.dir/OptimizeLDSTest.cpp.o.d -o third_party/amd/unittest/Conversion/CMakeFiles/TestOptimizeLDS.dir/OptimizeLDSTest.cpp.o -c /lus/lfs1aip1/home/benchmarking/akawafi1.benchmarking/code/IsambardMLOps/scripts/build_triton/triton/third_party/amd/unittest/Conversion/OptimizeLDSTest.cpp
  /lus/lfs1aip1/home/benchmarking/akawafi1.benchmarking/code/IsambardMLOps/scripts/build_triton/triton/third_party/amd/unittest/Conversion/OptimizeLDSTest.cpp: In function ‘bool mlir::checkProdEq(llvm::ArrayRef<unsigned int>)’:
  /lus/lfs1aip1/home/benchmarking/akawafi1.benchmarking/code/IsambardMLOps/scripts/build_triton/triton/third_party/amd/unittest/Conversion/OptimizeLDSTest.cpp:11:12: error: ‘reduce’ is not a member of ‘std’
         std::reduce(a.begin(), a.end(), 1u, std::multiplies<unsigned>());
              ^~~~~~
  /lus/lfs1aip1/home/benchmarking/akawafi1.benchmarking/code/IsambardMLOps/scripts/build_triton/triton/third_party/amd/unittest/Conversion/OptimizeLDSTest.cpp:11:12: note: suggested alternative: ‘replace’
         std::reduce(a.begin(), a.end(), 1u, std::multiplies<unsigned>());
              ^~~~~~
              replace
  At global scope:
  cc1plus: error: unrecognized command line option ‘-Wno-covered-switch-default’ [-Werror]
  cc1plus: all warnings being treated as errors

To Reproduce

conda create -n "triton-env" python=3.10 -y
conda activate triton-env
git clone https://github.com/triton-lang/triton.git
cd triton
pip install -v python/ >& ../pip_output.log

System Specification

$ uname -a
Linux nid001040 5.14.21-150500.55.31_13.0.53-cray_shasta_c_64k #1 SMP Mon Dec 4 22:56:47 UTC 2023 (03d3f83) aarch64 aarch64 aarch64 GNU/Linux

wahabk avatar Sep 04 '24 14:09 wahabk

I'm not able to reproduce. Might have to do with your local environment.

Jokeren avatar Sep 04 '24 14:09 Jokeren

Thanks for looking, @Jokeren. Do you have any AMD dependencies installed? How can I change -DTRITON_BACKENDS_TUPLE="(nvidia,amd)" so that HIP is not built?

Let me know if there is a way for me to debug my local environment, e.g. could it be my nvcc version?

wahabk avatar Sep 04 '24 15:09 wahabk

So the tag v0.4 works. Every other version since has failed for me. I've tested:

  • v0.4
  • v1.0
  • v1.1
  • v1.1.1
  • v1.1.2
  • v2.0
  • v2.1

wahabk avatar Sep 04 '24 15:09 wahabk

std::reduce requires C++17. Perhaps your toolchain is too old or is being called with the wrong flags?
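
A minimal probe for this, as a sketch rather than anything taken from the Triton sources: `std::reduce` lives in `<numeric>`, and with GCC's libstdc++ it is, to my knowledge, only available from GCC 9.1 onward, so an older system `c++` can reject it even though the log shows `-std=gnu++17` being passed. The helper name `product` is hypothetical; it only mirrors the call shape of the line the compiler rejected.

```cpp
// Probe: does this toolchain provide C++17 std::reduce?
// product() is a hypothetical helper, not part of Triton.
#include <functional>  // std::multiplies
#include <numeric>     // std::reduce (C++17)
#include <vector>

unsigned product(const std::vector<unsigned>& values) {
  // Same call shape as the failing line in OptimizeLDSTest.cpp:
  // multiply all elements together, starting from 1u.
  return std::reduce(values.begin(), values.end(), 1u,
                     std::multiplies<unsigned>());
}
```

Compiling just this file with the same compiler the build picked up (for example `/usr/bin/c++ -std=gnu++17 -c probe.cpp`, where `probe.cpp` is an assumed file name) should reproduce the `‘reduce’ is not a member of ‘std’` error if the toolchain is the cause.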

peterbell10 avatar Sep 04 '24 17:09 peterbell10

Hello @peterbell10, thanks for your reply. Using a more modern G++ got me further in the build process. However, I was not able to finish the build, since there is a hard-coded dependency on the ubuntu-18.04-x86_64 LLVM unless you're on apple-darwin.

This forked build by @acollins3 worked for me on GH200 and SLES Linux:

pip install https://github.com/acollins3/triton/releases/download/triton-2.1.0-arm64/triton-2.1.0-cp310-cp310-linux_aarch64.whl

I'm currently going through a manual conda-build from the conda-forge feedstock, which is full of build patches. This also doesn't support aarch64 on Linux.

Is there any plan to support building wheels for Linux aarch64? Support seems pretty loose at the moment and depends on forks and patches.

wahabk avatar Sep 06 '24 08:09 wahabk

As I've mentioned before, Linux aarch64 worked well for us.

Jokeren avatar Sep 06 '24 11:09 Jokeren

I think the issue is that it is using /usr/bin/c++ instead of whatever is set in CXX.
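
One way to check which compiler is actually in play is to build a tiny file with both `/usr/bin/c++` and `$CXX` and compare what each reports. The sketch below uses only predefined compiler macros; the function names `compilerId` and `languageStandard` are hypothetical, chosen for illustration.

```cpp
// Report which compiler and language standard built this translation unit.
// compilerId() and languageStandard() are hypothetical helper names.
#include <string>

std::string compilerId() {
#if defined(__clang__)
  // Clang also defines __GNUC__, so it has to be tested first.
  return "clang " + std::to_string(__clang_major__) + "." +
         std::to_string(__clang_minor__);
#elif defined(__GNUC__)
  return "gcc " + std::to_string(__GNUC__) + "." +
         std::to_string(__GNUC_MINOR__);
#else
  return "unknown";
#endif
}

// Value of __cplusplus, e.g. 201703L when compiling as C++17.
long languageStandard() { return __cplusplus; }
```

If the two compilers report different versions, and the older one matches `/usr/bin/c++`, that would support the idea that the build is ignoring `CXX`.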

MadhumitaSushil avatar Oct 04 '24 03:10 MadhumitaSushil

Thanks both. We had a weird requirement for an older version of triton. We were able to install by manually building LLVM.

wahabk avatar Oct 07 '24 09:10 wahabk