rocm_sdk_builder

was there a bad update to onnxruntime?

Open HarshedCodes opened this issue 9 months ago • 4 comments

Hi there, I'm having some problems. I built this project a couple of weeks ago and everything went great; I even compiled https://github.com/likelovewant/ollama-for-amd against it and loved the results on my 6900HX (gfx1035). Then I did an OS reinstall. When I downloaded the repo again, I could not get it to build past the onnxruntime package, 040. At first it didn't have TARGET_GPUS set, so I ran babs with the -c flag a couple of times, but no dice. Eventually I set it manually. That got a little farther, but triggers this error:

[ 16%] Building CXX object _deps/googletest-build/googletest/CMakeFiles/gtest.dir/src/gtest-all.cc.o
In file included from /home/brock/rocm_sdk_builder/src_projects/onnxruntime/build/Linux/Release/_deps/googletest-src/googletest/include/gtest/gtest-assertion-result.h:46,
                 from /home/brock/rocm_sdk_builder/src_projects/onnxruntime/build/Linux/Release/_deps/googletest-src/googletest/include/gtest/gtest.h:63,
                 from /home/brock/rocm_sdk_builder/src_projects/onnxruntime/build/Linux/Release/_deps/googletest-src/googletest/src/gtest-all.cc:38:
/home/brock/rocm_sdk_builder/src_projects/onnxruntime/build/Linux/Release/_deps/googletest-src/googletest/include/gtest/gtest-message.h:62:10: fatal error: absl/strings/has_absl_stringify.h: No such file or directory
   62 | #include "absl/strings/has_absl_stringify.h"
      |          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
compilation terminated.
gmake[2]: *** [_deps/googletest-build/googletest/CMakeFiles/gtest.dir/build.make:76: _deps/googletest-build/googletest/CMakeFiles/gtest.dir/src/gtest-all.cc.o] Error 1
gmake[1]: *** [CMakeFiles/Makefile2:4470: _deps/googletest-build/googletest/CMakeFiles/gtest.dir/all] Error 2
gmake[1]: *** Waiting for unfinished jobs....

before eventually spiraling down to the same issues it had with no GPU target set: dozens or hundreds of messages like:

[ 24%] Building CXX object _deps/protobuf-build/CMakeFiles/libprotoc.dir/src/google/protobuf/compiler/java/primitive_field.cc.o
warning: /home/brock/rocm_sdk_builder/src_projects/onnxruntime/onnxruntime/core/providers/cuda/math/unary_elementwise_ops_impl.cu:210: unsupported identifier "__NV_SATFINITE"
warning: /home/brock/rocm_sdk_builder/src_projects/onnxruntime/onnxruntime/core/providers/cuda/math/unary_elementwise_ops_impl.cu:210: unsupported device function "__nv_cvt_halfraw_to_fp8": return T(static_cast(__nv_cvt_halfraw_to_fp8(v, __NV_SATFINITE, NVT)), T::FromBits());
warning: /home/brock/rocm_sdk_builder/src_projects/onnxruntime/onnxruntime/core/providers/cuda/math/unary_elementwise_ops_impl.cu:216: unsupported identifier "__NV_NOSAT"
warning: /home/brock/rocm_sdk_builder/src_projects/onnxruntime/onnxruntime/core/providers/cuda/math/unary_elementwise_ops_impl.cu:216: unsupported device function "__nv_cvt_halfraw_to_fp8": return T(static_cast(__nv_cvt_halfraw_to_fp8(v, __NV_NOSAT, NVT)), T::FromBits());
warning: /home/brock/rocm_sdk_builder/src_projects/onnxruntime/onnxruntime/core/providers/cuda/math/unary_elementwise_ops_impl.cu:222: unsupported identifier "__NV_SATFINITE"
warning: /home/brock/rocm_sdk_builder/src_projects/onnxruntime/onnxruntime/core/providers/cuda/math/unary_elementwise_ops_impl.cu:222: unsupported device function "__nv_cvt_float_to_fp8": return T(static_cast(__nv_cvt_float_to_fp8(v, __NV_SATFINITE, NVT)), T::FromBits());
warning: /home/brock/rocm_sdk_builder/src_projects/onnxruntime/onnxruntime/core/providers/cuda/math/unary_elementwise_ops_impl.cu:228: unsupported identifier "__NV_NOSAT"
warning: /home/brock/rocm_sdk_builder/src_projects/onnxruntime/onnxruntime/core/providers/cuda/math/unary_elementwise_ops_impl.cu:228: unsupported device function "__nv_cvt_float_to_fp8": return T(static_cast(__nv_cvt_float_to_fp8(v, __NV_NOSAT, NVT)), T::FromBits());
[ 24%] Building CXX object _deps/protobuf-build/CMakeFiles/libprotoc.dir/src/google/protobuf/compiler/java/primitive_field_lite.cc.o
warning: /home/brock/rocm_sdk_builder/src_projects/onnxruntime/onnxruntime/core/providers/cuda/math/unary_elementwise_ops_impl.cu:264: unsupported identifier "__NV_E4M3"
warning: /home/brock/rocm_sdk_builder/src_projects/onnxruntime/onnxruntime/core/providers/cuda/math/unary_elementwise_ops_impl.cu:265: unsupported identifier "__NV_E5M2"

[ 25%] Building HIP object _deps/composable_kernel-build/library/src/tensor_operation_instance/gpu/gemm_splitk/CMakeFiles/device_gemm_splitk_instance.dir/device_gemm_xdl_splitk_f16_f16_f16_comp_fp8_km_kn_mn_instance.cpp.o
warning: /home/brock/rocm_sdk_builder/src_projects/onnxruntime/onnxruntime/core/providers/cuda/tensor/cast_op.cu:31: unsupported identifier "__NV_E4M3"
warning: /home/brock/rocm_sdk_builder/src_projects/onnxruntime/onnxruntime/core/providers/cuda/tensor/cast_op.cu:31: unsupported device function "__nv_cvt_fp8_to_halfraw": return __half2float(__nv_cvt_fp8_to_halfraw(v.val, __NV_E4M3));
warning: /home/brock/rocm_sdk_builder/src_projects/onnxruntime/onnxruntime/core/providers/cuda/tensor/cast_op.cu:38: unsupported identifier "__NV_E4M3"
warning: /home/brock/rocm_sdk_builder/src_projects/onnxruntime/onnxruntime/core/providers/cuda/tensor/cast_op.cu:38: unsupported device function "__nv_cvt_fp8_to_halfraw": return __nv_cvt_fp8_to_halfraw(v.val, __NV_E4M3);
warning: /home/brock/rocm_sdk_builder/src_projects/onnxruntime/onnxruntime/core/providers/cuda/tensor/cast_op.cu:45: unsupported identifier "__NV_E5M2"
warning: /home/brock/rocm_sdk_builder/src_projects/onnxruntime/onnxruntime/core/providers/cuda/tensor/cast_op.cu:45: unsupported device function "__nv_cvt_fp8_to_halfraw": return __half2float(__nv_cvt_fp8_to_halfraw(v.val, __NV_E5M2));
warning: /home/brock/rocm_sdk_builder/src_projects/onnxruntime/onnxruntime/core/providers/cuda/tensor/cast_op.cu:52: unsupported identifier "__NV_E5M2"
warning: /home/brock/rocm_sdk_builder/src_projects/onnxruntime/onnxruntime/core/providers/cuda/tensor/cast_op.cu:52: unsupported device function "__nv_cvt_fp8_to_halfraw": return __nv_cvt_fp8_to_halfraw(v.val, __NV_E5M2);
warning: /home/brock/rocm_sdk_builder/src_projects/onnxruntime/onnxruntime/core/providers/cuda/tensor/cast_op.cu:59: unsupported identifier "__NV_SATFINITE"
warning: /home/brock/rocm_sdk_builder/src_projects/onnxruntime/onnxruntime/core/providers/cuda/tensor/cast_op.cu:59: unsupported identifier "__NV_NOSAT"
warning: /home/brock/rocm_sdk_builder/src_projects/onnxruntime/onnxruntime/core/providers/cuda/tensor/cast_op.cu:59: unsupported identifier "__NV_E4M3"
warning: /home/brock/rocm_sdk_builder/src_projects/onnxruntime/onnxruntime/core/providers/cuda/tensor/cast_op.cu:59: unsupported device function "__nv_cvt_float_to_fp8": return Float8E4M3FN(static_cast(__nv_cvt_float_to_fp8(v, saturate ? __NV_SATFINITE : __NV_NOSAT, __NV_E4M3)), Float8E4M3FN::FromBits());
warning: /home/brock/rocm_sdk_builder/src_projects/onnxruntime/onnxruntime/core/providers/cuda/tensor/cast_op.cu:66: unsupported identifier "__NV_SATFINITE"
warning: /home/brock/rocm_sdk_builder/src_projects/onnxruntime/onnxruntime/core/providers/cuda/tensor/cast_op.cu:66: unsupported identifier "__NV_NOSAT"
warning: /home/brock/rocm_sdk_builder/src_projects/onnxruntime/onnxruntime/core/providers/cuda/tensor/cast_op.cu:66: unsupported identifier "__NV_E4M3"
warning: /home/brock/rocm_sdk_builder/src_projects/onnxruntime/onnxruntime/core/providers/cuda/tensor/cast_op.cu:66: unsupported device function "__nv_cvt_halfraw_to_fp8": return Float8E4M3FN(static_cast(__nv_cvt_halfraw_to_fp8(v, saturate ? __NV_SATFINITE : __NV_NOSAT, __NV_E4M3)), Float8E4M3FN::FromBits());
warning: /home/brock/rocm_sdk_builder/src_projects/onnxruntime/onnxruntime/core/providers/cuda/tensor/cast_op.cu:73: unsupported identifier "__NV_SATFINITE"
warning: /home/brock/rocm_sdk_builder/src_projects/onnxruntime/onnxruntime/core/providers/cuda/tensor/cast_op.cu:73: unsupported identifier "__NV_NOSAT"
warning: /home/brock/rocm_sdk_builder/src_projects/onnxruntime/onnxruntime/core/providers/cuda/tensor/cast_op.cu:73: unsupported identifier "__NV_E4M3"
warning: /home/brock/rocm_sdk_builder/src_projects/onnxruntime/onnxruntime/core/providers/cuda/tensor/cast_op.cu:73: unsupported device function "__nv_cvt_float_to_fp8": return Float8E5M2(static_cast(__nv_cvt_float_to_fp8(v, saturate ? __NV_SATFINITE : __NV_NOSAT, __NV_E4M3)), Float8E5M2::FromBits());
warning: /home/brock/rocm_sdk_builder/src_projects/onnxruntime/onnxruntime/core/providers/cuda/tensor/cast_op.cu:80: unsupported identifier "__NV_SATFINITE"
warning: /home/brock/rocm_sdk_builder/src_projects/onnxruntime/onnxruntime/core/providers/cuda/tensor/cast_op.cu:80: unsupported identifier "__NV_NOSAT"
warning: /home/brock/rocm_sdk_builder/src_projects/onnxruntime/onnxruntime/core/providers/cuda/tensor/cast_op.cu:80: unsupported identifier "__NV_E4M3"
warning: /home/brock/rocm_sdk_builder/src_projects/onnxruntime/onnxruntime/core/providers/cuda/tensor/cast_op.cu:80: unsupported device function "__nv_cvt_halfraw_to_fp8": return Float8E5M2(static_cast(__nv_cvt_halfraw_to_fp8(v, saturate ? __NV_SATFINITE : __NV_NOSAT, __NV_E4M3)), Float8E5M2::FromBits());

[ 30%] Hipify: onnxruntime/contrib_ops/cuda/bert/flash_attention/utils.h -> amdgpu/onnxruntime/contrib_ops/rocm/bert/flash_attention/utils.h
warning: /home/brock/rocm_sdk_builder/src_projects/onnxruntime/onnxruntime/contrib_ops/cuda/bert/flash_attention/utils.h:125: unsupported device function "__shfl_xor_sync": x = op(x, __shfl_xor_sync(uint32_t(-1), x, OFFSET));
warning: /home/brock/rocm_sdk_builder/src_projects/onnxruntime/onnxruntime/contrib_ops/cuda/bert/flash_attention/utils.h:136: unsupported device function "__shfl_xor_sync": x = op(x, __shfl_xor_sync(uint32_t(-1), x, 1));

and failing with:

[ 41%] Built target device_gemm_instance
gmake: *** [Makefile:146: all] Error 2
Traceback (most recent call last):
  File "/home/brock/rocm_sdk_builder/src_projects/onnxruntime/tools/ci_build/build.py", line 2955, in <module>
    sys.exit(main())
             ^^^^^^
  File "/home/brock/rocm_sdk_builder/src_projects/onnxruntime/tools/ci_build/build.py", line 2847, in main
    build_targets(args, cmake_path, build_dir, configs, num_parallel_jobs, args.target)
  File "/home/brock/rocm_sdk_builder/src_projects/onnxruntime/tools/ci_build/build.py", line 1736, in build_targets
    run_subprocess(cmd_args, env=env)
  File "/home/brock/rocm_sdk_builder/src_projects/onnxruntime/tools/ci_build/build.py", line 861, in run_subprocess
    return run(*args, cwd=cwd, capture_stdout=capture_stdout, shell=shell, env=my_env)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/brock/rocm_sdk_builder/src_projects/onnxruntime/tools/python/util/run.py", line 49, in run
    completed_process = subprocess.run(
                        ^^^^^^^^^^^^^^^
  File "/opt/rocm_sdk_612/lib/python3.11/subprocess.py", line 571, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['/usr/bin/cmake', '--build', '/home/brock/rocm_sdk_builder/src_projects/onnxruntime/build/Linux/Release', '--config', 'Release', '--', '-j16']' returned non-zero exit status 2.
build failed: onnxruntime
  error in build cmd: ./build_rocm.sh /opt/rocm_sdk_612 gfx1035
brock@homelab:~/rocm_sdk_builder$
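
Most of the output above consists of hipify "warning:" lines rather than compiler errors, so the first real failure (the missing absl header) is easy to lose in the noise. A minimal sketch of a filter for that, assuming the build output has been saved as text; the function name `first_errors` and the shortened sample paths are illustrative and not part of rocm_sdk_builder or onnxruntime:

```python
import re

# Matches GCC/Clang style diagnostics such as
#   path/file.h:62:10: fatal error: message
#   path/file.cc:150:23: error: message
ERROR_RE = re.compile(
    r"^(?P<file>[^\s:]+):(?P<line>\d+):(?:\d+:)? "
    r"(?P<kind>fatal error|error): (?P<msg>.*)$"
)


def first_errors(log_lines, limit=5):
    """Return up to `limit` (file, line, kind, message) tuples for real errors.

    Lines that start with "warning:" (the hipify output) and gmake
    status lines do not match the pattern and are skipped.
    """
    found = []
    for raw in log_lines:
        match = ERROR_RE.match(raw.strip())
        if match:
            found.append(
                (match["file"], int(match["line"]), match["kind"], match["msg"])
            )
            if len(found) >= limit:
                break
    return found


# Hypothetical, shortened sample in the shape of the log above.
sample_log = [
    'warning: /src/cast_op.cu:31: unsupported identifier "__NV_E4M3"',
    "/src/gtest-message.h:62:10: fatal error: "
    "absl/strings/has_absl_stringify.h: No such file or directory",
    "gmake[2]: *** [build.make:76: gtest-all.cc.o] Error 1",
]

for entry in first_errors(sample_log):
    print(entry)
```

In practice the list would come from reading a saved log file line by line instead of `sample_log`.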

HarshedCodes, Mar 14 '25 19:03

Onnxruntime and deepspeed are the last two packages in the list of "core" apps built by default, but things like pytorch should already be working.

One thing you could also try is to remove the onnxruntime folder and its build directory and then rebuild onnxruntime. This way we could verify that there are no old pytorch build files causing your problem.

$ rm -rf builddir/040_02_onnxruntime_deepspeed src_projects/onnxruntime/
$ ./babs.sh -b

If this does not work, can you check the following:

  1. Are examples in /opt/rocm_sdk_612/docs/examples/pytorch/ working for you?
  2. When you rebuild everything, did you also remove /opt/rocm_sdk_612 which contained the old build?
  3. Hmm, could it be possible that you built the rest of the system by accident using the wrong GPU as a target?

lamikr, Mar 15 '25 18:03

OK, so after doing a full wipe, pulling down the repo again, and building everything: same issue. Let me go down your list, and then share some specifics.

  1. Yes. They work fine. The pytorch benchmark shows 27 seconds on CPU, 0.4 seconds on GPU.

  2. Yes. Nuked it from orbit, along with any files in home.

  3. No. I've now built (or tried to) 3 times trying to resolve this.

Here's what I'm seeing. At the start of the onnx package, it correctly identifies gfx1035:

/home/brock/rocm_sdk_builder/builddir/040_01_onnxruntime_rocm_training
[89] Post-configuration: onnxruntime
no post-configuration commands
post-configuration ok: onnxruntime

/home/brock/rocm_sdk_builder/builddir/040_01_onnxruntime_rocm_training
[89] Building: onnxruntime
[0] onnxruntime, build command:
cd /home/brock/rocm_sdk_builder/src_projects/onnxruntime
[1] onnxruntime, build command:
./build_rocm.sh /opt/rocm_sdk_612 gfx1035
using rocm_root_directory specified: /opt/rocm_sdk_612
Using specified amd rocm gpu: gfx1035
Linux distributions cmake version ok
    3.28.3 >= 3.26.1
Linux distributions cmake version ok
    3.28.3 >= 3.26.1
2025-03-15 20:30:41,792 build [DEBUG] - Command line arguments:
  --build_dir /home/brock/rocm_sdk_builder/src_projects/onnxruntime/build/Linux --allow_running_as_root --config Release --enable_training --build_wheel --parallel --skip_tests --build_shared_lib --use_rocm --rocm_home /opt/rocm_sdk_612 --use_migraphx --migraphx_home /opt/rocm_sdk_612 --cmake_extra_defines CMAKE_HIP_COMPILER=/opt/rocm_sdk_612/bin/clang++ CMAKE_INSTALL_PREFIX=/opt/rocm_sdk_612 'CMAKE_HIP_ARCHITECTURES=gfx1035

The first error it runs into is this:

-- The CXX compiler identification is GNU 13.3.0
-- The ASM compiler identification is GNU
-- Found assembler: /usr/bin/cc
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: /usr/bin/cc - skipped
-- Detecting C compile features
-- Detecting C compile features - done
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /usr/bin/c++ - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
CMake Warning (dev) at CMakeLists.txt:55 (include):
  Policy CMP0145 is not set: The Dart and FindDart modules are removed.  Run
  "cmake --help-policy CMP0145" for policy details.  Use the cmake_policy
  command to set the policy and suppress this warning.

This warning is for project developers.  Use -Wno-dev to suppress it.

CMake Warning (dev) at /usr/share/cmake-3.28/Modules/Dart.cmake:47 (message):
  Policy CMP0145 is not set: The Dart and FindDart modules are removed.  Run
  "cmake --help-policy CMP0145" for policy details.  Use the cmake_policy
  command to set the policy and suppress this warning.
Call Stack (most recent call first):
  CMakeLists.txt:55 (include)
This warning is for project developers.  Use -Wno-dev to suppress it.

-- The HIP compiler identification is Clang 17.0.0

Followed shortly by this, where the target is still identified as gfx1035:

CMAKE_HIP_COMPILER:      /opt/rocm_sdk_612/bin/clang++
CMAKE_HIP_ARCHITECTURES: gfx1035
CMAKE_HIP_FLAGS:         
CMAKE_HIP_FLAGS_RELEASE: -O3 -DNDEBUG
CMake Warning at CMakeLists.txt:442 (message):
  onnxruntime_ENABLE_TRAINING_TORCH_INTEROP is turned OFF due to incompatible
  build combinations.


CMake Warning at CMakeLists.txt:449 (message):
  onnxruntime_ENABLE_TRITON is turned OFF because it's designed to support
  CUDA training on Linux only currently.


-- Performing Test COMPILER_SUPPORT_MF16C
-- Performing Test COMPILER_SUPPORT_MF16C - Success

A bit later there's another of the policy messages:

patching file CMakeLists.txt
patching file onnx/common/file_utils.h
patching file onnx/defs/quantization/defs.cc
patching file onnx/defs/quantization/old.cc
patching file onnx/onnx_pb.h
patching file onnx/shape_inference/implementation.cc
[ 55%] No configure step for 'onnx-populate'
[ 66%] No build step for 'onnx-populate'
[ 77%] No install step for 'onnx-populate'
[ 88%] No test step for 'onnx-populate'
[100%] Completed 'onnx-populate'
[100%] Built target onnx-populate
CMake Deprecation Warning at /home/brock/rocm_sdk_builder/src_projects/onnxruntime/build/Linux/Release/_deps/onnx-src/CMakeLists.txt:2 (cmake_minimum_required):
  Compatibility with CMake < 3.5 will be removed from a future version of
  CMake.

  Update the VERSION argument <min> value or use a ...<max> suffix to tell
  CMake that the project does not need compatibility with older versions.


CMake Warning (dev) at /home/brock/rocm_sdk_builder/src_projects/onnxruntime/build/Linux/Release/_deps/onnx-src/CMakeLists.txt:107 (find_package):
  Policy CMP0148 is not set: The FindPythonInterp and FindPythonLibs modules
  are removed.  Run "cmake --help-policy CMP0148" for policy details.  Use
  the cmake_policy command to set the policy and suppress this warning.

This warning is for project developers.  Use -Wno-dev to suppress it.

Then more:

[100%] Completed 'tensorboard-populate'
[100%] Built target tensorboard-populate
CMake Warning at CMakeLists.txt:1605 (message):
  MPI and NCCL are disabled because build is on Windows or USE_NCCL is set to
  OFF.


-- Looking for clock_gettime in rt
-- Looking for clock_gettime in rt - found
-- Python Build is enabled
CMake Warning (dev) at onnxruntime_providers_migraphx.cmake:25 (find_package):
  Policy CMP0144 is not set: find_package uses upper-case <PACKAGENAME>_ROOT
  variables.  Run "cmake --help-policy CMP0144" for policy details.  Use the
  cmake_policy command to set the policy and suppress this warning.

  CMake variable MIGRAPHX_ROOT is set to:

    /opt/rocm_sdk_612

  For compatibility, find_package is ignoring the variable, but code in a
  .cmake module might still use it.
Call Stack (most recent call first):
  onnxruntime_providers.cmake:172 (include)
  CMakeLists.txt:1744 (include)
This warning is for project developers.  Use -Wno-dev to suppress it.

-- Looking for migraphx_program_run_async in migraphx::c
-- Looking for migraphx_program_run_async in migraphx::c - found
-- MIGRAPHX GPU STREAM SYNC is ENABLED
CMake Warning (dev) at onnxruntime_rocm_hipify.cmake:170:
  Syntax Warning in cmake code at column 26

  Argument not separated from preceding token by whitespace.
Call Stack (most recent call first):
  onnxruntime_providers_rocm.cmake:5 (include)
  onnxruntime_providers.cmake:184 (include)
  CMakeLists.txt:1744 (include)
This warning is for project developers.  Use -Wno-dev to suppress it.

CMake Warning (dev) at onnxruntime_rocm_hipify.cmake:171:
  Syntax Warning in cmake code at column 25

  Argument not separated from preceding token by whitespace.
Call Stack (most recent call first):
  onnxruntime_providers_rocm.cmake:5 (include)
  onnxruntime_providers.cmake:184 (include)
  CMakeLists.txt:1744 (include)
This warning is for project developers.  Use -Wno-dev to suppress it.

-- Found Python3: /opt/rocm_sdk_612/bin/python3 (found version "3.11.11") found components: Interpreter 

And at this point it no longer sees the GPU target:

GPU_TARGETS= 
checking which targets are supported
-- Performing Test COMPILER_HAS_TARGET_ID_gfx908
-- Performing Test COMPILER_HAS_TARGET_ID_gfx908 - Failed
-- Performing Test COMPILER_HAS_TARGET_ID_gfx90a
-- Performing Test COMPILER_HAS_TARGET_ID_gfx90a - Failed
-- Performing Test COMPILER_HAS_TARGET_ID_gfx940
-- Performing Test COMPILER_HAS_TARGET_ID_gfx940 - Failed
-- Performing Test COMPILER_HAS_TARGET_ID_gfx941
-- Performing Test COMPILER_HAS_TARGET_ID_gfx941 - Failed
-- Performing Test COMPILER_HAS_TARGET_ID_gfx942
-- Performing Test COMPILER_HAS_TARGET_ID_gfx942 - Failed
-- Performing Test COMPILER_HAS_TARGET_ID_gfx1010
-- Performing Test COMPILER_HAS_TARGET_ID_gfx1010 - Failed
-- Performing Test COMPILER_HAS_TARGET_ID_gfx1030
-- Performing Test COMPILER_HAS_TARGET_ID_gfx1030 - Failed
-- Performing Test COMPILER_HAS_TARGET_ID_gfx1031
-- Performing Test COMPILER_HAS_TARGET_ID_gfx1031 - Failed
-- Performing Test COMPILER_HAS_TARGET_ID_gfx1032
-- Performing Test COMPILER_HAS_TARGET_ID_gfx1032 - Failed
-- Performing Test COMPILER_HAS_TARGET_ID_gfx1035
-- Performing Test COMPILER_HAS_TARGET_ID_gfx1035 - Failed
-- Performing Test COMPILER_HAS_TARGET_ID_gfx1036
-- Performing Test COMPILER_HAS_TARGET_ID_gfx1036 - Failed
-- Performing Test COMPILER_HAS_TARGET_ID_gfx1100
-- Performing Test COMPILER_HAS_TARGET_ID_gfx1100 - Failed
-- Performing Test COMPILER_HAS_TARGET_ID_gfx1101
-- Performing Test COMPILER_HAS_TARGET_ID_gfx1101 - Failed
-- Performing Test COMPILER_HAS_TARGET_ID_gfx1102
-- Performing Test COMPILER_HAS_TARGET_ID_gfx1102 - Failed
-- Performing Test COMPILER_HAS_TARGET_ID_gfx1103
-- Performing Test COMPILER_HAS_TARGET_ID_gfx1103 - Failed
Supported GPU_TARGETS= 
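
The probe output above can be summarized mechanically, which helps when comparing a failing log against a working one. A minimal sketch, assuming the log lines keep the exact "Performing Test COMPILER_HAS_TARGET_ID_<target> - Success|Failed" shape shown here; the helper name `summarize_target_tests` is hypothetical:

```python
import re

# Only the result lines carry " - Success" or " - Failed"; the bare
# "Performing Test ..." announcement lines are ignored on purpose.
TEST_RE = re.compile(
    r"^-- Performing Test COMPILER_HAS_TARGET_ID_(?P<target>gfx\w+)"
    r" - (?P<result>Success|Failed)$"
)


def summarize_target_tests(log_lines):
    """Map each probed GPU target to True (Success) or False (Failed)."""
    results = {}
    for raw in log_lines:
        match = TEST_RE.match(raw.strip())
        if match:
            results[match["target"]] = match["result"] == "Success"
    return results


# Shortened sample taken from the shape of the log above.
sample_log = [
    "GPU_TARGETS= ",
    "-- Performing Test COMPILER_HAS_TARGET_ID_gfx1035",
    "-- Performing Test COMPILER_HAS_TARGET_ID_gfx1035 - Failed",
    "-- Performing Test COMPILER_HAS_TARGET_ID_gfx1036",
    "-- Performing Test COMPILER_HAS_TARGET_ID_gfx1036 - Failed",
    "Supported GPU_TARGETS= ",
]

results = summarize_target_tests(sample_log)
supported = sorted(target for target, ok in results.items() if ok)
print("probed:", sorted(results))
print("supported:", supported)
```

An empty `supported` list with every probe failing, as in the log above, suggests the probe itself is not working (for example because of the compiler or flags it runs with), rather than gfx1035 alone being rejected.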

HarshedCodes, Mar 16 '25 17:03

I did one update to onnxruntime, can you test again by running:

./babs.sh -up
./babs.sh -b

lamikr, Mar 30 '25 21:03

I am having

/home/yoni/Downloads/hashcat-6.2.6/rocm_sdk_builder/src_projects/onnxruntime/onnxruntime/core/optimizer/selectors_actions/selector_action_transformer.cc: In function ‘onnxruntime::common::Status onnxruntime::MatchAndProcess(Graph&, const GraphViewer&, Node&, bool&, const logging::Logger&, const std::string&, const SelectorActionRegistry&, const SatRuntimeOptimizationSaveContext*)’:
/home/yoni/Downloads/hashcat-6.2.6/rocm_sdk_builder/src_projects/onnxruntime/onnxruntime/core/optimizer/selectors_actions/selector_action_transformer.cc:150:23: error: loop variable ‘op_schema’ creates a copy from type ‘const gsl::not_null<const onnx::OpSchema*>’ [-Werror=range-loop-construct]
  150 |       for (const auto op_schema : action_saved_state.produced_node_op_schemas) {
      |                       ^~~~~~~~~
/home/yoni/Downloads/hashcat-6.2.6/rocm_sdk_builder/src_projects/onnxruntime/onnxruntime/core/optimizer/selectors_actions/selector_action_transformer.cc:150:23: note: use reference type to prevent copying
  150 |       for (const auto op_schema : action_saved_state.produced_node_op_schemas) {
      |                       ^~~~~~~~~
      | 

and

n/pool.cc.o
/home/yoni/Downloads/hashcat-6.2.6/rocm_sdk_builder/src_projects/onnxruntime/onnxruntime/core/session/inference_session.cc: In member function ‘onnxruntime::common::Status onnxruntime::InferenceSession::SaveToOrtFormat(const onnxruntime::PathString&) const’:
/home/yoni/Downloads/hashcat-6.2.6/rocm_sdk_builder/src_projects/onnxruntime/onnxruntime/core/session/inference_session.cc:852:19: error: loop variable ‘op_schema’ creates a copy from type ‘const gsl::not_null<const onnx::OpSchema*>’ [-Werror=range-loop-construct]
  852 |   for (const auto op_schema : saved_runtime_optimization_produced_node_op_schemas_) {
      |                   ^~~~~~~~~
/home/yoni/Downloads/hashcat-6.2.6/rocm_sdk_builder/src_projects/onnxruntime/onnxruntime/core/session/inference_session.cc:852:19: note: use reference type to prevent copying
  852 |   for (const auto op_schema : saved_runtime_optimization_produced_node_op_schemas_) {
      |                   ^~~~~~~~~
      |                   &
cc1plus: all warnings being treated as errors
gmake[2]: *** [CMakeFiles/onnxruntime_session.dir/build.make:177: CMakeFiles/onnxruntime_session.dir/home/yoni/Downloads/hashcat-6.2.6/rocm_sdk_builder/src_projects/onnxruntime/onnxruntime/core/session/inference_session.cc.o] Error 1
gmake[2]: *** Waiting for unfinished jobs....

errors while trying to build for gfx1150. The last lines are:

gmake: *** [Makefile:146: all] Error 2
Traceback (most recent call last):
  File "/home/yoni/Downloads/hashcat-6.2.6/rocm_sdk_builder/src_projects/onnxruntime/tools/ci_build/build.py", line 2955, in <module>
    sys.exit(main())
             ^^^^^^
  File "/home/yoni/Downloads/hashcat-6.2.6/rocm_sdk_builder/src_projects/onnxruntime/tools/ci_build/build.py", line 2847, in main
    build_targets(args, cmake_path, build_dir, configs, num_parallel_jobs, args.target)
  File "/home/yoni/Downloads/hashcat-6.2.6/rocm_sdk_builder/src_projects/onnxruntime/tools/ci_build/build.py", line 1736, in build_targets
    run_subprocess(cmd_args, env=env)
  File "/home/yoni/Downloads/hashcat-6.2.6/rocm_sdk_builder/src_projects/onnxruntime/tools/ci_build/build.py", line 861, in run_subprocess
    return run(*args, cwd=cwd, capture_stdout=capture_stdout, shell=shell, env=my_env)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/yoni/Downloads/hashcat-6.2.6/rocm_sdk_builder/src_projects/onnxruntime/tools/python/util/run.py", line 49, in run
    completed_process = subprocess.run(
                        ^^^^^^^^^^^^^^^
  File "/opt/rocm_sdk_612/lib/python3.11/subprocess.py", line 571, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['/usr/bin/cmake', '--build', '/home/yoni/Downloads/hashcat-6.2.6/rocm_sdk_builder/src_projects/onnxruntime/build/Linux/Release', '--config', 'Release', '--', '-j20']' returned non-zero exit status 2.
build failed: onnxruntime
  error in build cmd: ./build_rocm.sh /opt/rocm_sdk_612 gfx1150

is this possibly related?
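
The two failures in this comment are warnings promoted to errors (the log says "all warnings being treated as errors"), and the compiler's own note suggests binding the loop variable by reference. To see how many distinct warning flags are involved across a long log, the diagnostics can be grouped by flag. A minimal sketch, assuming GCC-style lines that end in `[-Werror=<flag>]`; the function name `group_by_werror_flag` and the shortened sample path are illustrative:

```python
import re
from collections import defaultdict

# Matches e.g. "file.cc:852:19: error: ... [-Werror=range-loop-construct]"
WERROR_RE = re.compile(
    r"^(?P<file>\S+?):(?P<line>\d+):\d+: error: .*"
    r"\[-Werror=(?P<flag>[\w-]+)\]$"
)


def group_by_werror_flag(log_lines):
    """Group (file, line) locations by the -Werror flag that triggered them."""
    grouped = defaultdict(list)
    for raw in log_lines:
        match = WERROR_RE.match(raw.strip())
        if match:
            grouped[match["flag"]].append((match["file"], int(match["line"])))
    return dict(grouped)


# Hypothetical, shortened sample in the shape of the log above.
sample_log = [
    "/src/inference_session.cc:852:19: error: loop variable 'op_schema' "
    "creates a copy from type 'const gsl::not_null<const onnx::OpSchema*>' "
    "[-Werror=range-loop-construct]",
    "/src/inference_session.cc:852:19: note: use reference type to prevent copying",
    "cc1plus: all warnings being treated as errors",
]

for flag, locations in group_by_werror_flag(sample_log).items():
    print(flag, locations)
```

If every entry falls under a single flag such as `range-loop-construct`, the failure is a compiler-strictness issue in the onnxruntime sources and looks different in kind from the missing absl header and empty GPU_TARGETS reported earlier in this thread.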

yoni13, May 17 '25 18:05