
Set attribute to help bypass the warning about amdgpu_waves_per_eu

Open lakhinderwalia opened this issue 5 months ago • 2 comments

The upgraded toolchain emits a new compile warning, which `-Werror` promotes to an error. It needs to be bypassed for the topk test to compile and run successfully.

[ RUN ] test_topk<migraphx::shape::half_type, 1000, 120000>
/tmp/comgr-d7a292/input/main.cpp:11:22: error: failed to meet occupancy target given by 'amdgpu-waves-per-eu' in 'topk_kernel': desired occupancy was 2, final occupancy is 1 [-Werror,-Wpass-failed]
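One possible shape for such an attribute is sketched below. It relies on Clang's `amdgpu_waves_per_eu(min[, max])` function attribute and on the `__AMDGCN__` macro that Clang defines when targeting AMD GPUs. The macro name `SKETCH_WAVES_PER_EU` and the stand-in function `sketch_kernel_body` are hypothetical and are not taken from the MIGraphX sources; the actual fix may look different. Requesting a minimum of 1 wave per execution unit matches the final occupancy reported in the error above, so the target would no longer be missed.

```cpp
// Sketch only: the macro name and the function below are hypothetical
// and do not come from the MIGraphX code base.
#if defined(__AMDGCN__)
// Device pass for an AMD GPU target: forward the requested minimum
// number of waves per execution unit to Clang's attribute.
#define SKETCH_WAVES_PER_EU(min_waves) \
    __attribute__((amdgpu_waves_per_eu(min_waves)))
#else
// Host pass or a non-AMD compiler: the attribute has no meaning here,
// so the macro expands to nothing.
#define SKETCH_WAVES_PER_EU(min_waves)
#endif

// Stand-in for the kernel. In real HIP code the attribute would sit on
// the __global__ kernel declaration instead of an ordinary function.
SKETCH_WAVES_PER_EU(1)
constexpr int sketch_kernel_body(int x) { return x + 1; }
```

Guarding the attribute behind a macro keeps the same source compiling on the host pass, where the attribute would otherwise be reported as unknown.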

lakhinderwalia avatar Aug 08 '25 19:08 lakhinderwalia

| Test | Batch | Rate new 918283 | Rate old 018cae | Diff | Compare |
| --- | --- | --- | --- | --- | --- |
| torchvision-resnet50 | 64 | 3,247.40 | 3,248.22 | -0.03% | :white_check_mark: |
| torchvision-resnet50_fp16 | 64 | 6,961.71 | 6,961.14 | 0.01% | :white_check_mark: |
| torchvision-densenet121 | 32 | 2,450.76 | 2,449.97 | 0.03% | :white_check_mark: |
| torchvision-densenet121_fp16 | 32 | 4,170.57 | 4,164.44 | 0.15% | :white_check_mark: |
| torchvision-inceptionv3 | 32 | 1,636.60 | 1,636.10 | 0.03% | :white_check_mark: |
| torchvision-inceptionv3_fp16 | 32 | 2,760.19 | 2,752.74 | 0.27% | :white_check_mark: |
| cadene-inceptionv4 | 16 | 771.57 | 771.22 | 0.04% | :white_check_mark: |
| cadene-resnext64x4 | 16 | 818.91 | 818.68 | 0.03% | :white_check_mark: |
| slim-mobilenet | 64 | 7,460.56 | 7,459.63 | 0.01% | :white_check_mark: |
| slim-nasnetalarge | 64 | 211.06 | 211.08 | -0.01% | :white_check_mark: |
| slim-resnet50v2 | 64 | 3,344.50 | 3,342.04 | 0.07% | :white_check_mark: |
| bert-mrpc-onnx | 8 | 1,145.14 | 1,145.42 | -0.02% | :white_check_mark: |
| bert-mrpc-tf | 1 | 445.65 | 442.53 | 0.70% | :white_check_mark: |
| pytorch-examples-wlang-gru | 1 | 294.77 | 298.60 | -1.28% | :white_check_mark: |
| pytorch-examples-wlang-lstm | 1 | 404.81 | 411.85 | -1.71% | :white_check_mark: |
| torchvision-resnet50_1 | 1 | 767.18 | 767.54 | -0.05% | :white_check_mark: |
| cadene-dpn92_1 | 1 | 386.19 | 392.42 | -1.59% | :white_check_mark: |
| cadene-resnext101_1 | 1 | 392.00 | 393.79 | -0.45% | :white_check_mark: |
| onnx-taau-downsample | 1 | 395.74 | 395.77 | -0.01% | :white_check_mark: |
| dlrm-criteoterabyte | 1 | 33.76 | 33.77 | -0.03% | :white_check_mark: |
| dlrm-criteoterabyte_fp16 | 1 | 51.24 | 51.25 | -0.02% | :white_check_mark: |
| agentmodel | 1 | 8,334.03 | 8,931.04 | -6.68% | :red_circle: |
| unet_fp16 | 2 | 59.14 | 59.14 | 0.01% | :white_check_mark: |
| resnet50v1_fp16 | 1 | 980.63 | 978.22 | 0.25% | :white_check_mark: |
| resnet50v1_int8 | 1 | 1,030.81 | 1,025.85 | 0.48% | :white_check_mark: |
| bert_base_cased_fp16 | 64 | 1,107.47 | 1,107.56 | -0.01% | :white_check_mark: |
| bert_large_uncased_fp16 | 32 | 345.29 | 345.42 | -0.04% | :white_check_mark: |
| bert_large_fp16 | 1 | 197.48 | 196.96 | 0.26% | :white_check_mark: |
| distilgpt2_fp16 | 16 | 2,117.26 | 2,118.48 | -0.06% | :white_check_mark: |
| yolov5s | 1 | 566.10 | 570.35 | -0.75% | :white_check_mark: |
| tinyllama | 1 | 43.95 | 43.98 | -0.07% | :white_check_mark: |
| vicuna-fastchat | 1 | 45.27 | 45.38 | -0.22% | :white_check_mark: |
| whisper-tiny-encoder | 1 | 417.62 | 417.79 | -0.04% | :white_check_mark: |
| whisper-tiny-decoder | 1 | 400.51 | 409.99 | -2.31% | :white_check_mark: |
| llama2_7b | 1 | 19.16 | 19.16 | 0.00% | :white_check_mark: |
| qwen1.5-7b | 1 | 23.53 | 23.54 | -0.02% | :white_check_mark: |
| phi3-3.8b | 1 | 26.70 | 26.67 | 0.10% | :white_check_mark: |
| mask-rcnn | 1 | 12.51 | 12.44 | 0.57% | :white_check_mark: |
| llama3-8b | 1 | 21.72 | 21.73 | -0.05% | :white_check_mark: |
| whisper-large-encoder | 1 | 10.22 | 10.22 | -0.01% | :white_check_mark: |
| whisper-large-decoder | 1 | 96.60 | 96.35 | 0.26% | :white_check_mark: |
| mistral-7b | 1 | 23.73 | 23.74 | -0.05% | :white_check_mark: |
| FLUX.1-schnell | 1 | 738.85 | 742.55 | -0.50% | :white_check_mark: |
| nan | nan | nan | nan | nan% | :x: |

This build is not recommended to merge :red_circle:
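The Diff column above appears to be the relative change of the new rate against the old rate, in percent. The sketch below reproduces that arithmetic. The function names and the `-5.0` cutoff are assumptions for illustration only; the bot's real threshold is not stated in this report, and `-5.0` was picked merely because it separates the one flagged row (-6.68%) from the largest unflagged drop (-2.31%).

```cpp
// Relative change of the new rate versus the old rate, in percent.
constexpr double percent_diff(double rate_new, double rate_old)
{
    return (rate_new - rate_old) / rate_old * 100.0;
}

// Hypothetical cutoff: the real threshold used by the bot is not given
// in the report, so this value is an assumption.
constexpr double assumed_regression_limit = -5.0;

// True when the drop in rate is larger than the assumed cutoff.
constexpr bool is_regression(double rate_new, double rate_old)
{
    return percent_diff(rate_new, rate_old) < assumed_regression_limit;
}
```

With the agentmodel figures, `percent_diff(8334.03, 8931.04)` comes out at roughly -6.68, which `is_regression` flags under the assumed cutoff. With the whisper-tiny-decoder figures, `percent_diff(400.51, 409.99)` is roughly -2.31 and is not flagged.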

migraphx-bot avatar Aug 08 '25 22:08 migraphx-bot


     :white_check_mark: bert-mrpc-onnx: PASSED: MIGraphX meets tolerance
:x:bert-mrpc-tf: ERROR - check error output
error: unknown warning option '-Wnrvo' [-Werror,-Wunknown-warning-option]

error: unknown warning option '-Wnrvo' [-Werror,-Wunknown-warning-option]

2025-08-08 16:42:13.363812: I tensorflow/core/platform/cpu_feature_guard.cc:210] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: SSE3 SSE4.1 SSE4.2 AVX AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
WARNING: All log messages before absl::InitializeLog() is called are written to STDERR
I0000 00:00:1754689338.796337 173517 gpu_device.cc:2022] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 62951 MB memory: -> device: 0, name: AMD Instinct MI250X/MI250, pci bus id: 0000:b3:00.0
WARNING: All log messages before absl::InitializeLog() is called are written to STDERR
I0000 00:00:1754689339.695586 173517 mlir_graph_optimization_pass.cc:401] MLIR V1 optimization pass is not enabled
2025-08-08 16:42:28.331802: E external/local_xla/xla/service/gpu/llvm_gpu_backend/gpu_backend_lib.cc:250] bitcode module is required by this HLO module but was not found at ./opencl.bc
2025-08-08 16:42:28.331993: E external/local_xla/xla/service/gpu/llvm_gpu_backend/gpu_backend_lib.cc:250] bitcode module is required by this HLO module but was not found at ./opencl.bc
2025-08-08 16:42:28.332063: E external/local_xla/xla/service/gpu/llvm_gpu_backend/gpu_backend_lib.cc:250] bitcode module is required by this HLO module but was not found at ./opencl.bc
2025-08-08 16:42:28.332117: E external/local_xla/xla/service/gpu/llvm_gpu_backend/gpu_backend_lib.cc:250] bitcode module is required by this HLO module but was not found at ./opencl.bc
2025-08-08 16:42:28.332162: E external/local_xla/xla/service/gpu/llvm_gpu_backend/gpu_backend_lib.cc:250] bitcode module is required by this HLO module but was not found at ./opencl.bc
2025-08-08 16:42:28.332192: E external/local_xla/xla/service/gpu/llvm_gpu_backend/gpu_backend_lib.cc:250] bitcode module is required by this HLO module but was not found at ./opencl.bc
2025-08-08 16:42:28.332243: E external/local_xla/xla/service/gpu/llvm_gpu_backend/gpu_backend_lib.cc:250] bitcode module is required by this HLO module but was not found at ./opencl.bc
2025-08-08 16:42:28.332293: E external/local_xla/xla/service/gpu/llvm_gpu_backend/gpu_backend_lib.cc:250] bitcode module is required by this HLO module but was not found at ./opencl.bc
error: Failure when generating HSACO
error: Failure when generating HSACO
error: Failure when generating HSACO
error: Failure when generating HSACO
error: Failure when generating HSACO
error: Failure when generating HSACO
error: Failure when generating HSACO
error: Failure when generating HSACO
2025-08-08 16:42:28.333307: E tensorflow/compiler/mlir/tools/kernel_gen/tf_framework_c_interface.cc:228] INTERNAL: Generating device code failed.
2025-08-08 16:42:28.334401: W tensorflow/core/framework/op_kernel.cc:1829] UNKNOWN: JIT compilation failed.
2025-08-08 16:42:28.334420: I tensorflow/core/framework/local_rendezvous.cc:405] Local rendezvous is aborting with status: UNKNOWN: JIT compilation failed.
[[{{node import/bert/embeddings/LayerNorm/moments/SquaredDifference}}]]
2025-08-08 16:42:28.334431: I tensorflow/core/framework/local_rendezvous.cc:405] Local rendezvous is aborting with status: UNKNOWN: JIT compilation failed.
[[{{node import/bert/embeddings/LayerNorm/moments/SquaredDifference}}]]
[[import/loss/output/_21]]
2025-08-08 16:42:28.334448: I tensorflow/core/framework/local_rendezvous.cc:424] Local rendezvous recv item cancelled. Key hash: 11217777527359497193
Traceback (most recent call last):
File "/usr/local/lib/python3.10/dist-packages/tensorflow/python/client/session.py", line 1407, in _do_call
return fn(*args)
File "/usr/local/lib/python3.10/dist-packages/tensorflow/python/client/session.py", line 1390, in _run_fn
return self._call_tf_sessionrun(options, feed_dict, fetch_list,
File "/usr/local/lib/python3.10/dist-packages/tensorflow/python/client/session.py", line 1483, in _call_tf_sessionrun
return tf_session.TF_SessionRun_wrapper(self._session, options, feed_dict,
tensorflow.python.framework.errors_impl.UnknownError: 2 root error(s) found.
(0) UNKNOWN: JIT compilation failed.
[[{{node import/bert/embeddings/LayerNorm/moments/SquaredDifference}}]]
[[import/loss/output/_21]]
(1) UNKNOWN: JIT compilation failed.
[[{{node import/bert/embeddings/LayerNorm/moments/SquaredDifference}}]]
0 successful operations.
0 derived errors ignored.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/src/AMDMIGraphX/tools/accuracy/accuracy_checker.py", line 359, in <module>
main()
File "/src/AMDMIGraphX/tools/accuracy/accuracy_checker.py", line 335, in main
y_out = sess.run(y, feed_dict=tf_dict)
File "/usr/local/lib/python3.10/dist-packages/tensorflow/python/client/session.py", line 977, in run
result = self._run(None, fetches, feed_dict, options_ptr,
File "/usr/local/lib/python3.10/dist-packages/tensorflow/python/client/session.py", line 1220, in _run
results = self._do_run(handle, final_targets, final_fetches,
File "/usr/local/lib/python3.10/dist-packages/tensorflow/python/client/session.py", line 1400, in _do_run
return self._do_call(_run_fn, feeds, fetches, targets, options,
File "/usr/local/lib/python3.10/dist-packages/tensorflow/python/client/session.py", line 1426, in _do_call
raise type(e)(node_def, op, message) # pylint: disable=no-value-for-parameter
tensorflow.python.framework.errors_impl.UnknownError: Graph execution error:

Detected at node 'import/bert/embeddings/LayerNorm/moments/SquaredDifference' defined at (most recent call last):
Node: 'import/bert/embeddings/LayerNorm/moments/SquaredDifference'
Detected at node 'import/bert/embeddings/LayerNorm/moments/SquaredDifference' defined at (most recent call last):
Node: 'import/bert/embeddings/LayerNorm/moments/SquaredDifference'
2 root error(s) found.
(0) UNKNOWN: JIT compilation failed.
[[{{node import/bert/embeddings/LayerNorm/moments/SquaredDifference}}]]
[[import/loss/output/_21]]
(1) UNKNOWN: JIT compilation failed.
[[{{node import/bert/embeddings/LayerNorm/moments/SquaredDifference}}]]
0 successful operations.
0 derived errors ignored.

Original stack trace for 'import/bert/embeddings/LayerNorm/moments/SquaredDifference':

     :white_check_mark: pytorch-examples-wlang-gru: PASSED: MIGraphX meets tolerance
     :white_check_mark: pytorch-examples-wlang-lstm: PASSED: MIGraphX meets tolerance
     :white_check_mark: dlrm-criteoterabyte: PASSED: MIGraphX meets tolerance
     :white_check_mark: agentmodel: PASSED: MIGraphX meets tolerance
:red_circle:unet: FAILED: MIGraphX is not within tolerance - check verbose output

     :white_check_mark: resnet50v1: PASSED: MIGraphX meets tolerance
     :white_check_mark: bert_base_cased_fp16: PASSED: MIGraphX meets tolerance
:red_circle:bert_large_uncased_fp16: FAILED: MIGraphX is not within tolerance - check verbose output

     :white_check_mark: bert_large: PASSED: MIGraphX meets tolerance
     :white_check_mark: yolov5s: PASSED: MIGraphX meets tolerance
     :white_check_mark: tinyllama: PASSED: MIGraphX meets tolerance
     :white_check_mark: vicuna-fastchat: PASSED: MIGraphX meets tolerance
     :white_check_mark: whisper-tiny-encoder: PASSED: MIGraphX meets tolerance
     :white_check_mark: whisper-tiny-decoder: PASSED: MIGraphX meets tolerance
     :white_check_mark: distilgpt2_fp16: PASSED: MIGraphX meets tolerance
     :white_check_mark: llama2_7b: PASSED: MIGraphX meets tolerance
     :white_check_mark: qwen1.5-7b: PASSED: MIGraphX meets tolerance
     :white_check_mark: phi3-3.8b: PASSED: MIGraphX meets tolerance
:red_circle:mask-rcnn: FAILED: MIGraphX is not within tolerance - check verbose output

     :white_check_mark: llama3-8b: PASSED: MIGraphX meets tolerance
     :white_check_mark: whisper-large-decoder: PASSED: MIGraphX meets tolerance
     :white_check_mark: mistral-7b: PASSED: MIGraphX meets tolerance
     :white_check_mark: FLUX.1-schnell: PASSED: MIGraphX meets tolerance

migraphx-bot avatar Aug 08 '25 22:08 migraphx-bot