AMDMIGraphX
[Experimental] AIMIGRAPHX-235 TF dynamic dims support
- Allows dynamic input dims to be passed in for existing TF models
- Makes some operators dynamic; more remain as TODOs
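To make the idea of a dynamic input dim concrete, here is a minimal, self-contained sketch. It assumes a dynamic dimension is described by a minimum and maximum bound. `DynamicDim`, `fixed`, `shape_matches`, the input name `"input"`, and the shape values are hypothetical names chosen for illustration; this is not the API added by this PR.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DynamicDim:
    """Hypothetical stand-in for a dimension bounded by [min, max]."""
    min: int
    max: int

    def accepts(self, size: int) -> bool:
        # A concrete size is valid when it lies within the bounds.
        return self.min <= size <= self.max


def fixed(size: int) -> DynamicDim:
    # A static dimension is the special case min == max.
    return DynamicDim(size, size)


def shape_matches(dyn_shape, concrete) -> bool:
    # Ranks must agree and every concrete size must fit its bound.
    return len(dyn_shape) == len(concrete) and all(
        dim.accepts(size) for dim, size in zip(dyn_shape, concrete)
    )


# Assumed example: an NHWC image input whose batch may vary from 1 to 4.
input_dims = {"input": [DynamicDim(1, 4), fixed(224), fixed(224), fixed(3)]}

print(shape_matches(input_dims["input"], (2, 224, 224, 3)))  # True
print(shape_matches(input_dims["input"], (8, 224, 224, 3)))  # False
```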
Codecov Report
Attention: Patch coverage is 35.29412% with 22 lines in your changes missing coverage. Please review.
| Files with missing lines | Patch % | Lines |
|---|---|---|
| src/api/api.cpp | 0.00% | 12 Missing :warning: |
| src/tf/tf_parser.cpp | 27.27% | 8 Missing :warning: |
| src/tf/parse_slice.cpp | 75.00% | 1 Missing :warning: |
| src/tf/tf.cpp | 50.00% | 1 Missing :warning: |
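The headline patch coverage can be cross-checked against the table above. The sketch below sums the missing lines per file and recomputes the percentage. The total of 34 coverable patch lines is an assumption inferred from the reported 35.29412% with 22 lines missing; the report does not state it directly. The four listed files account for 29 of those lines, so the rest would sit in fully covered files.

```python
# Missing lines per file, copied from the Codecov table.
missing_by_file = {
    "src/api/api.cpp": 12,
    "src/tf/tf_parser.cpp": 8,
    "src/tf/parse_slice.cpp": 1,
    "src/tf/tf.cpp": 1,
}

missing = sum(missing_by_file.values())

# Assumption: the patch contains 34 coverable lines in total (inferred, not reported).
patch_lines = 34

patch_coverage = 100 * (patch_lines - missing) / patch_lines
print(f"{missing} missing, patch coverage {patch_coverage:.5f}%")
# prints "22 missing, patch coverage 35.29412%"
```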
Additional details and impacted files
@@ Coverage Diff @@
## develop #4136 +/- ##
===========================================
- Coverage 92.21% 92.15% -0.06%
===========================================
Files 545 547 +2
Lines 25107 25207 +100
===========================================
+ Hits 23152 23228 +76
- Misses 1955 1979 +24
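The project-level percentages in the diff above follow from the hits and line counts. A short sketch that recomputes them, using only the numbers in the report:

```python
def coverage(hits: int, lines: int) -> float:
    # Coverage is the share of tracked lines that were executed, in percent.
    return 100 * hits / lines


# (lines, hits, misses) for the base branch and for this PR, from the report.
base_lines, base_hits, base_misses = 25107, 23152, 1955
head_lines, head_hits, head_misses = 25207, 23228, 1979

base = coverage(base_hits, base_lines)
head = coverage(head_hits, head_lines)

print(f"{base:.2f}% -> {head:.2f}% ({head - base:+.2f}%)")
# prints "92.21% -> 92.15% (-0.06%)"
```

The drop comes from the PR adding 100 lines of which only 76 are hit, which is below the existing project average.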
| Files with missing lines | Coverage Δ | |
|---|---|---|
| src/api/include/migraphx/migraphx.hpp | 98.97% <ø> (ø) | |
| src/tf/include/migraphx/tf/tf_parser.hpp | 100.00% <ø> (ø) | |
| src/tf/parse_batchnorm.cpp | 100.00% <ø> (ø) | |
| src/tf/parse_matmul.cpp | 90.91% <100.00%> (+0.43%) | :arrow_up: |
| src/tf/parse_pack.cpp | 93.33% <100.00%> (ø) | |
| src/tf/parse_pad.cpp | 100.00% <100.00%> (ø) | |
| src/tf/parse_softmax.cpp | 90.00% <100.00%> (ø) | |
| src/tf/parse_squeeze.cpp | 100.00% <ø> (ø) | |
| src/tf/parse_slice.cpp | 95.00% <75.00%> (-5.00%) | :arrow_down: |
| src/tf/tf.cpp | 72.73% <50.00%> (-2.27%) | :arrow_down: |
| ... and 2 more | | |
| Test | Batch | Rate new 6b9e2b | Rate old 3116c7 | Diff | Compare |
|---|---|---|---|---|---|
| torchvision-resnet50 | 64 | 3,231.62 | 3,232.24 | -0.02% | :white_check_mark: |
| torchvision-resnet50_fp16 | 64 | 6,906.30 | 6,901.97 | 0.06% | :white_check_mark: |
| torchvision-densenet121 | 32 | 2,439.77 | 2,441.26 | -0.06% | :white_check_mark: |
| torchvision-densenet121_fp16 | 32 | 4,189.59 | 4,183.95 | 0.13% | :white_check_mark: |
| torchvision-inceptionv3 | 32 | 1,628.86 | 1,628.18 | 0.04% | :white_check_mark: |
| torchvision-inceptionv3_fp16 | 32 | 2,744.38 | 2,747.37 | -0.11% | :white_check_mark: |
| cadene-inceptionv4 | 16 | 767.06 | 766.97 | 0.01% | :white_check_mark: |
| cadene-resnext64x4 | 16 | 814.06 | 813.79 | 0.03% | :white_check_mark: |
| slim-mobilenet | 64 | 7,434.26 | 7,430.84 | 0.05% | :white_check_mark: |
| slim-nasnetalarge | 64 | 210.06 | 210.03 | 0.01% | :white_check_mark: |
| slim-resnet50v2 | 64 | 3,327.98 | 3,325.76 | 0.07% | :white_check_mark: |
| bert-mrpc-onnx | 8 | 1,140.49 | 1,143.14 | -0.23% | :white_check_mark: |
| bert-mrpc-tf | 1 | 457.90 | 458.44 | -0.12% | :white_check_mark: |
| pytorch-examples-wlang-gru | 1 | 347.66 | 343.86 | 1.11% | :white_check_mark: |
| pytorch-examples-wlang-lstm | 1 | 481.12 | 474.60 | 1.37% | :white_check_mark: |
| torchvision-resnet50_1 | 1 | 801.90 | 789.53 | 1.57% | :white_check_mark: |
| cadene-dpn92_1 | 1 | 415.26 | 419.72 | -1.06% | :white_check_mark: |
| cadene-resnext101_1 | 1 | 387.71 | 387.56 | 0.04% | :white_check_mark: |
| onnx-taau-downsample | 1 | 395.15 | 394.23 | 0.23% | :white_check_mark: |
| dlrm-criteoterabyte | 1 | 33.68 | 33.67 | 0.04% | :white_check_mark: |
| dlrm-criteoterabyte_fp16 | 1 | 51.19 | 51.07 | 0.24% | :white_check_mark: |
| agentmodel | 1 | 10,351.43 | 10,510.49 | -1.51% | :white_check_mark: |
| unet_fp16 | 2 | 60.60 | 60.59 | 0.03% | :white_check_mark: |
| resnet50v1_fp16 | 1 | 1,039.91 | 1,045.09 | -0.50% | :white_check_mark: |
| resnet50v1_int8 | 1 | 1,058.99 | 1,069.22 | -0.96% | :white_check_mark: |
| bert_base_cased_fp16 | 64 | 1,164.34 | 1,163.30 | 0.09% | :white_check_mark: |
| bert_large_uncased_fp16 | 32 | 359.52 | 359.63 | -0.03% | :white_check_mark: |
| bert_large_fp16 | 1 | 203.41 | 202.87 | 0.27% | :white_check_mark: |
| distilgpt2_fp16 | 16 | 2,227.20 | 2,229.71 | -0.11% | :white_check_mark: |
| yolov5s | 1 | 538.17 | 544.90 | -1.24% | :white_check_mark: |
| tinyllama | 1 | 43.80 | 43.85 | -0.10% | :white_check_mark: |
| vicuna-fastchat | 1 | 44.93 | 44.97 | -0.10% | :white_check_mark: |
| whisper-tiny-encoder | 1 | 417.78 | 417.62 | 0.04% | :white_check_mark: |
| whisper-tiny-decoder | 1 | 409.84 | 401.82 | 1.99% | :white_check_mark: |
| llama2_7b | 1 | 19.15 | 19.13 | 0.11% | :white_check_mark: |
| qwen1.5-7b | 1 | 23.54 | 23.53 | 0.04% | :white_check_mark: |
| phi3-3.8b | 1 | 26.78 | 26.75 | 0.11% | :white_check_mark: |
| mask-rcnn | 1 | 12.80 | 12.80 | -0.01% | :white_check_mark: |
| llama3-8b | 1 | 21.76 | 21.76 | -0.02% | :white_check_mark: |
| whisper-large-encoder | 1 | 10.17 | 10.17 | 0.03% | :white_check_mark: |
| whisper-large-decoder | 1 | 103.69 | 104.54 | -0.81% | :white_check_mark: |
| mistral-7b | 1 | 23.83 | 23.78 | 0.20% | :white_check_mark: |
| FLUX.1-schnell | 1 | 773.10 | 771.60 | 0.19% | :white_check_mark: |
| nan | nan | nan | nan | nan% | :x: |
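The Diff column appears consistent with the relative change of the new rate against the old one. The sketch below recomputes it for three rows of the table. The formula is an assumption inferred from the numbers, not taken from the benchmark tool, and because the displayed rates are rounded, a recomputed value can differ from the table in the last digit for some rows.

```python
def parse_rate(text: str) -> float:
    # Rates in the table use thousands separators, e.g. "10,351.43".
    return float(text.replace(",", ""))


def percent_diff(new: float, old: float) -> float:
    # Relative change of the new rate with respect to the old rate, in percent.
    return 100 * (new - old) / old


# (test, rate new, rate old) copied from the table above.
rows = [
    ("torchvision-resnet50", "3,231.62", "3,232.24"),
    ("torchvision-resnet50_1", "801.90", "789.53"),
    ("agentmodel", "10,351.43", "10,510.49"),
]

for name, new, old in rows:
    diff = percent_diff(parse_rate(new), parse_rate(old))
    print(f"{name}: {diff:+.2f}%")
```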
This build is not recommended to merge :red_circle:
:x:bert-mrpc-tf: ERROR - check error output
2025-07-16 13:26:12.204727: I tensorflow/core/platform/cpu_feature_guard.cc:210] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: SSE3 SSE4.1 SSE4.2 AVX AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
WARNING: All log messages before absl::InitializeLog() is called are written to STDERR
I0000 00:00:1752690377.641847 181551 gpu_device.cc:2022] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 62973 MB memory: -> device: 0, name: AMD Instinct MI250X/MI250, pci bus id: 0000:32:00.0
WARNING: All log messages before absl::InitializeLog() is called are written to STDERR
I0000 00:00:1752690378.539971 181551 mlir_graph_optimization_pass.cc:401] MLIR V1 optimization pass is not enabled
2025-07-16 13:26:28.130968: E external/local_xla/xla/service/gpu/llvm_gpu_backend/gpu_backend_lib.cc:250] bitcode module is required by this HLO module but was not found at ./opencl.bc
2025-07-16 13:26:28.131026: E external/local_xla/xla/service/gpu/llvm_gpu_backend/gpu_backend_lib.cc:250] bitcode module is required by this HLO module but was not found at ./opencl.bc
2025-07-16 13:26:28.131081: E external/local_xla/xla/service/gpu/llvm_gpu_backend/gpu_backend_lib.cc:250] bitcode module is required by this HLO module but was not found at ./opencl.bc
2025-07-16 13:26:28.131277: E external/local_xla/xla/service/gpu/llvm_gpu_backend/gpu_backend_lib.cc:250] bitcode module is required by this HLO module but was not found at ./opencl.bc
2025-07-16 13:26:28.131327: E external/local_xla/xla/service/gpu/llvm_gpu_backend/gpu_backend_lib.cc:250] bitcode module is required by this HLO module but was not found at ./opencl.bc
2025-07-16 13:26:28.131361: E external/local_xla/xla/service/gpu/llvm_gpu_backend/gpu_backend_lib.cc:250] bitcode module is required by this HLO module but was not found at ./opencl.bc
2025-07-16 13:26:28.131410: E external/local_xla/xla/service/gpu/llvm_gpu_backend/gpu_backend_lib.cc:250] bitcode module is required by this HLO module but was not found at ./opencl.bc
2025-07-16 13:26:28.131460: E external/local_xla/xla/service/gpu/llvm_gpu_backend/gpu_backend_lib.cc:250] bitcode module is required by this HLO module but was not found at ./opencl.bc
error: Failure when generating HSACO
error: Failure when generating HSACO
error: Failure when generating HSACO
error: Failure when generating HSACO
error: Failure when generating HSACO
error: Failure when generating HSACO
error: Failure when generating HSACO
error: Failure when generating HSACO
2025-07-16 13:26:28.132791: E tensorflow/compiler/mlir/tools/kernel_gen/tf_framework_c_interface.cc:228] INTERNAL: Generating device code failed.
2025-07-16 13:26:28.134234: W tensorflow/core/framework/op_kernel.cc:1829] UNKNOWN: JIT compilation failed.
2025-07-16 13:26:28.134260: I tensorflow/core/framework/local_rendezvous.cc:405] Local rendezvous is aborting with status: UNKNOWN: JIT compilation failed.
[[{{node import/bert/embeddings/LayerNorm/moments/SquaredDifference}}]]
2025-07-16 13:26:28.134275: I tensorflow/core/framework/local_rendezvous.cc:405] Local rendezvous is aborting with status: UNKNOWN: JIT compilation failed.
[[{{node import/bert/embeddings/LayerNorm/moments/SquaredDifference}}]]
[[import/loss/output/_21]]
2025-07-16 13:26:28.134291: I tensorflow/core/framework/local_rendezvous.cc:424] Local rendezvous recv item cancelled. Key hash: 11217777527359497193
Traceback (most recent call last):
File "/usr/local/lib/python3.10/dist-packages/tensorflow/python/client/session.py", line 1407, in _do_call
return fn(*args)
File "/usr/local/lib/python3.10/dist-packages/tensorflow/python/client/session.py", line 1390, in _run_fn
return self._call_tf_sessionrun(options, feed_dict, fetch_list,
File "/usr/local/lib/python3.10/dist-packages/tensorflow/python/client/session.py", line 1483, in _call_tf_sessionrun
return tf_session.TF_SessionRun_wrapper(self._session, options, feed_dict,
tensorflow.python.framework.errors_impl.UnknownError: 2 root error(s) found.
(0) UNKNOWN: JIT compilation failed.
[[{{node import/bert/embeddings/LayerNorm/moments/SquaredDifference}}]]
[[import/loss/output/_21]]
(1) UNKNOWN: JIT compilation failed.
[[{{node import/bert/embeddings/LayerNorm/moments/SquaredDifference}}]]
0 successful operations.
0 derived errors ignored.
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/src/AMDMIGraphX/tools/accuracy/accuracy_checker.py", line 359, in <module>
main()
File "/src/AMDMIGraphX/tools/accuracy/accuracy_checker.py", line 335, in main
y_out = sess.run(y, feed_dict=tf_dict)
File "/usr/local/lib/python3.10/dist-packages/tensorflow/python/client/session.py", line 977, in run
result = self._run(None, fetches, feed_dict, options_ptr,
File "/usr/local/lib/python3.10/dist-packages/tensorflow/python/client/session.py", line 1220, in _run
results = self._do_run(handle, final_targets, final_fetches,
File "/usr/local/lib/python3.10/dist-packages/tensorflow/python/client/session.py", line 1400, in _do_run
return self._do_call(_run_fn, feeds, fetches, targets, options,
File "/usr/local/lib/python3.10/dist-packages/tensorflow/python/client/session.py", line 1426, in _do_call
raise type(e)(node_def, op, message) # pylint: disable=no-value-for-parameter
tensorflow.python.framework.errors_impl.UnknownError: Graph execution error:
Detected at node 'import/bert/embeddings/LayerNorm/moments/SquaredDifference' defined at (most recent call last):
Node: 'import/bert/embeddings/LayerNorm/moments/SquaredDifference'
Detected at node 'import/bert/embeddings/LayerNorm/moments/SquaredDifference' defined at (most recent call last):
Node: 'import/bert/embeddings/LayerNorm/moments/SquaredDifference'
2 root error(s) found.
(0) UNKNOWN: JIT compilation failed.
[[{{node import/bert/embeddings/LayerNorm/moments/SquaredDifference}}]]
[[import/loss/output/_21]]
(1) UNKNOWN: JIT compilation failed.
[[{{node import/bert/embeddings/LayerNorm/moments/SquaredDifference}}]]
0 successful operations.
0 derived errors ignored.
Original stack trace for 'import/bert/embeddings/LayerNorm/moments/SquaredDifference':
:red_circle:unet: FAILED: MIGraphX is not within tolerance - check verbose output
:red_circle:bert_large_uncased_fp16: FAILED: MIGraphX is not within tolerance - check verbose output
:red_circle:mask-rcnn: FAILED: MIGraphX is not within tolerance - check verbose output