SparseAttention ONNX Contrib Op Implementation

Open music-dino opened this issue 3 months ago • 2 comments

Sep 03 '25 12:09 music-dino

| Test | Batch | Rate new (2ed947) | Rate old (8177ed) | Diff | Compare |
| --- | --- | --- | --- | --- | --- |
| torchvision-resnet50 | 64 | 3,175.23 | 3,156.64 | 0.59% | :white_check_mark: |
| torchvision-resnet50_fp16 | 64 | 6,610.72 | 6,585.90 | 0.38% | :white_check_mark: |
| torchvision-densenet121 | 32 | 2,444.37 | 2,434.16 | 0.42% | :white_check_mark: |
| torchvision-densenet121_fp16 | 32 | 4,114.20 | 4,100.96 | 0.32% | :white_check_mark: |
| torchvision-inceptionv3 | 32 | 1,672.64 | 1,664.47 | 0.49% | :white_check_mark: |
| torchvision-inceptionv3_fp16 | 32 | 2,596.43 | 2,579.29 | 0.66% | :white_check_mark: |
| cadene-inceptionv4 | 16 | 797.69 | 794.64 | 0.38% | :white_check_mark: |
| cadene-resnext64x4 | 16 | 807.08 | 802.37 | 0.59% | :white_check_mark: |
| slim-mobilenet | 64 | 8,237.03 | 8,205.30 | 0.39% | :white_check_mark: |
| slim-nasnetalarge | 64 | 222.79 | 221.58 | 0.55% | :white_check_mark: |
| slim-resnet50v2 | 64 | 3,308.52 | 3,295.13 | 0.41% | :white_check_mark: |
| bert-mrpc-onnx | 8 | 1,143.12 | 1,131.65 | 1.01% | :white_check_mark: |
| bert-mrpc-tf | 1 | 479.43 | 478.53 | 0.19% | :white_check_mark: |
| pytorch-examples-wlang-gru | 1 | 295.97 | 294.77 | 0.41% | :white_check_mark: |
| pytorch-examples-wlang-lstm | 1 | 405.78 | 409.45 | -0.90% | :white_check_mark: |
| torchvision-resnet50_1 | 1 | 793.98 | 800.17 | -0.77% | :white_check_mark: |
| cadene-dpn92_1 | 1 | 413.65 | 411.44 | 0.54% | :white_check_mark: |
| cadene-resnext101_1 | 1 | 369.96 | 368.48 | 0.40% | :white_check_mark: |
| onnx-taau-downsample | 1 | 398.54 | 397.45 | 0.27% | :white_check_mark: |
| dlrm-criteoterabyte | 1 | 32.04 | 31.90 | 0.45% | :white_check_mark: |
| dlrm-criteoterabyte_fp16 | 1 | 51.02 | 50.96 | 0.12% | :white_check_mark: |
| agentmodel | 1 | 9,366.63 | 9,103.57 | 2.89% | :white_check_mark: |
| unet_fp16 | 2 | 58.93 | 58.78 | 0.27% | :white_check_mark: |
| resnet50v1_fp16 | 1 | 963.57 | 951.81 | 1.24% | :white_check_mark: |
| resnet50v1_int8 | 1 | 968.24 | 969.07 | -0.09% | :white_check_mark: |
| bert_base_cased_fp16 | 64 | 1,114.37 | 1,109.23 | 0.46% | :white_check_mark: |
| bert_large_uncased_fp16 | 32 | 345.55 | 343.63 | 0.56% | :white_check_mark: |
| bert_large_fp16 | 1 | 196.66 | 196.18 | 0.24% | :white_check_mark: |
| distilgpt2_fp16 | 16 | 2,106.52 | 2,093.09 | 0.64% | :white_check_mark: |
| yolov5s | 1 | 580.47 | 580.29 | 0.03% | :white_check_mark: |
| tinyllama | 1 | 43.95 | 43.78 | 0.39% | :white_check_mark: |
| vicuna-fastchat | 1 | 45.26 | 45.11 | 0.34% | :white_check_mark: |
| whisper-tiny-encoder | 1 | 411.37 | 409.17 | 0.54% | :white_check_mark: |
| whisper-tiny-decoder | 1 | 412.82 | 411.02 | 0.44% | :white_check_mark: |
| llama2_7b | 1 | 19.17 | 19.11 | 0.30% | :white_check_mark: |
| qwen1.5-7b | 1 | 23.51 | 23.42 | 0.42% | :white_check_mark: |
| phi3-3.8b | 1 | 26.67 | 26.58 | 0.35% | :white_check_mark: |
| mask-rcnn | 1 | 11.93 | 11.96 | -0.23% | :white_check_mark: |
| llama3-8b | 1 | 21.74 | 21.67 | 0.29% | :white_check_mark: |
| whisper-large-encoder | 1 | 10.22 | 10.17 | 0.51% | :white_check_mark: |
| whisper-large-decoder | 1 | 96.57 | 95.77 | 0.83% | :white_check_mark: |
| mistral-7b | 1 | 23.73 | 23.63 | 0.40% | :white_check_mark: |
| FLUX.1-schnell | 1 | 708.46 | 702.58 | 0.84% | :white_check_mark: |
| nan | nan | nan | nan | nan% | :x: |

This build is not recommended to merge :red_circle:

Sep 03 '25 16:09 migraphx-bot


:white_check_mark: bert-mrpc-onnx: PASSED: MIGraphX meets tolerance
:x: bert-mrpc-tf: ERROR - check error output
2025-09-03 10:20:56.197188: I tensorflow/core/platform/cpu_feature_guard.cc:210] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: SSE3 SSE4.1 SSE4.2 AVX AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
Traceback (most recent call last):
  File "/src/AMDMIGraphX/tools/accuracy/accuracy_checker.py", line 359, in <module>
    main()
  File "/src/AMDMIGraphX/tools/accuracy/accuracy_checker.py", line 306, in main
    graph = load_tf_graph(model_name)
  File "/src/AMDMIGraphX/tools/accuracy/accuracy_checker.py", line 300, in load_tf_graph
    graph_def.ParseFromString(f.read())
  File "/usr/local/lib/python3.10/dist-packages/tensorflow/python/lib/io/file_io.py", line 116, in read
    self._preread_check()
  File "/usr/local/lib/python3.10/dist-packages/tensorflow/python/lib/io/file_io.py", line 77, in _preread_check
    self._read_buf = _pywrap_file_io.BufferedInputStream(
tensorflow.python.framework.errors_impl.UnimplementedError: File system scheme '[local]' not implemented (file: '/new-saved-models/tf-misc/bert_mrpc1.pb')

:white_check_mark: pytorch-examples-wlang-gru: PASSED: MIGraphX meets tolerance
:white_check_mark: pytorch-examples-wlang-lstm: PASSED: MIGraphX meets tolerance
:white_check_mark: dlrm-criteoterabyte: PASSED: MIGraphX meets tolerance
:white_check_mark: agentmodel: PASSED: MIGraphX meets tolerance
:white_check_mark: unet: PASSED: MIGraphX meets tolerance
:white_check_mark: resnet50v1: PASSED: MIGraphX meets tolerance
:white_check_mark: bert_base_cased_fp16: PASSED: MIGraphX meets tolerance
:red_circle: bert_large_uncased_fp16: FAILED: MIGraphX is not within tolerance - check verbose output
:white_check_mark: bert_large: PASSED: MIGraphX meets tolerance
:white_check_mark: yolov5s: PASSED: MIGraphX meets tolerance
:white_check_mark: tinyllama: PASSED: MIGraphX meets tolerance
:white_check_mark: vicuna-fastchat: PASSED: MIGraphX meets tolerance
:white_check_mark: whisper-tiny-encoder: PASSED: MIGraphX meets tolerance
:white_check_mark: whisper-tiny-decoder: PASSED: MIGraphX meets tolerance
:white_check_mark: distilgpt2_fp16: PASSED: MIGraphX meets tolerance
:white_check_mark: llama2_7b: PASSED: MIGraphX meets tolerance
:white_check_mark: qwen1.5-7b: PASSED: MIGraphX meets tolerance
:white_check_mark: phi3-3.8b: PASSED: MIGraphX meets tolerance
:red_circle: mask-rcnn: FAILED: MIGraphX is not within tolerance - check verbose output
:white_check_mark: llama3-8b: PASSED: MIGraphX meets tolerance
:white_check_mark: whisper-large-decoder: PASSED: MIGraphX meets tolerance
:white_check_mark: mistral-7b: PASSED: MIGraphX meets tolerance
:white_check_mark: FLUX.1-schnell: PASSED: MIGraphX meets tolerance

Sep 03 '25 16:09 migraphx-bot