Add GroupQueryAttention with KV-Cache

Open turneram opened this issue 1 year ago • 3 comments

turneram, Sep 06 '24 17:09

Codecov Report

Attention: Patch coverage is 97.95222% with 6 lines in your changes missing coverage. Please review.

Project coverage is 92.17%. Comparing base (bdbe342) to head (725f34f). Report is 132 commits behind head on develop.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| src/include/migraphx/op/group_query_attention.hpp | 98.03% | 5 Missing :warning: |
| src/onnx/parse_group_query_attention.cpp | 96.87% | 1 Missing :warning: |

Additional details and impacted files
@@             Coverage Diff             @@
##           develop    #3425      +/-   ##
===========================================
+ Coverage    92.08%   92.17%   +0.09%     
===========================================
  Files          510      512       +2     
  Lines        21094    21385     +291     
===========================================
+ Hits         19424    19712     +288     
- Misses        1670     1673       +3     


codecov[bot], Sep 13 '24 18:09

| Test | Batch | Rate new (725f34) | Rate old (a1e339) | Diff | Compare |
|---|---|---|---|---|---|
| torchvision-resnet50 | 64 | 3,260.60 | 3,261.15 | -0.02% | :white_check_mark: |
| torchvision-resnet50_fp16 | 64 | 6,985.29 | 6,998.48 | -0.19% | :white_check_mark: |
| torchvision-densenet121 | 32 | 2,438.58 | 2,437.74 | 0.03% | :white_check_mark: |
| torchvision-densenet121_fp16 | 32 | 4,107.52 | 4,082.54 | 0.61% | :white_check_mark: |
| torchvision-inceptionv3 | 32 | 1,639.29 | 1,637.99 | 0.08% | :white_check_mark: |
| torchvision-inceptionv3_fp16 | 32 | 2,762.34 | 2,759.69 | 0.10% | :white_check_mark: |
| cadene-inceptionv4 | 16 | 776.94 | 777.37 | -0.05% | :white_check_mark: |
| cadene-resnext64x4 | 16 | 810.02 | 809.89 | 0.02% | :white_check_mark: |
| slim-mobilenet | 64 | 7,535.95 | 7,538.41 | -0.03% | :white_check_mark: |
| slim-nasnetalarge | 64 | 211.84 | 211.83 | 0.01% | :white_check_mark: |
| slim-resnet50v2 | 64 | 3,504.66 | 3,505.62 | -0.03% | :white_check_mark: |
| bert-mrpc-onnx | 8 | 1,148.36 | 1,153.97 | -0.49% | :white_check_mark: |
| bert-mrpc-tf | 1 | 465.22 | 463.62 | 0.35% | :white_check_mark: |
| pytorch-examples-wlang-gru | 1 | 423.64 | 416.54 | 1.71% | :white_check_mark: |
| pytorch-examples-wlang-lstm | 1 | 474.21 | 375.99 | 26.12% | :high_brightness: |
| torchvision-resnet50_1 | 1 | 783.58 | 787.35 | -0.48% | :white_check_mark: |
| cadene-dpn92_1 | 1 | 400.96 | 398.81 | 0.54% | :white_check_mark: |
| cadene-resnext101_1 | 1 | 384.06 | 382.80 | 0.33% | :white_check_mark: |
| onnx-taau-downsample | 1 | 342.41 | 342.52 | -0.03% | :white_check_mark: |
| dlrm-criteoterabyte | 1 | 33.33 | 33.35 | -0.07% | :white_check_mark: |
| dlrm-criteoterabyte_fp16 | 1 | 52.77 | 52.52 | 0.49% | :white_check_mark: |
| agentmodel | 1 | 10,167.28 | 8,468.76 | 20.06% | :high_brightness: |
| unet_fp16 | 2 | 58.94 | 58.99 | -0.08% | :white_check_mark: |
| resnet50v1_fp16 | 1 | 953.52 | 916.35 | 4.06% | :high_brightness: |
| resnet50v1_int8 | 1 | 980.77 | 972.58 | 0.84% | :white_check_mark: |
| bert_base_cased_fp16 | 64 | 1,172.32 | 1,170.84 | 0.13% | :white_check_mark: |
| bert_large_uncased_fp16 | 32 | 363.60 | 363.63 | -0.01% | :white_check_mark: |
| bert_large_fp16 | 1 | 201.32 | 199.03 | 1.15% | :white_check_mark: |
| distilgpt2_fp16 | 16 | 2,204.29 | 2,203.30 | 0.05% | :white_check_mark: |
| yolov5s | 1 | 531.15 | 528.92 | 0.42% | :white_check_mark: |
| tinyllama | 1 | 43.42 | 43.76 | -0.78% | :white_check_mark: |
| vicuna-fastchat | 1 | 171.63 | 173.75 | -1.22% | :white_check_mark: |
| whisper-tiny-encoder | 1 | 417.81 | 418.39 | -0.14% | :white_check_mark: |
| whisper-tiny-decoder | 1 | 427.35 | 427.20 | 0.04% | :white_check_mark: |

Check results before merge :high_brightness:

migraphx-bot, Oct 11 '24 06:10

:white_check_mark: bert-mrpc-onnx: PASSED: MIGraphX meets tolerance
:white_check_mark: bert-mrpc-tf: PASSED: MIGraphX meets tolerance
:white_check_mark: pytorch-examples-wlang-gru: PASSED: MIGraphX meets tolerance
:white_check_mark: pytorch-examples-wlang-lstm: PASSED: MIGraphX meets tolerance
:white_check_mark: torchvision-resnet50_1: PASSED: MIGraphX meets tolerance
:white_check_mark: cadene-dpn92_1: PASSED: MIGraphX meets tolerance
:white_check_mark: cadene-resnext101_1: PASSED: MIGraphX meets tolerance
:white_check_mark: dlrm-criteoterabyte: PASSED: MIGraphX meets tolerance
:white_check_mark: agentmodel: PASSED: MIGraphX meets tolerance
:white_check_mark: unet: PASSED: MIGraphX meets tolerance
:white_check_mark: resnet50v1: PASSED: MIGraphX meets tolerance
:white_check_mark: bert_base_cased_fp16: PASSED: MIGraphX meets tolerance
:red_circle: bert_large_uncased_fp16: FAILED: MIGraphX is not within tolerance - check verbose output
:white_check_mark: bert_large: PASSED: MIGraphX meets tolerance
:white_check_mark: yolov5s: PASSED: MIGraphX meets tolerance
:white_check_mark: tinyllama: PASSED: MIGraphX meets tolerance
:white_check_mark: vicuna-fastchat: PASSED: MIGraphX meets tolerance
:white_check_mark: whisper-tiny-encoder: PASSED: MIGraphX meets tolerance
:white_check_mark: whisper-tiny-decoder: PASSED: MIGraphX meets tolerance
:white_check_mark: distilgpt2_fp16: PASSED: MIGraphX meets tolerance

migraphx-bot, Oct 11 '24 06:10