Add GroupQueryAttention with KV-Cache
Codecov Report
Attention: Patch coverage is 97.95222% with 6 lines in your changes missing coverage. Please review.
Project coverage is 92.17%. Comparing base (bdbe342) to head (725f34f). Report is 132 commits behind head on develop.
| Files with missing lines | Patch % | Lines |
|---|---|---|
| src/include/migraphx/op/group_query_attention.hpp | 98.03% | 5 Missing :warning: |
| src/onnx/parse_group_query_attention.cpp | 96.87% | 1 Missing :warning: |
Additional details and impacted files
@@ Coverage Diff @@
## develop #3425 +/- ##
===========================================
+ Coverage 92.08% 92.17% +0.09%
===========================================
Files 510 512 +2
Lines 21094 21385 +291
===========================================
+ Hits 19424 19712 +288
- Misses 1670 1673 +3
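The coverage figures in the diff above can be cross-checked from the hit and line counts. The snippet below is a minimal sketch, not Codecov's actual implementation: the helper name `coverage_percent` is hypothetical, and truncation (rather than rounding) to two decimals is an assumption, chosen only because it reproduces both reported values here.

```python
import math


def coverage_percent(hits: int, lines: int) -> float:
    """Coverage in percent, truncated to two decimals (assumed behaviour)."""
    if lines <= 0:
        raise ValueError("lines must be positive")
    # Truncate instead of rounding: 19712 / 21385 is about 92.1768 %,
    # which the report shows as 92.17 %.
    return math.floor(hits / lines * 10_000) / 100


# Counts taken from the coverage diff above.
base = coverage_percent(hits=19424, lines=21094)
head = coverage_percent(hits=19712, lines=21385)

# Hits plus misses should account for every tracked line.
assert 19424 + 1670 == 21094
assert 19712 + 1673 == 21385

print(f"base={base:.2f}% head={head:.2f}% delta={head - base:+.2f}%")
```

The same ratio explains the patch figure: 6 missing lines at 97.95222% coverage corresponds to 287 of 293 changed lines being hit.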
| Test | Batch | Rate new 725f34 | Rate old a1e339 | Diff | Compare |
|---|---|---|---|---|---|
| torchvision-resnet50 | 64 | 3,260.60 | 3,261.15 | -0.02% | :white_check_mark: |
| torchvision-resnet50_fp16 | 64 | 6,985.29 | 6,998.48 | -0.19% | :white_check_mark: |
| torchvision-densenet121 | 32 | 2,438.58 | 2,437.74 | 0.03% | :white_check_mark: |
| torchvision-densenet121_fp16 | 32 | 4,107.52 | 4,082.54 | 0.61% | :white_check_mark: |
| torchvision-inceptionv3 | 32 | 1,639.29 | 1,637.99 | 0.08% | :white_check_mark: |
| torchvision-inceptionv3_fp16 | 32 | 2,762.34 | 2,759.69 | 0.10% | :white_check_mark: |
| cadene-inceptionv4 | 16 | 776.94 | 777.37 | -0.05% | :white_check_mark: |
| cadene-resnext64x4 | 16 | 810.02 | 809.89 | 0.02% | :white_check_mark: |
| slim-mobilenet | 64 | 7,535.95 | 7,538.41 | -0.03% | :white_check_mark: |
| slim-nasnetalarge | 64 | 211.84 | 211.83 | 0.01% | :white_check_mark: |
| slim-resnet50v2 | 64 | 3,504.66 | 3,505.62 | -0.03% | :white_check_mark: |
| bert-mrpc-onnx | 8 | 1,148.36 | 1,153.97 | -0.49% | :white_check_mark: |
| bert-mrpc-tf | 1 | 465.22 | 463.62 | 0.35% | :white_check_mark: |
| pytorch-examples-wlang-gru | 1 | 423.64 | 416.54 | 1.71% | :white_check_mark: |
| pytorch-examples-wlang-lstm | 1 | 474.21 | 375.99 | 26.12% | :high_brightness: |
| torchvision-resnet50_1 | 1 | 783.58 | 787.35 | -0.48% | :white_check_mark: |
| cadene-dpn92_1 | 1 | 400.96 | 398.81 | 0.54% | :white_check_mark: |
| cadene-resnext101_1 | 1 | 384.06 | 382.80 | 0.33% | :white_check_mark: |
| onnx-taau-downsample | 1 | 342.41 | 342.52 | -0.03% | :white_check_mark: |
| dlrm-criteoterabyte | 1 | 33.33 | 33.35 | -0.07% | :white_check_mark: |
| dlrm-criteoterabyte_fp16 | 1 | 52.77 | 52.52 | 0.49% | :white_check_mark: |
| agentmodel | 1 | 10,167.28 | 8,468.76 | 20.06% | :high_brightness: |
| unet_fp16 | 2 | 58.94 | 58.99 | -0.08% | :white_check_mark: |
| resnet50v1_fp16 | 1 | 953.52 | 916.35 | 4.06% | :high_brightness: |
| resnet50v1_int8 | 1 | 980.77 | 972.58 | 0.84% | :white_check_mark: |
| bert_base_cased_fp16 | 64 | 1,172.32 | 1,170.84 | 0.13% | :white_check_mark: |
| bert_large_uncased_fp16 | 32 | 363.60 | 363.63 | -0.01% | :white_check_mark: |
| bert_large_fp16 | 1 | 201.32 | 199.03 | 1.15% | :white_check_mark: |
| distilgpt2_fp16 | 16 | 2,204.29 | 2,203.30 | 0.05% | :white_check_mark: |
| yolov5s | 1 | 531.15 | 528.92 | 0.42% | :white_check_mark: |
| tinyllama | 1 | 43.42 | 43.76 | -0.78% | :white_check_mark: |
| vicuna-fastchat | 1 | 171.63 | 173.75 | -1.22% | :white_check_mark: |
| whisper-tiny-encoder | 1 | 417.81 | 418.39 | -0.14% | :white_check_mark: |
| whisper-tiny-decoder | 1 | 427.35 | 427.20 | 0.04% | :white_check_mark: |
Check results before merge :high_brightness:
:red_circle: bert_large_uncased_fp16: FAILED: MIGraphX is not within tolerance - check verbose output
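The Diff column in the performance table is consistent with the relative change of the new rate against the old rate. A minimal sketch of that arithmetic follows; the function name `relative_diff_percent` is hypothetical, and the threshold the bot uses to mark a row with :high_brightness: is not stated in the report, so it is not modelled here.

```python
def relative_diff_percent(new_rate: float, old_rate: float) -> float:
    """Relative change of new_rate versus old_rate, in percent."""
    if old_rate == 0:
        raise ValueError("old_rate must be non-zero")
    return (new_rate - old_rate) / old_rate * 100.0


# (new rate, old rate) pairs copied from the table above.
rows = {
    "pytorch-examples-wlang-lstm": (474.21, 375.99),
    "agentmodel": (10167.28, 8468.76),
    "resnet50v1_fp16": (953.52, 916.35),
    "bert-mrpc-onnx": (1148.36, 1153.97),
}

# Round to two decimals, matching the precision shown in the table.
diffs = {
    name: round(relative_diff_percent(new, old), 2)
    for name, (new, old) in rows.items()
}

for name, diff in diffs.items():
    print(f"{name}: {diff:+.2f}%")
```

The three flagged rows (26.12%, 20.06%, 4.06%) are all speedups in models unrelated to attention, which suggests run-to-run variance on small-batch tests rather than an effect of this change.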