Llama2 7b model C++ example
Implements a C++ example for the Llama2 7b model (https://huggingface.co/amd/Llama-2-7b-chat-hf-awq-int4-asym-gs128-onnx/tree/main) using MIGraphX.
Details about the example and instructions for running it are available in the README (https://github.com/ROCm/AMDMIGraphX/tree/htec/mgx-llama2-7b-example/examples/transformers/mgx_llama2).
Codecov Report
All modified and coverable lines are covered by tests :white_check_mark:
Additional details and impacted files
```diff
@@            Coverage Diff             @@
##           develop    #3666      +/-   ##
===========================================
- Coverage    92.23%   92.16%   -0.07%
===========================================
  Files          514      521       +7
  Lines        21746    24610    +2864
===========================================
+ Hits         20057    22681    +2624
- Misses        1689     1929     +240
```
| Test | Batch | Rate new b49963 | Rate old 4b15b6 | Diff | Compare |
|---|---|---|---|---|---|
| torchvision-resnet50 | 64 | 3,254.17 | 3,257.89 | -0.11% | :white_check_mark: |
| torchvision-resnet50_fp16 | 64 | 6,988.94 | 6,918.72 | 1.01% | :white_check_mark: |
| torchvision-densenet121 | 32 | 2,432.26 | 2,432.91 | -0.03% | :white_check_mark: |
| torchvision-densenet121_fp16 | 32 | 4,085.65 | 4,086.33 | -0.02% | :white_check_mark: |
| torchvision-inceptionv3 | 32 | 1,627.76 | 1,628.71 | -0.06% | :white_check_mark: |
| torchvision-inceptionv3_fp16 | 32 | 2,742.43 | 2,745.51 | -0.11% | :white_check_mark: |
| cadene-inceptionv4 | 16 | 764.96 | 764.62 | 0.04% | :white_check_mark: |
| cadene-resnext64x4 | 16 | 810.58 | 814.13 | -0.44% | :white_check_mark: |
| slim-mobilenet | 64 | 7,461.51 | 7,458.12 | 0.05% | :white_check_mark: |
| slim-nasnetalarge | 64 | 208.46 | 209.03 | -0.27% | :white_check_mark: |
| slim-resnet50v2 | 64 | 3,440.03 | 3,436.59 | 0.10% | :white_check_mark: |
| bert-mrpc-onnx | 8 | 1,150.36 | 1,150.32 | 0.00% | :white_check_mark: |
| bert-mrpc-tf | 1 | 475.69 | 449.82 | 5.75% | :high_brightness: |
| pytorch-examples-wlang-gru | 1 | 428.15 | 437.44 | -2.12% | :white_check_mark: |
| pytorch-examples-wlang-lstm | 1 | 436.50 | 383.25 | 13.89% | :high_brightness: |
| torchvision-resnet50_1 | 1 | 777.19 | 741.45 | 4.82% | :high_brightness: |
| cadene-dpn92_1 | 1 | 399.33 | 400.49 | -0.29% | :white_check_mark: |
| cadene-resnext101_1 | 1 | 382.84 | 382.70 | 0.04% | :white_check_mark: |
| onnx-taau-downsample | 1 | 345.53 | 345.15 | 0.11% | :white_check_mark: |
| dlrm-criteoterabyte | 1 | 33.35 | 33.31 | 0.11% | :white_check_mark: |
| dlrm-criteoterabyte_fp16 | 1 | 52.51 | 52.72 | -0.40% | :white_check_mark: |
| agentmodel | 1 | 8,059.84 | 8,463.32 | -4.77% | :red_circle: |
| unet_fp16 | 2 | 58.74 | 58.77 | -0.05% | :white_check_mark: |
| resnet50v1_fp16 | 1 | 935.52 | 934.37 | 0.12% | :white_check_mark: |
| resnet50v1_int8 | 1 | 995.64 | 1,030.33 | -3.37% | :red_circle: |
| bert_base_cased_fp16 | 64 | 1,169.76 | 1,170.28 | -0.04% | :white_check_mark: |
| bert_large_uncased_fp16 | 32 | 363.68 | 363.03 | 0.18% | :white_check_mark: |
| bert_large_fp16 | 1 | 199.12 | 199.90 | -0.39% | :white_check_mark: |
| distilgpt2_fp16 | 16 | 2,197.04 | 2,197.64 | -0.03% | :white_check_mark: |
| yolov5s | 1 | 528.96 | 518.78 | 1.96% | :white_check_mark: |
| tinyllama | 1 | 43.34 | 43.68 | -0.79% | :white_check_mark: |
| vicuna-fastchat | 1 | 165.26 | 173.15 | -4.56% | :red_circle: |
| whisper-tiny-encoder | 1 | 418.07 | 417.60 | 0.11% | :white_check_mark: |
| whisper-tiny-decoder | 1 | 432.36 | 429.03 | 0.78% | :white_check_mark: |
This build is not recommended to merge :red_circle:
:red_circle:bert_large_uncased_fp16: FAILED: MIGraphX is not within tolerance - check verbose output
/AzurePipelines run
Azure Pipelines successfully started running 1 pipeline(s).