Llama2 7b model C++ example
Implements a C++ example for the Llama2 7b model (https://huggingface.co/amd/Llama-2-7b-chat-hf-awq-int4-asym-gs128-onnx/tree/main) using MIGraphX.
Details about the example and instructions for running it are available in the README (https://github.com/ROCm/AMDMIGraphX/tree/htec/mgx-llama2-7b-example/examples/transformers/mgx_llama2).
Codecov Report
All modified and coverable lines are covered by tests :white_check_mark:
Additional details and impacted files
```diff
@@            Coverage Diff             @@
##           develop    #3666      +/-   ##
===========================================
- Coverage    92.23%   92.16%   -0.07%
===========================================
  Files          514      521       +7
  Lines        21746    24610    +2864
===========================================
+ Hits         20057    22681    +2624
- Misses        1689     1929     +240
```
| Test | Batch | Rate new b49963 | Rate old 4b15b6 | Diff | Compare |
|---|---|---|---|---|---|
| torchvision-resnet50 | 64 | 3,254.17 | 3,257.89 | -0.11% | :white_check_mark: |
| torchvision-resnet50_fp16 | 64 | 6,988.94 | 6,918.72 | 1.01% | :white_check_mark: |
| torchvision-densenet121 | 32 | 2,432.26 | 2,432.91 | -0.03% | :white_check_mark: |
| torchvision-densenet121_fp16 | 32 | 4,085.65 | 4,086.33 | -0.02% | :white_check_mark: |
| torchvision-inceptionv3 | 32 | 1,627.76 | 1,628.71 | -0.06% | :white_check_mark: |
| torchvision-inceptionv3_fp16 | 32 | 2,742.43 | 2,745.51 | -0.11% | :white_check_mark: |
| cadene-inceptionv4 | 16 | 764.96 | 764.62 | 0.04% | :white_check_mark: |
| cadene-resnext64x4 | 16 | 810.58 | 814.13 | -0.44% | :white_check_mark: |
| slim-mobilenet | 64 | 7,461.51 | 7,458.12 | 0.05% | :white_check_mark: |
| slim-nasnetalarge | 64 | 208.46 | 209.03 | -0.27% | :white_check_mark: |
| slim-resnet50v2 | 64 | 3,440.03 | 3,436.59 | 0.10% | :white_check_mark: |
| bert-mrpc-onnx | 8 | 1,150.36 | 1,150.32 | 0.00% | :white_check_mark: |
| bert-mrpc-tf | 1 | 475.69 | 449.82 | 5.75% | :high_brightness: |
| pytorch-examples-wlang-gru | 1 | 428.15 | 437.44 | -2.12% | :white_check_mark: |
| pytorch-examples-wlang-lstm | 1 | 436.50 | 383.25 | 13.89% | :high_brightness: |
| torchvision-resnet50_1 | 1 | 777.19 | 741.45 | 4.82% | :high_brightness: |
| cadene-dpn92_1 | 1 | 399.33 | 400.49 | -0.29% | :white_check_mark: |
| cadene-resnext101_1 | 1 | 382.84 | 382.70 | 0.04% | :white_check_mark: |
| onnx-taau-downsample | 1 | 345.53 | 345.15 | 0.11% | :white_check_mark: |
| dlrm-criteoterabyte | 1 | 33.35 | 33.31 | 0.11% | :white_check_mark: |
| dlrm-criteoterabyte_fp16 | 1 | 52.51 | 52.72 | -0.40% | :white_check_mark: |
| agentmodel | 1 | 8,059.84 | 8,463.32 | -4.77% | :red_circle: |
| unet_fp16 | 2 | 58.74 | 58.77 | -0.05% | :white_check_mark: |
| resnet50v1_fp16 | 1 | 935.52 | 934.37 | 0.12% | :white_check_mark: |
| resnet50v1_int8 | 1 | 995.64 | 1,030.33 | -3.37% | :red_circle: |
| bert_base_cased_fp16 | 64 | 1,169.76 | 1,170.28 | -0.04% | :white_check_mark: |
| bert_large_uncased_fp16 | 32 | 363.68 | 363.03 | 0.18% | :white_check_mark: |
| bert_large_fp16 | 1 | 199.12 | 199.90 | -0.39% | :white_check_mark: |
| distilgpt2_fp16 | 16 | 2,197.04 | 2,197.64 | -0.03% | :white_check_mark: |
| yolov5s | 1 | 528.96 | 518.78 | 1.96% | :white_check_mark: |
| tinyllama | 1 | 43.34 | 43.68 | -0.79% | :white_check_mark: |
| vicuna-fastchat | 1 | 165.26 | 173.15 | -4.56% | :red_circle: |
| whisper-tiny-encoder | 1 | 418.07 | 417.60 | 0.11% | :white_check_mark: |
| whisper-tiny-decoder | 1 | 432.36 | 429.03 | 0.78% | :white_check_mark: |
This build is not recommended to merge :red_circle:
:red_circle:bert_large_uncased_fp16: FAILED: MIGraphX is not within tolerance - check verbose output
/AzurePipelines run
Azure Pipelines successfully started running 1 pipeline(s).