AMDMIGraphX
Add propagate_precision pass
Codecov Report
Attention: Patch coverage is 94.11765%, with 5 lines in your changes missing coverage. Please review.
Project coverage is 91.76%. Comparing base (84fc9f0) to head (da8471d).
Files | Patch % | Lines |
---|---|---|
src/propagate_precision.cpp | 94.04% | 5 Missing :warning: |
Additional details and impacted files
```
@@           Coverage Diff            @@
##           develop    #2853   +/-   ##
===========================================
+ Coverage    91.75%   91.76%   +0.01%
===========================================
  Files          473      475       +2
  Lines        17958    18043      +85
===========================================
+ Hits         16478    16558      +80
- Misses        1480     1485       +5
```
:umbrella: View full report in Codecov by Sentry.
Test | Batch | Rate new 0f785b | Rate old 5ba023 | Diff | Compare |
---|---|---|---|---|---|
torchvision-resnet50 | 64 | 2,826.08 | 2,824.95 | 0.04% | :white_check_mark: |
torchvision-resnet50_fp16 | 64 | 6,579.77 | 6,575.26 | 0.07% | :white_check_mark: |
torchvision-densenet121 | 32 | 2,105.76 | 2,101.96 | 0.18% | :white_check_mark: |
torchvision-densenet121_fp16 | 32 | 3,696.98 | 3,683.24 | 0.37% | :white_check_mark: |
torchvision-inceptionv3 | 32 | 1,602.40 | 1,606.31 | -0.24% | :white_check_mark: |
torchvision-inceptionv3_fp16 | 32 | 2,551.04 | 2,555.93 | -0.19% | :white_check_mark: |
cadene-inceptionv4 | 16 | 717.22 | 717.78 | -0.08% | :white_check_mark: |
cadene-resnext64x4 | 16 | 680.55 | 680.75 | -0.03% | :white_check_mark: |
slim-mobilenet | 64 | 5,900.63 | 5,910.66 | -0.17% | :white_check_mark: |
slim-nasnetalarge | 64 | 153.92 | 153.88 | 0.03% | :white_check_mark: |
slim-resnet50v2 | 64 | 2,592.86 | 2,590.48 | 0.09% | :white_check_mark: |
bert-mrpc-onnx | 8 | 920.94 | 960.32 | -4.10% | :red_circle: |
bert-mrpc-tf | 1 | 399.57 | 400.86 | -0.32% | :white_check_mark: |
pytorch-examples-wlang-gru | 1 | 392.57 | 394.04 | -0.37% | :white_check_mark: |
pytorch-examples-wlang-lstm | 1 | 368.63 | 366.42 | 0.60% | :white_check_mark: |
torchvision-resnet50_1 | 1 | 603.29 | 606.33 | -0.50% | :white_check_mark: |
cadene-dpn92_1 | 1 | 389.75 | 393.42 | -0.93% | :white_check_mark: |
cadene-resnext101_1 | 1 | 331.98 | 332.05 | -0.02% | :white_check_mark: |
onnx-taau-downsample | 1 | 307.25 | 307.56 | -0.10% | :white_check_mark: |
dlrm-criteoterabyte | 1 | 28.79 | 28.80 | -0.01% | :white_check_mark: |
dlrm-criteoterabyte_fp16 | 1 | 48.40 | 48.29 | 0.24% | :white_check_mark: |
agentmodel | 1 | 7,243.91 | 7,346.89 | -1.40% | :white_check_mark: |
unet_fp16 | 2 | 57.79 | 57.56 | 0.39% | :white_check_mark: |
resnet50v1_fp16 | 1 | 910.81 | 917.25 | -0.70% | :white_check_mark: |
resnet50v1_int8 | 1 | 794.12 | 815.05 | -2.57% | :white_check_mark: |
bert_base_cased_fp16 | 64 | 1,053.46 | 1,053.37 | 0.01% | :white_check_mark: |
bert_large_uncased_fp16 | 32 | 301.66 | 301.70 | -0.02% | :white_check_mark: |
bert_large_fp16 | 1 | 158.70 | 158.88 | -0.12% | :white_check_mark: |
distilgpt2_fp16 | 16 | 1,858.05 | 1,860.49 | -0.13% | :white_check_mark: |
yolov5s | 1 | 475.81 | 481.01 | -1.08% | :white_check_mark: |
tinyllama | 1 | 32.99 | 33.01 | -0.06% | :white_check_mark: |
vicuna-fastchat | 1 | 157.29 | 159.19 | -1.19% | :white_check_mark: |
whisper-tiny-encoder | 1 | 348.02 | 347.33 | 0.20% | :white_check_mark: |
whisper-tiny-decoder | 1 | 395.52 | 396.69 | -0.30% | :white_check_mark: |
This build is not recommended to merge :red_circle:
:red_circle:bert_large_uncased_fp16: FAILED: MIGraphX is not within tolerance - check verbose output
For some background: where are we failing accuracy because of precision changes?
This is related to the fp16 inaccuracy with llamav2 (see #2556). #2883 will use FP32 for large reduce_means, but that alone still isn't enough to get accurate results (or to avoid NaNs). So this pass will use FP32 for the x^2/n computed on the input, and FP32 for the rsqrt(mean + epsilon) that follows the reduce_mean.
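To illustrate the numerical issue (a minimal NumPy sketch, not MIGraphX code): in an RMSNorm-style computation, squaring the input in FP16 overflows once |x| exceeds about 255, since FP16's maximum finite value is 65504. Keeping the x^2/n and rsqrt(mean + epsilon) steps in FP32, as described above, avoids this. Function names and values here are illustrative assumptions.

```python
import numpy as np

def rmsnorm_fp16(x, eps=np.float16(1e-5)):
    # Naive all-FP16 version: x * x overflows to inf for |x| > ~255,
    # so the result is garbage (zeros here, since x / inf == 0).
    x = x.astype(np.float16)
    mean_sq = (x * x).mean()                  # x^2/n entirely in FP16
    return x / np.sqrt(mean_sq + eps)

def rmsnorm_mixed(x, eps=1e-5):
    # Mixed-precision version: the mean-of-squares and the rsqrt are
    # computed in FP32; only the surrounding tensor stays FP16.
    x16 = x.astype(np.float16)
    mean_sq = np.mean(x16.astype(np.float32) ** 2)   # FP32 x^2/n
    scale = 1.0 / np.sqrt(mean_sq + eps)             # FP32 rsqrt(mean + eps)
    return (x16.astype(np.float32) * scale).astype(np.float16)

x = np.full(1024, 300.0)   # 300^2 = 90000 > FP16 max (65504)
print(rmsnorm_fp16(x)[:4])   # overflowed: incorrect values
print(rmsnorm_mixed(x)[:4])  # finite values close to 1.0
```

The mixed version normalizes a constant vector to roughly 1.0 as expected, while the all-FP16 version silently collapses, which mirrors the NaN/accuracy failures seen with llamav2.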