Preload tiles into LDS to improve performance of pointwise transposes

Open pfultz2 opened this issue 1 year ago • 6 comments

Fixes #3172.
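
For readers unfamiliar with the technique in the title: LDS is the local data share, the on-chip shared memory on AMD GPUs. A transpose read directly from global memory is strided on either the load or the store side. Staging a tile in fast scratch memory lets both sides use contiguous accesses. The following is only a CPU-side sketch of that tiling idea, not the kernel code from this PR. The function name `tiled_transpose` and the constant `tile_dim` are hypothetical, and the small local array merely stands in for LDS.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical tile edge; a real GPU kernel would size this to fit in LDS.
constexpr std::size_t tile_dim = 4;

// Transpose a row-major rows x cols matrix into a row-major cols x rows
// matrix, staging one tile at a time in a small scratch buffer
// (the CPU stand-in for LDS in this sketch).
std::vector<float>
tiled_transpose(const std::vector<float>& in, std::size_t rows, std::size_t cols)
{
    assert(in.size() == rows * cols);
    std::vector<float> out(rows * cols);
    float tile[tile_dim][tile_dim];
    for(std::size_t r0 = 0; r0 < rows; r0 += tile_dim)
    {
        for(std::size_t c0 = 0; c0 < cols; c0 += tile_dim)
        {
            // Edge tiles may be smaller than tile_dim.
            const std::size_t tr = std::min(tile_dim, rows - r0);
            const std::size_t tc = std::min(tile_dim, cols - c0);
            // Load phase: walk input rows, so reads are contiguous.
            for(std::size_t r = 0; r < tr; r++)
                for(std::size_t c = 0; c < tc; c++)
                    tile[r][c] = in[(r0 + r) * cols + (c0 + c)];
            // Store phase: walk output rows, so writes are contiguous;
            // the strided access only touches the scratch tile.
            for(std::size_t c = 0; c < tc; c++)
                for(std::size_t r = 0; r < tr; r++)
                    out[(c0 + c) * rows + (r0 + r)] = tile[r][c];
        }
    }
    return out;
}
```

Calling `tiled_transpose(in, 2, 3)` on a 2 x 3 input returns the 3 x 2 transpose. On a GPU the load and store phases would be split across the threads of a workgroup with a barrier between them, which this sequential sketch does not model.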

pfultz2 avatar Aug 12 '24 15:08 pfultz2

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Project coverage is 92.02%. Comparing base (e230c02) to head (289067b). Report is 161 commits behind head on develop.

Additional details and impacted files
@@             Coverage Diff             @@
##           develop    #3362      +/-   ##
===========================================
- Coverage    92.04%   92.02%   -0.03%     
===========================================
  Files          506      509       +3     
  Lines        20872    21005     +133     
===========================================
+ Hits         19212    19330     +118     
- Misses        1660     1675      +15     
Flag Coverage Δ
92.02% <100.00%> (-0.03%) :arrow_down:


codecov[bot] avatar Aug 18 '24 00:08 codecov[bot]

@kahmed10 there are some other failures in CI with this one.

TedThemistokleous avatar Aug 28 '24 21:08 TedThemistokleous

It seems to be failing test_spacetodepth_example_cpu from the ONNX backend. The test name says CPU, but the compiled program looks to be using the GPU...

CharlieL7 avatar Aug 29 '24 15:08 CharlieL7

@pfultz2 Looks like it's still failing in CI, but now on pooling in the all-targets build in Jenkins.

] 354/382 Test #365: test_api_custom_op_gpu ....................................................***Exception: SegFault  1.84 sec

Is this related to your PR #3423, which removes padding from pooling? It looks like some of the failures are in pooling itself.

TedThemistokleous avatar Sep 07 '24 03:09 TedThemistokleous

| Test | Batch | Rate new 4a20e6 | Rate old 1ab830 | Diff | Compare |
|---|---|---|---|---|---|
| torchvision-resnet50 | 64 | 3,249.19 | 3,257.25 | -0.25% | :white_check_mark: |
| torchvision-resnet50_fp16 | 64 | 6,991.94 | 6,998.54 | -0.09% | :white_check_mark: |
| torchvision-densenet121 | 32 | 2,431.98 | 2,431.43 | 0.02% | :white_check_mark: |
| torchvision-densenet121_fp16 | 32 | 4,097.00 | 4,095.87 | 0.03% | :white_check_mark: |
| torchvision-inceptionv3 | 32 | 1,638.50 | 1,637.26 | 0.08% | :white_check_mark: |
| torchvision-inceptionv3_fp16 | 32 | 2,744.19 | 2,742.87 | 0.05% | :white_check_mark: |
| cadene-inceptionv4 | 16 | 779.12 | 779.29 | -0.02% | :white_check_mark: |
| cadene-resnext64x4 | 16 | 808.38 | 807.81 | 0.07% | :white_check_mark: |
| slim-mobilenet | 64 | 7,456.72 | 7,455.44 | 0.02% | :white_check_mark: |
| slim-nasnetalarge | 64 | 208.18 | 208.13 | 0.02% | :white_check_mark: |
| slim-resnet50v2 | 64 | 3,435.43 | 3,441.14 | -0.17% | :white_check_mark: |
| bert-mrpc-onnx | 8 | 1,148.21 | 1,155.05 | -0.59% | :white_check_mark: |
| bert-mrpc-tf | 1 | 306.87 | 317.57 | -3.37% | :red_circle: |
| pytorch-examples-wlang-gru | 1 | 421.54 | 386.94 | 8.94% | :high_brightness: |
| pytorch-examples-wlang-lstm | 1 | 379.34 | 381.97 | -0.69% | :white_check_mark: |
| torchvision-resnet50_1 | 1 | 772.28 | 801.56 | -3.65% | :red_circle: |
| cadene-dpn92_1 | 1 | 437.16 | 400.33 | 9.20% | :high_brightness: |
| cadene-resnext101_1 | 1 | 383.35 | 383.10 | 0.07% | :white_check_mark: |
| onnx-taau-downsample | 1 | 366.49 | 343.39 | 6.72% | :high_brightness: |
| dlrm-criteoterabyte | 1 | 35.05 | 35.03 | 0.04% | :white_check_mark: |
| dlrm-criteoterabyte_fp16 | 1 | 58.08 | 58.13 | -0.09% | :white_check_mark: |
| agentmodel | 1 | 8,111.35 | 8,052.83 | 0.73% | :white_check_mark: |
| unet_fp16 | 2 | 58.89 | 57.80 | 1.90% | :white_check_mark: |
| resnet50v1_fp16 | 1 | 927.80 | 939.96 | -1.29% | :white_check_mark: |
| resnet50v1_int8 | 1 | 947.01 | 969.54 | -2.32% | :white_check_mark: |
| bert_base_cased_fp16 | 64 | 1,153.59 | 1,172.44 | -1.61% | :white_check_mark: |
| bert_large_uncased_fp16 | 32 | 355.73 | 362.82 | -1.95% | :white_check_mark: |
| bert_large_fp16 | 1 | 210.28 | 214.00 | -1.74% | :white_check_mark: |
| distilgpt2_fp16 | 16 | 2,162.07 | 2,204.43 | -1.92% | :white_check_mark: |
| yolov5s | 1 | 546.47 | 533.64 | 2.41% | :white_check_mark: |
| tinyllama | 1 | 43.39 | 43.45 | -0.15% | :white_check_mark: |
| vicuna-fastchat | 1 | 169.47 | 168.60 | 0.51% | :white_check_mark: |
| whisper-tiny-encoder | 1 | 417.94 | 417.95 | -0.00% | :white_check_mark: |
| whisper-tiny-decoder | 1 | 435.99 | 426.15 | 2.31% | :white_check_mark: |

This build is not recommended to merge :red_circle:

migraphx-bot avatar Sep 29 '24 19:09 migraphx-bot


     :white_check_mark: bert-mrpc-onnx: PASSED: MIGraphX meets tolerance
     :white_check_mark: bert-mrpc-tf: PASSED: MIGraphX meets tolerance
     :white_check_mark: pytorch-examples-wlang-gru: PASSED: MIGraphX meets tolerance
     :white_check_mark: pytorch-examples-wlang-lstm: PASSED: MIGraphX meets tolerance
     :white_check_mark: torchvision-resnet50_1: PASSED: MIGraphX meets tolerance
     :white_check_mark: cadene-dpn92_1: PASSED: MIGraphX meets tolerance
     :white_check_mark: cadene-resnext101_1: PASSED: MIGraphX meets tolerance
     :white_check_mark: dlrm-criteoterabyte: PASSED: MIGraphX meets tolerance
     :white_check_mark: agentmodel: PASSED: MIGraphX meets tolerance
     :white_check_mark: unet: PASSED: MIGraphX meets tolerance
     :white_check_mark: resnet50v1: PASSED: MIGraphX meets tolerance
     :white_check_mark: bert_base_cased_fp16: PASSED: MIGraphX meets tolerance
     :red_circle: bert_large_uncased_fp16: FAILED: MIGraphX is not within tolerance - check verbose output
     :white_check_mark: bert_large: PASSED: MIGraphX meets tolerance
     :white_check_mark: yolov5s: PASSED: MIGraphX meets tolerance
     :white_check_mark: tinyllama: PASSED: MIGraphX meets tolerance
     :white_check_mark: vicuna-fastchat: PASSED: MIGraphX meets tolerance
     :white_check_mark: whisper-tiny-encoder: PASSED: MIGraphX meets tolerance
     :white_check_mark: whisper-tiny-decoder: PASSED: MIGraphX meets tolerance
     :white_check_mark: distilgpt2_fp16: PASSED: MIGraphX meets tolerance

migraphx-bot avatar Sep 29 '24 19:09 migraphx-bot