MIOpen
Implement ResourceApplyGradientDescent
- Added ResourceApplyGradientDescent.
- Added driver test and gtest for ResourceApplyGradientDescent.
- New API is guarded by MIOPEN_BETA_API macro.
- Average over all cases:
  - ResourceApplyGradientDescent
| Type | Forward (average MIOpen over rocm ratio) |
|---|---|
| float16 | 2.08 |
| float32 | 1.92 |
| bfloat16 | 2.10 |
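
The summary figures can be cross-checked against the per-case tables: each "MIOpen over rocm" value equals `rocm_kernel_avg / kernel_duration`, and the arithmetic mean of the float32 ratios works out to about 1.92. The following is a minimal sketch of that calculation, not part of this PR; `Row`, `speedup`, and `mean_speedup` are hypothetical names chosen for illustration.

```cpp
#include <cassert>
#include <vector>

// One benchmark row: baseline timing (rocm_kernel_avg) and MIOpen timing
// (kernel_duration), both in the same unspecified time unit.
struct Row
{
    double rocm_kernel_avg;
    double kernel_duration;
};

// Per-case ratio, as listed in the "MIOpen over rocm" column.
double speedup(const Row& r)
{
    assert(r.kernel_duration > 0.0);
    return r.rocm_kernel_avg / r.kernel_duration;
}

// Arithmetic mean of the per-case ratios (0 for an empty set).
double mean_speedup(const std::vector<Row>& rows)
{
    if(rows.empty())
        return 0.0;
    double sum = 0.0;
    for(const Row& r : rows)
        sum += speedup(r);
    return sum / static_cast<double>(rows.size());
}
```

For example, `speedup(Row{11776, 4728})` reproduces the ratio of the first float16 case, `[50 100]`. Because the summary is a mean of ratios, the many small shapes weigh as much as the few large ones, so it sits above the ratios seen for the largest tensors.
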
FP16
| op_name | dtype | input_size | direction | rocm_kernel_avg | kernel_duration | MIOpen over rocm |
|---|---|---|---|---|---|---|
| ResourceApplyGradientDescent | float16 | [50 100] | fwd | 11776 | 4728 | 2.490693739 |
| ResourceApplyGradientDescent | float16 | [100 50] | fwd | 11280 | 4889 | 2.30722029 |
| ResourceApplyGradientDescent | float16 | [100 100] | fwd | 11408 | 4711 | 2.421566546 |
| ResourceApplyGradientDescent | float16 | [100 300] | fwd | 11504 | 4871 | 2.361732704 |
| ResourceApplyGradientDescent | float16 | [300 100] | fwd | 11584 | 4640 | 2.496551724 |
| ResourceApplyGradientDescent | float16 | [200 300] | fwd | 12016 | 4800 | 2.503333333 |
| ResourceApplyGradientDescent | float16 | [205 350] | fwd | 11856 | 5031 | 2.356589147 |
| ResourceApplyGradientDescent | float16 | [350 105] | fwd | 11488 | 4871 | 2.358447957 |
| ResourceApplyGradientDescent | float16 | [405 200] | fwd | 11936 | 5155 | 2.31542192 |
| ResourceApplyGradientDescent | float16 | [10 10 10] | fwd | 10176 | 3928 | 2.590631365 |
| ResourceApplyGradientDescent | float16 | [10 10 30] | fwd | 10944 | 4835 | 2.263495346 |
| ResourceApplyGradientDescent | float16 | [10 30 10] | fwd | 10976 | 4729 | 2.320998097 |
| ResourceApplyGradientDescent | float16 | [30 10 10] | fwd | 11600 | 4711 | 2.462322225 |
| ResourceApplyGradientDescent | float16 | [30 30 30] | fwd | 11552 | 4746 | 2.434049726 |
| ResourceApplyGradientDescent | float16 | [50 100 50] | fwd | 14416 | 6720 | 2.145238095 |
| ResourceApplyGradientDescent | float16 | [100 50 100] | fwd | 14768 | 8266 | 1.786595693 |
| ResourceApplyGradientDescent | float16 | [100 100 100] | fwd | 20576 | 11626 | 1.769826252 |
| ResourceApplyGradientDescent | float16 | [100 100 300] | fwd | 38336 | 25138 | 1.525021879 |
| ResourceApplyGradientDescent | float16 | [300 100 100] | fwd | 37936 | 25049 | 1.514471636 |
| ResourceApplyGradientDescent | float16 | [10 10 10 10] | fwd | 11152 | 4906 | 2.273134937 |
| ResourceApplyGradientDescent | float16 | [10 10 10 30] | fwd | 11584 | 4818 | 2.404317144 |
| ResourceApplyGradientDescent | float16 | [30 10 10 10] | fwd | 11568 | 4871 | 2.37487169 |
| ResourceApplyGradientDescent | float16 | [30 30 30 30] | fwd | 16432 | 10382 | 1.582739357 |
| ResourceApplyGradientDescent | float16 | [50 100 50 100] | fwd | 239327 | 172503 | 1.38737877 |
| ResourceApplyGradientDescent | float16 | [100 50 100 50] | fwd | 239439 | 172521 | 1.387883214 |
| ResourceApplyGradientDescent | float16 | [100 100 100 100] | fwd | 979692 | 678811 | 1.443247089 |
| ResourceApplyGradientDescent | float16 | [100 100 300 100] | fwd | 2860371 | 2021410 | 1.415037523 |
| ResourceApplyGradientDescent | float16 | [300 100 100 100] | fwd | 2864580 | 2023310 | 1.415788979 |
FP32
| op_name | dtype | input_size | direction | rocm_kernel_avg | kernel_duration | MIOpen over rocm |
|---|---|---|---|---|---|---|
| ResourceApplyGradientDescent | float32 | [50 100] | fwd | 9984 | 4764 | 2.095717884 |
| ResourceApplyGradientDescent | float32 | [100 50] | fwd | 9856 | 4711 | 2.092124814 |
| ResourceApplyGradientDescent | float32 | [100 100] | fwd | 10048 | 4729 | 2.124762106 |
| ResourceApplyGradientDescent | float32 | [100 300] | fwd | 10352 | 4942 | 2.094698503 |
| ResourceApplyGradientDescent | float32 | [300 100] | fwd | 10624 | 4871 | 2.181071649 |
| ResourceApplyGradientDescent | float32 | [200 300] | fwd | 11040 | 5173 | 2.134158129 |
| ResourceApplyGradientDescent | float32 | [205 350] | fwd | 11248 | 5155 | 2.181959263 |
| ResourceApplyGradientDescent | float32 | [350 105] | fwd | 11008 | 4871 | 2.259905564 |
| ResourceApplyGradientDescent | float32 | [405 200] | fwd | 11120 | 5386 | 2.064611957 |
| ResourceApplyGradientDescent | float32 | [10 10 10] | fwd | 8512 | 4889 | 1.74105134 |
| ResourceApplyGradientDescent | float32 | [10 10 30] | fwd | 9936 | 4746 | 2.093552465 |
| ResourceApplyGradientDescent | float32 | [10 30 10] | fwd | 10608 | 4782 | 2.218318695 |
| ResourceApplyGradientDescent | float32 | [30 10 10] | fwd | 10192 | 4711 | 2.163447251 |
| ResourceApplyGradientDescent | float32 | [30 30 30] | fwd | 10512 | 4942 | 2.127074059 |
| ResourceApplyGradientDescent | float32 | [50 100 50] | fwd | 13488 | 7484 | 1.802244789 |
| ResourceApplyGradientDescent | float32 | [100 50 100] | fwd | 14768 | 9280 | 1.59137931 |
| ResourceApplyGradientDescent | float32 | [100 100 100] | fwd | 24544 | 14987 | 1.637685995 |
| ResourceApplyGradientDescent | float32 | [100 100 300] | fwd | 52736 | 32658 | 1.614795762 |
| ResourceApplyGradientDescent | float32 | [300 100 100] | fwd | 52736 | 32338 | 1.63077494 |
| ResourceApplyGradientDescent | float32 | [10 10 10 10] | fwd | 10192 | 4728 | 2.155668359 |
| ResourceApplyGradientDescent | float32 | [10 10 10 30] | fwd | 10816 | 4924 | 2.19658814 |
| ResourceApplyGradientDescent | float32 | [30 10 10 10] | fwd | 10528 | 4924 | 2.138099106 |
| ResourceApplyGradientDescent | float32 | [30 30 30 30] | fwd | 21472 | 13191 | 1.627776514 |
| ResourceApplyGradientDescent | float32 | [50 100 50 100] | fwd | 375838 | 243829 | 1.541399916 |
| ResourceApplyGradientDescent | float32 | [100 50 100 50] | fwd | 376158 | 243794 | 1.542933788 |
| ResourceApplyGradientDescent | float32 | [100 100 100 100] | fwd | 1493306 | 945557 | 1.579287129 |
| ResourceApplyGradientDescent | float32 | [100 100 300 100] | fwd | 4483885 | 2815180 | 1.592752506 |
| ResourceApplyGradientDescent | float32 | [300 100 100 100] | fwd | 4464925 | 2816790 | 1.585111066 |
BFP16
| op_name | dtype | input_size | direction | rocm_kernel_avg | kernel_duration | MIOpen over rocm |
|---|---|---|---|---|---|---|
| ResourceApplyGradientDescent | bfloat16 | [50 100] | fwd | 15696 | 4818 | 3.257783313 |
| ResourceApplyGradientDescent | bfloat16 | [100 50] | fwd | 14480 | 4764 | 3.039462636 |
| ResourceApplyGradientDescent | bfloat16 | [100 100] | fwd | 13264 | 4835 | 2.743329886 |
| ResourceApplyGradientDescent | bfloat16 | [100 300] | fwd | 10624 | 4853 | 2.189161343 |
| ResourceApplyGradientDescent | bfloat16 | [300 100] | fwd | 10704 | 4764 | 2.246851385 |
| ResourceApplyGradientDescent | bfloat16 | [200 300] | fwd | 11056 | 4942 | 2.237150951 |
| ResourceApplyGradientDescent | bfloat16 | [205 350] | fwd | 9952 | 4924 | 2.02112104 |
| ResourceApplyGradientDescent | bfloat16 | [350 105] | fwd | 13104 | 4835 | 2.710237849 |
| ResourceApplyGradientDescent | bfloat16 | [405 200] | fwd | 10368 | 5262 | 1.970353478 |
| ResourceApplyGradientDescent | bfloat16 | [10 10 10] | fwd | 14768 | 3857 | 3.828882551 |
| ResourceApplyGradientDescent | bfloat16 | [10 10 30] | fwd | 13600 | 4711 | 2.886860539 |
| ResourceApplyGradientDescent | bfloat16 | [10 30 10] | fwd | 13360 | 4657 | 2.868799656 |
| ResourceApplyGradientDescent | bfloat16 | [30 10 10] | fwd | 13344 | 4711 | 2.832519635 |
| ResourceApplyGradientDescent | bfloat16 | [30 30 30] | fwd | 10512 | 4853 | 2.166082835 |
| ResourceApplyGradientDescent | bfloat16 | [50 100 50] | fwd | 10576 | 6755 | 1.56565507 |
| ResourceApplyGradientDescent | bfloat16 | [100 50 100] | fwd | 12448 | 8302 | 1.499397735 |
| ResourceApplyGradientDescent | bfloat16 | [100 100 100] | fwd | 19936 | 11893 | 1.676280165 |
| ResourceApplyGradientDescent | bfloat16 | [100 100 300] | fwd | 32688 | 25991 | 1.257666115 |
| ResourceApplyGradientDescent | bfloat16 | [300 100 100] | fwd | 33920 | 25760 | 1.316770186 |
| ResourceApplyGradientDescent | bfloat16 | [10 10 10 10] | fwd | 14752 | 4818 | 3.061851391 |
| ResourceApplyGradientDescent | bfloat16 | [10 10 10 30] | fwd | 11792 | 4835 | 2.438883144 |
| ResourceApplyGradientDescent | bfloat16 | [30 10 10 10] | fwd | 10528 | 4782 | 2.201589293 |
| ResourceApplyGradientDescent | bfloat16 | [30 30 30 30] | fwd | 14400 | 10417 | 1.382355765 |
| ResourceApplyGradientDescent | bfloat16 | [50 100 50 100] | fwd | 199342 | 180130 | 1.106656304 |
| ResourceApplyGradientDescent | bfloat16 | [100 50 100 50] | fwd | 197758 | 180183 | 1.097539724 |
| ResourceApplyGradientDescent | bfloat16 | [100 100 100 100] | fwd | 761881 | 721869 | 1.055428339 |
| ResourceApplyGradientDescent | bfloat16 | [100 100 300 100] | fwd | 2282092 | 2114300 | 1.079360545 |
| ResourceApplyGradientDescent | bfloat16 | [300 100 100 100] | fwd | 2282588 | 2123540 | 1.074897577 |
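
For readers unfamiliar with the operation, here is a minimal CPU reference sketch of the update it is expected to perform. It assumes the op follows the TensorFlow operation of the same name, which updates a variable in place as `var -= alpha * delta`; this PR description does not spell that out. The function name `apply_gradient_descent_ref` is hypothetical and is not the MIOpen API added here.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical host-side reference, NOT the MIOpen API.
// Assumed semantics: var[i] -= alpha * delta[i] for every element.
template <typename T>
void apply_gradient_descent_ref(std::vector<T>& var, T alpha, const std::vector<T>& delta)
{
    // The update is element-wise, so both buffers must hold the same number of elements.
    assert(var.size() == delta.size());
    for(std::size_t i = 0; i < var.size(); ++i)
        var[i] -= alpha * delta[i];
}
```

A host-side reference of this shape is the kind of thing a gtest could compare GPU output against, with a tolerance chosen per data type.
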