MIOpen
Implement ResourceApplyKerasMomentum
- Added ResourceApplyKerasMomentum.
- Added driver test and gtest for ResourceApplyKerasMomentum.
- New API is guarded by the MIOPEN_BETA_API macro.
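
For reviewers unfamiliar with the operation, the following is a minimal reference sketch of the update rule, assuming the semantics documented for TensorFlow's `ResourceApplyKerasMomentum` op (`accum = accum * momentum - lr * grad`, then `var += accum`, with a look-ahead variant when `use_nesterov` is set). The function name `keras_momentum_step` and its list-based signature are hypothetical and chosen for illustration only; this is not the MIOpen API or the kernel added in this PR.

```python
def keras_momentum_step(var, accum, grad, lr, momentum, use_nesterov=False):
    """Return updated (var, accum) lists after one Keras-style momentum step.

    Hypothetical reference helper: plain Python lists stand in for tensors.
    """
    new_var, new_accum = [], []
    for v, a, g in zip(var, accum, grad):
        # accum = accum * momentum - lr * grad
        a = a * momentum - lr * g
        if use_nesterov:
            # Nesterov variant: var += accum * momentum - lr * grad
            v = v + a * momentum - lr * g
        else:
            # Plain momentum: var += accum
            v = v + a
        new_var.append(v)
        new_accum.append(a)
    return new_var, new_accum


if __name__ == "__main__":
    var, accum = keras_momentum_step([1.0], [0.5], [2.0], lr=0.1, momentum=0.9)
    print(var, accum)
```

A reference like this is mainly useful as a CPU-side oracle when checking GPU results element by element.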
Average "MIOpen over rocm" ratio over all cases for ResourceApplyKerasMomentum (mean of the per-case ratios in the tables below):
| Type | Forward |
|---|---|
| float16 | 5.10 |
| float32 | 4.44 |
| bfloat16 | 4.67 |
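
The "MIOpen over rocm" column in the per-case tables is consistent with `rocm_kernel_avg / kernel_duration`, and the averages above match the mean of that column per dtype. A small sketch of that arithmetic follows; the helper names `speedup` and `mean_speedup` are hypothetical, and the sample rows are the first three float16 entries from the FP16 table.

```python
def speedup(rocm_kernel_avg, kernel_duration):
    """Ratio of the rocm kernel time to the MIOpen kernel time."""
    return rocm_kernel_avg / kernel_duration


def mean_speedup(rows):
    """Arithmetic mean of per-case ratios; rows are (rocm_kernel_avg, kernel_duration)."""
    ratios = [speedup(rocm, miopen) for rocm, miopen in rows]
    return sum(ratios) / len(ratios)


if __name__ == "__main__":
    # First three float16 rows with use_nesterov = TRUE.
    sample = [(40928, 5777), (40768, 5671), (40960, 5706)]
    for rocm, miopen in sample:
        print(f"{rocm} / {miopen} = {speedup(rocm, miopen):.6f}")
    print(f"mean over sample: {mean_speedup(sample):.6f}")
```

Running the same calculation over all rows of one dtype should reproduce the corresponding entry in the summary table.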
FP16
| op_name | dtype | input_size | use_nesterov | rocm_kernel_avg | kernel_duration | MIOpen over rocm |
|---|---|---|---|---|---|---|
| ResourceApplyKerasMomentum | float16 | [50 100] | TRUE | 40928 | 5777 | 7.08464601 |
| ResourceApplyKerasMomentum | float16 | [100 50] | TRUE | 40768 | 5671 | 7.188855581 |
| ResourceApplyKerasMomentum | float16 | [100 100] | TRUE | 40960 | 5706 | 7.178408693 |
| ResourceApplyKerasMomentum | float16 | [100 300] | TRUE | 41472 | 5973 | 6.943244601 |
| ResourceApplyKerasMomentum | float16 | [300 100] | TRUE | 41615 | 5813 | 7.158954068 |
| ResourceApplyKerasMomentum | float16 | [200 300] | TRUE | 43280 | 6222 | 6.955962713 |
| ResourceApplyKerasMomentum | float16 | [205 350] | TRUE | 42575 | 6328 | 6.728034134 |
| ResourceApplyKerasMomentum | float16 | [350 105] | TRUE | 42111 | 6008 | 7.009154461 |
| ResourceApplyKerasMomentum | float16 | [405 200] | TRUE | 43616 | 6328 | 6.892541087 |
| ResourceApplyKerasMomentum | float16 | [10 10 10] | TRUE | 35808 | 4906 | 7.298817774 |
| ResourceApplyKerasMomentum | float16 | [10 10 30] | TRUE | 39152 | 5706 | 6.861549246 |
| ResourceApplyKerasMomentum | float16 | [10 30 10] | TRUE | 38912 | 5724 | 6.798043326 |
| ResourceApplyKerasMomentum | float16 | [30 10 10] | TRUE | 38256 | 5777 | 6.622122209 |
| ResourceApplyKerasMomentum | float16 | [30 30 30] | TRUE | 41135 | 5884 | 6.990992522 |
| ResourceApplyKerasMomentum | float16 | [50 100 50] | TRUE | 48688 | 8284 | 5.877353935 |
| ResourceApplyKerasMomentum | float16 | [100 50 100] | TRUE | 50607 | 10150 | 4.98591133 |
| ResourceApplyKerasMomentum | float16 | [100 100 100] | TRUE | 66336 | 15466 | 4.289150394 |
| ResourceApplyKerasMomentum | float16 | [10 10 10 10] | TRUE | 39327 | 5795 | 6.786367558 |
| ResourceApplyKerasMomentum | float16 | [10 10 10 30] | TRUE | 39600 | 5937 | 6.670035371 |
| ResourceApplyKerasMomentum | float16 | [30 10 10 10] | TRUE | 40095 | 5991 | 6.692538808 |
| ResourceApplyKerasMomentum | float16 | [30 30 30 30] | TRUE | 53647 | 13422 | 3.996945314 |
| ResourceApplyKerasMomentum | float16 | [50 100 50 100] | TRUE | 798918 | 264049 | 3.025642968 |
| ResourceApplyKerasMomentum | float16 | [100 50 100 50] | TRUE | 796807 | 263426 | 3.024784949 |
| ResourceApplyKerasMomentum | float16 | [100 100 100 100] | TRUE | 3178683 | 997281 | 3.187349403 |
| ResourceApplyKerasMomentum | float16 | [10 10 10 10 10] | TRUE | 40767 | 6471 | 6.299953639 |
| ResourceApplyKerasMomentum | float16 | [10 10 10 30 30] | TRUE | 63535 | 14293 | 4.445182957 |
| ResourceApplyKerasMomentum | float16 | [30 10 10 10 10] | TRUE | 47135 | 8568 | 5.501283847 |
| ResourceApplyKerasMomentum | float16 | [10 20 15 20 20] | TRUE | 69279 | 17510 | 3.956539121 |
| ResourceApplyKerasMomentum | float16 | [50 100] | FALSE | 28288 | 5759 | 4.911963883 |
| ResourceApplyKerasMomentum | float16 | [100 50] | FALSE | 27504 | 5742 | 4.789968652 |
| ResourceApplyKerasMomentum | float16 | [100 100] | FALSE | 27280 | 5848 | 4.664842681 |
| ResourceApplyKerasMomentum | float16 | [100 300] | FALSE | 28112 | 5884 | 4.777702243 |
| ResourceApplyKerasMomentum | float16 | [300 100] | FALSE | 28000 | 5848 | 4.787961696 |
| ResourceApplyKerasMomentum | float16 | [200 300] | FALSE | 28864 | 6133 | 4.706342736 |
| ResourceApplyKerasMomentum | float16 | [205 350] | FALSE | 28720 | 6151 | 4.669159486 |
| ResourceApplyKerasMomentum | float16 | [350 105] | FALSE | 28240 | 5884 | 4.799456152 |
| ResourceApplyKerasMomentum | float16 | [405 200] | FALSE | 29136 | 6115 | 4.764677024 |
| ResourceApplyKerasMomentum | float16 | [10 10 10] | FALSE | 25424 | 4800 | 5.296666667 |
| ResourceApplyKerasMomentum | float16 | [10 10 30] | FALSE | 28112 | 5671 | 4.957150414 |
| ResourceApplyKerasMomentum | float16 | [10 30 10] | FALSE | 27472 | 5670 | 4.845149912 |
| ResourceApplyKerasMomentum | float16 | [30 10 10] | FALSE | 27328 | 5653 | 4.834247302 |
| ResourceApplyKerasMomentum | float16 | [30 30 30] | FALSE | 28656 | 5848 | 4.900136799 |
| ResourceApplyKerasMomentum | float16 | [50 100 50] | FALSE | 34112 | 8177 | 4.171701113 |
| ResourceApplyKerasMomentum | float16 | [100 50 100] | FALSE | 35808 | 10061 | 3.559089554 |
| ResourceApplyKerasMomentum | float16 | [100 100 100] | FALSE | 48207 | 15359 | 3.138680904 |
| ResourceApplyKerasMomentum | float16 | [10 10 10 10] | FALSE | 28112 | 5919 | 4.749450921 |
| ResourceApplyKerasMomentum | float16 | [10 10 10 30] | FALSE | 28768 | 6079 | 4.732357296 |
| ResourceApplyKerasMomentum | float16 | [30 10 10 10] | FALSE | 28592 | 5919 | 4.8305457 |
| ResourceApplyKerasMomentum | float16 | [30 30 30 30] | FALSE | 38815 | 13546 | 2.865421527 |
| ResourceApplyKerasMomentum | float16 | [50 100 50 100] | FALSE | 567642 | 264458 | 2.146435351 |
| ResourceApplyKerasMomentum | float16 | [100 50 100 50] | FALSE | 565209 | 263676 | 2.143573932 |
| ResourceApplyKerasMomentum | float16 | [100 100 100 100] | FALSE | 2284645 | 996004 | 2.293811069 |
| ResourceApplyKerasMomentum | float16 | [10 10 10 10 10] | FALSE | 44400 | 6488 | 6.843403206 |
| ResourceApplyKerasMomentum | float16 | [10 10 10 30 30] | FALSE | 45887 | 14488 | 3.167241855 |
| ResourceApplyKerasMomentum | float16 | [30 10 10 10 10] | FALSE | 33360 | 8728 | 3.822181485 |
| ResourceApplyKerasMomentum | float16 | [10 20 15 20 20] | FALSE | 50447 | 17315 | 2.913485417 |
FP32
| op_name | dtype | input_size | use_nesterov | rocm_kernel_avg | kernel_duration | MIOpen over rocm |
|---|---|---|---|---|---|---|
| ResourceApplyKerasMomentum | float32 | [50 100] | TRUE | 35696 | 5724 | 6.236198463 |
| ResourceApplyKerasMomentum | float32 | [100 50] | TRUE | 35616 | 5902 | 6.034564554 |
| ResourceApplyKerasMomentum | float32 | [100 100] | TRUE | 35807 | 5653 | 6.334158854 |
| ResourceApplyKerasMomentum | float32 | [100 300] | TRUE | 37200 | 6044 | 6.154864328 |
| ResourceApplyKerasMomentum | float32 | [300 100] | TRUE | 37088 | 5991 | 6.190619262 |
| ResourceApplyKerasMomentum | float32 | [200 300] | TRUE | 38175 | 6577 | 5.804318078 |
| ResourceApplyKerasMomentum | float32 | [205 350] | TRUE | 38640 | 6577 | 5.875019006 |
| ResourceApplyKerasMomentum | float32 | [350 105] | TRUE | 37519 | 6097 | 6.153682139 |
| ResourceApplyKerasMomentum | float32 | [405 200] | TRUE | 38672 | 6791 | 5.694595789 |
| ResourceApplyKerasMomentum | float32 | [10 10 10] | TRUE | 29520 | 5813 | 5.078272837 |
| ResourceApplyKerasMomentum | float32 | [10 10 30] | TRUE | 35104 | 5760 | 6.094444444 |
| ResourceApplyKerasMomentum | float32 | [10 30 10] | TRUE | 34736 | 5688 | 6.106891702 |
| ResourceApplyKerasMomentum | float32 | [30 10 10] | TRUE | 35088 | 5724 | 6.129979036 |
| ResourceApplyKerasMomentum | float32 | [30 30 30] | TRUE | 36736 | 5955 | 6.168933669 |
| ResourceApplyKerasMomentum | float32 | [50 100 50] | TRUE | 45567 | 9279 | 4.910766246 |
| ResourceApplyKerasMomentum | float32 | [100 50 100] | TRUE | 53040 | 13155 | 4.031927024 |
| ResourceApplyKerasMomentum | float32 | [100 100 100] | TRUE | 79087 | 21030 | 3.760675226 |
| ResourceApplyKerasMomentum | float32 | [10 10 10 10] | TRUE | 34288 | 5813 | 5.898503355 |
| ResourceApplyKerasMomentum | float32 | [10 10 10 30] | TRUE | 35952 | 6222 | 5.778206365 |
| ResourceApplyKerasMomentum | float32 | [30 10 10 10] | TRUE | 36191 | 6080 | 5.952467105 |
| ResourceApplyKerasMomentum | float32 | [30 30 30 30] | TRUE | 67631 | 18168 | 3.722534126 |
| ResourceApplyKerasMomentum | float32 | [50 100 50 100] | TRUE | 1258593 | 385842 | 3.261938825 |
| ResourceApplyKerasMomentum | float32 | [100 50 100 50] | TRUE | 1261505 | 384882 | 3.277640939 |
| ResourceApplyKerasMomentum | float32 | [100 100 100 100] | TRUE | 5003717 | 1532880 | 3.264258781 |
| ResourceApplyKerasMomentum | float32 | [10 10 10 10 10] | TRUE | 37312 | 6950 | 5.368633094 |
| ResourceApplyKerasMomentum | float32 | [10 10 10 30 30] | TRUE | 71071 | 19715 | 3.604920112 |
| ResourceApplyKerasMomentum | float32 | [30 10 10 10 10] | TRUE | 45824 | 9813 | 4.669723836 |
| ResourceApplyKerasMomentum | float32 | [10 20 15 20 20] | TRUE | 88559 | 23519 | 3.7654237 |
| ResourceApplyKerasMomentum | float32 | [50 100] | FALSE | 24048 | 5813 | 4.136934457 |
| ResourceApplyKerasMomentum | float32 | [100 50] | FALSE | 24000 | 5813 | 4.128677103 |
| ResourceApplyKerasMomentum | float32 | [100 100] | FALSE | 24176 | 5831 | 4.146115589 |
| ResourceApplyKerasMomentum | float32 | [100 300] | FALSE | 25184 | 6257 | 4.024932076 |
| ResourceApplyKerasMomentum | float32 | [300 100] | FALSE | 24896 | 6026 | 4.131430468 |
| ResourceApplyKerasMomentum | float32 | [200 300] | FALSE | 26064 | 6488 | 4.017262639 |
| ResourceApplyKerasMomentum | float32 | [205 350] | FALSE | 26368 | 6737 | 3.913908268 |
| ResourceApplyKerasMomentum | float32 | [350 105] | FALSE | 25184 | 6275 | 4.013386454 |
| ResourceApplyKerasMomentum | float32 | [405 200] | FALSE | 26512 | 6542 | 4.052583308 |
| ResourceApplyKerasMomentum | float32 | [10 10 10] | FALSE | 20016 | 5831 | 3.432687361 |
| ResourceApplyKerasMomentum | float32 | [10 10 30] | FALSE | 24368 | 5688 | 4.284106892 |
| ResourceApplyKerasMomentum | float32 | [10 30 10] | FALSE | 24544 | 5653 | 4.341765434 |
| ResourceApplyKerasMomentum | float32 | [30 10 10] | FALSE | 24208 | 5741 | 4.216686988 |
| ResourceApplyKerasMomentum | float32 | [30 30 30] | FALSE | 25536 | 6079 | 4.200690903 |
| ResourceApplyKerasMomentum | float32 | [50 100 50] | FALSE | 31952 | 9155 | 3.490114691 |
| ResourceApplyKerasMomentum | float32 | [100 50 100] | FALSE | 39072 | 13208 | 2.958207147 |
| ResourceApplyKerasMomentum | float32 | [100 100 100] | FALSE | 56336 | 21190 | 2.658612553 |
| ResourceApplyKerasMomentum | float32 | [10 10 10 10] | FALSE | 25024 | 5688 | 4.399437412 |
| ResourceApplyKerasMomentum | float32 | [10 10 10 30] | FALSE | 25792 | 6168 | 4.181582361 |
| ResourceApplyKerasMomentum | float32 | [30 10 10 10] | FALSE | 25312 | 6239 | 4.057060426 |
| ResourceApplyKerasMomentum | float32 | [30 30 30 30] | FALSE | 48175 | 18257 | 2.638713918 |
| ResourceApplyKerasMomentum | float32 | [50 100 50 100] | FALSE | 891670 | 388226 | 2.296780741 |
| ResourceApplyKerasMomentum | float32 | [100 50 100 50] | FALSE | 888325 | 384973 | 2.307499487 |
| ResourceApplyKerasMomentum | float32 | [100 100 100 100] | FALSE | 3535446 | 1541400 | 2.293659011 |
| ResourceApplyKerasMomentum | float32 | [10 10 10 10 10] | FALSE | 26432 | 6791 | 3.892210278 |
| ResourceApplyKerasMomentum | float32 | [10 10 10 30 30] | FALSE | 51919 | 19751 | 2.628677029 |
| ResourceApplyKerasMomentum | float32 | [30 10 10 10 10] | FALSE | 32672 | 9830 | 3.32370295 |
| ResourceApplyKerasMomentum | float32 | [10 20 15 20 20] | FALSE | 126160 | 23395 | 5.392605258 |
BFP16
| op_name | dtype | input_size | use_nesterov | rocm_kernel_avg | kernel_duration | MIOpen over rocm |
|---|---|---|---|---|---|---|
| ResourceApplyKerasMomentum | bfloat16 | [50 100] | TRUE | 43968 | 5866 | 7.495397204 |
| ResourceApplyKerasMomentum | bfloat16 | [50 100] | FALSE | 32080 | 5937 | 5.403402392 |
| ResourceApplyKerasMomentum | bfloat16 | [100 50] | TRUE | 41520 | 5706 | 7.276550999 |
| ResourceApplyKerasMomentum | bfloat16 | [100 50] | FALSE | 31200 | 5884 | 5.302515296 |
| ResourceApplyKerasMomentum | bfloat16 | [100 100] | TRUE | 39904 | 5937 | 6.721239683 |
| ResourceApplyKerasMomentum | bfloat16 | [100 100] | FALSE | 29776 | 5955 | 5.000167926 |
| ResourceApplyKerasMomentum | bfloat16 | [100 300] | TRUE | 34432 | 5973 | 5.7646074 |
| ResourceApplyKerasMomentum | bfloat16 | [100 300] | FALSE | 25408 | 5991 | 4.241028209 |
| ResourceApplyKerasMomentum | bfloat16 | [300 100] | TRUE | 34096 | 5991 | 5.691203472 |
| ResourceApplyKerasMomentum | bfloat16 | [300 100] | FALSE | 25312 | 5973 | 4.237736481 |
| ResourceApplyKerasMomentum | bfloat16 | [200 300] | TRUE | 37776 | 6204 | 6.088974855 |
| ResourceApplyKerasMomentum | bfloat16 | [200 300] | FALSE | 27280 | 6115 | 4.461161079 |
| ResourceApplyKerasMomentum | bfloat16 | [205 350] | TRUE | 32032 | 6115 | 5.238266558 |
| ResourceApplyKerasMomentum | bfloat16 | [205 350] | FALSE | 23472 | 6187 | 3.793761112 |
| ResourceApplyKerasMomentum | bfloat16 | [350 105] | TRUE | 41056 | 6329 | 6.486964765 |
| ResourceApplyKerasMomentum | bfloat16 | [350 105] | FALSE | 30416 | 5974 | 5.09139605 |
| ResourceApplyKerasMomentum | bfloat16 | [405 200] | TRUE | 31728 | 6383 | 4.970703431 |
| ResourceApplyKerasMomentum | bfloat16 | [405 200] | FALSE | 23440 | 6258 | 3.745605625 |
| ResourceApplyKerasMomentum | bfloat16 | [10 10 10] | TRUE | 44400 | 4693 | 9.460899212 |
| ResourceApplyKerasMomentum | bfloat16 | [10 10 10] | FALSE | 33312 | 4782 | 6.966122961 |
| ResourceApplyKerasMomentum | bfloat16 | [10 10 30] | TRUE | 42560 | 5600 | 7.6 |
| ResourceApplyKerasMomentum | bfloat16 | [10 10 30] | FALSE | 32000 | 5653 | 5.660711127 |
| ResourceApplyKerasMomentum | bfloat16 | [10 30 10] | TRUE | 42384 | 5725 | 7.403318777 |
| ResourceApplyKerasMomentum | bfloat16 | [10 30 10] | FALSE | 31616 | 5742 | 5.506095437 |
| ResourceApplyKerasMomentum | bfloat16 | [30 10 10] | TRUE | 41872 | 5671 | 7.383530242 |
| ResourceApplyKerasMomentum | bfloat16 | [30 10 10] | FALSE | 31728 | 5653 | 5.612595082 |
| ResourceApplyKerasMomentum | bfloat16 | [30 30 30] | TRUE | 34319 | 5902 | 5.814808539 |
| ResourceApplyKerasMomentum | bfloat16 | [30 30 30] | FALSE | 25488 | 5813 | 4.384655083 |
| ResourceApplyKerasMomentum | bfloat16 | [50 100 50] | TRUE | 32911 | 8284 | 3.972839208 |
| ResourceApplyKerasMomentum | bfloat16 | [50 100 50] | FALSE | 24736 | 8427 | 2.935326925 |
| ResourceApplyKerasMomentum | bfloat16 | [100 50 100] | TRUE | 37136 | 10276 | 3.613857532 |
| ResourceApplyKerasMomentum | bfloat16 | [100 50 100] | FALSE | 27488 | 10383 | 2.647404411 |
| ResourceApplyKerasMomentum | bfloat16 | [100 100 100] | TRUE | 56015 | 15805 | 3.544131604 |
| ResourceApplyKerasMomentum | bfloat16 | [100 100 100] | FALSE | 41808 | 15787 | 2.648254893 |
| ResourceApplyKerasMomentum | bfloat16 | [10 10 10 10] | TRUE | 39856 | 5795 | 6.877653149 |
| ResourceApplyKerasMomentum | bfloat16 | [10 10 10 10] | FALSE | 29680 | 5884 | 5.044187627 |
| ResourceApplyKerasMomentum | bfloat16 | [10 10 10 30] | TRUE | 34128 | 5849 | 5.834843563 |
| ResourceApplyKerasMomentum | bfloat16 | [10 10 10 30] | FALSE | 25376 | 6151 | 4.125508047 |
| ResourceApplyKerasMomentum | bfloat16 | [30 10 10 10] | TRUE | 34112 | 5920 | 5.762162162 |
| ResourceApplyKerasMomentum | bfloat16 | [30 10 10 10] | FALSE | 25472 | 6026 | 4.227016263 |
| ResourceApplyKerasMomentum | bfloat16 | [30 30 30 30] | TRUE | 39568 | 13885 | 2.849693914 |
| ResourceApplyKerasMomentum | bfloat16 | [30 30 30 30] | FALSE | 28880 | 13707 | 2.106952652 |
| ResourceApplyKerasMomentum | bfloat16 | [50 100 50 100] | TRUE | 660522 | 269065 | 2.454878933 |
| ResourceApplyKerasMomentum | bfloat16 | [50 100 50 100] | FALSE | 469772 | 269598 | 1.742490671 |
| ResourceApplyKerasMomentum | bfloat16 | [100 50 100 50] | TRUE | 658906 | 270167 | 2.438884098 |
| ResourceApplyKerasMomentum | bfloat16 | [100 50 100 50] | FALSE | 468268 | 269187 | 1.739563946 |
| ResourceApplyKerasMomentum | bfloat16 | [100 100 100 100] | TRUE | 2539867 | 1040770 | 2.440372993 |
| ResourceApplyKerasMomentum | bfloat16 | [100 100 100 100] | FALSE | 1808385 | 1040270 | 1.73838042 |
| ResourceApplyKerasMomentum | bfloat16 | [10 10 10 10 10] | TRUE | 36320 | 6524 | 5.567136726 |
| ResourceApplyKerasMomentum | bfloat16 | [10 10 10 10 10] | FALSE | 25584 | 6542 | 3.910730663 |
| ResourceApplyKerasMomentum | bfloat16 | [10 10 10 30 30] | TRUE | 55583 | 14738 | 3.771407247 |
| ResourceApplyKerasMomentum | bfloat16 | [10 10 10 30 30] | FALSE | 42208 | 14792 | 2.853434289 |
| ResourceApplyKerasMomentum | bfloat16 | [30 10 10 10 10] | TRUE | 39056 | 8853 | 4.411611883 |
| ResourceApplyKerasMomentum | bfloat16 | [30 10 10 10 10] | FALSE | 31712 | 8978 | 3.532189797 |
| ResourceApplyKerasMomentum | bfloat16 | [10 20 15 20 20] | TRUE | 60752 | 18098 | 3.356835009 |
| ResourceApplyKerasMomentum | bfloat16 | [10 20 15 20 20] | FALSE | 43904 | 17885 | 2.454794521 |