
Implement SoftmaxCrossEntropyWithLogits

Open hieule88 opened this issue 10 months ago • 0 comments

  • Added contiguous forward and backward SoftmaxCrossEntropyWithLogits operations and kernels.
  • Added a driver test and gtest for SoftmaxCrossEntropyWithLogits.
  • The new API is guarded by the MIOPEN_BETA_API macro.
  • Average over all cases:

| type | Forward | Backward |
|---|---|---|
| float16 | 3.01 | 4.27 |
| float32 | 2.72 | 2.99 |
| bfloat16 | 2.99 | 4.43 |
FP16

| op_name | dtype | size | direction | rocm_kernel_avg | kernel_duration | improvement over rocm |
|---|---|---|---|---|---|---|
| SoftmaxCrossEntropyWithLogits | float16 | [20 30] | fwd | 34080 | 8071 | 4.22252509 |
| SoftmaxCrossEntropyWithLogits | float16 | [20 30] | bwd | 57758 | 7129 | 8.101837565 |
| SoftmaxCrossEntropyWithLogits | float16 | [5 10] | fwd | 37919 | 9813 | 3.864159788 |
| SoftmaxCrossEntropyWithLogits | float16 | [5 10] | bwd | 37119 | 8338 | 4.451786999 |
| SoftmaxCrossEntropyWithLogits | float16 | [2 5] | fwd | 31679 | 10293 | 3.077722724 |
| SoftmaxCrossEntropyWithLogits | float16 | [2 5] | bwd | 33118 | 8924 | 3.711116091 |
| SoftmaxCrossEntropyWithLogits | float16 | [25 300] | fwd | 32159 | 11324 | 2.839897563 |
| SoftmaxCrossEntropyWithLogits | float16 | [25 300] | bwd | 59038 | 10080 | 5.856944444 |
| SoftmaxCrossEntropyWithLogits | float16 | [25 100] | fwd | 34719 | 8107 | 4.282595288 |
| SoftmaxCrossEntropyWithLogits | float16 | [25 100] | bwd | 46079 | 7662 | 6.013965022 |
| SoftmaxCrossEntropyWithLogits | float16 | [100 20] | fwd | 27519 | 6809 | 4.041562638 |
| SoftmaxCrossEntropyWithLogits | float16 | [100 20] | bwd | 37278 | 7129 | 5.229064385 |
| SoftmaxCrossEntropyWithLogits | float16 | [100 10] | fwd | 27840 | 6169 | 4.512887016 |
| SoftmaxCrossEntropyWithLogits | float16 | [100 10] | bwd | 55839 | 6347 | 8.797699701 |
| SoftmaxCrossEntropyWithLogits | float16 | [2000 3000] | fwd | 163515 | 133065 | 1.228835532 |
| SoftmaxCrossEntropyWithLogits | float16 | [2000 3000] | bwd | 225273 | 120549 | 1.86872558 |
| SoftmaxCrossEntropyWithLogits | float16 | [25 1000] | fwd | 32799 | 17599 | 1.863685437 |
| SoftmaxCrossEntropyWithLogits | float16 | [25 1000] | bwd | 49598 | 16960 | 2.924410377 |
| SoftmaxCrossEntropyWithLogits | float16 | [1000 100] | fwd | 37279 | 9635 | 3.869122989 |
| SoftmaxCrossEntropyWithLogits | float16 | [1000 100] | bwd | 44959 | 8123 | 5.534777791 |
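
The "improvement over rocm" column appears to be the ratio rocm_kernel_avg / kernel_duration, so values above 1 mean the new kernel is faster and values below 1 mean it is slower. A minimal sketch that recomputes the ratio for the two `[20 30]` float16 rows above (the function name `improvement_over_rocm` is made up for illustration and is not part of MIOpen or its driver):

```python
def improvement_over_rocm(rocm_kernel_avg, kernel_duration):
    """Speedup of the new kernel relative to the ROCm baseline.

    Assumption: both timings are in the same unit, as in the tables.
    """
    return rocm_kernel_avg / kernel_duration


# (direction, rocm_kernel_avg, kernel_duration) copied from the FP16 table.
fp16_20x30 = [
    ("fwd", 34080, 8071),
    ("bwd", 57758, 7129),
]

ratios = {
    direction: improvement_over_rocm(baseline, new)
    for direction, baseline, new in fp16_20x30
}

for direction, ratio in ratios.items():
    print(f"{direction}: {ratio:.6f}")
```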
FP32

| op_name | dtype | size | direction | rocm_kernel_avg | kernel_duration | improvement over rocm |
|---|---|---|---|---|---|---|
| SoftmaxCrossEntropyWithLogits | float32 | [20 30] | fwd | 29439 | 8267 | 3.561025765 |
| SoftmaxCrossEntropyWithLogits | float32 | [20 30] | bwd | 29440 | 7164 | 4.109436069 |
| SoftmaxCrossEntropyWithLogits | float32 | [5 10] | fwd | 33760 | 9867 | 3.42150603 |
| SoftmaxCrossEntropyWithLogits | float32 | [5 10] | bwd | 20319 | 8249 | 2.463207662 |
| SoftmaxCrossEntropyWithLogits | float32 | [2 5] | fwd | 24960 | 9706 | 2.571605193 |
| SoftmaxCrossEntropyWithLogits | float32 | [2 5] | bwd | 20160 | 8409 | 2.397431324 |
| SoftmaxCrossEntropyWithLogits | float32 | [25 300] | fwd | 25440 | 11164 | 2.278753135 |
| SoftmaxCrossEntropyWithLogits | float32 | [25 300] | bwd | 28799 | 10933 | 2.634135187 |
| SoftmaxCrossEntropyWithLogits | float32 | [25 100] | fwd | 31839 | 8462 | 3.762585677 |
| SoftmaxCrossEntropyWithLogits | float32 | [25 100] | bwd | 25280 | 8071 | 3.13220171 |
| SoftmaxCrossEntropyWithLogits | float32 | [100 20] | fwd | 27998 | 7413 | 3.776878457 |
| SoftmaxCrossEntropyWithLogits | float32 | [100 20] | bwd | 24958 | 7324 | 3.40770071 |
| SoftmaxCrossEntropyWithLogits | float32 | [100 10] | fwd | 27199 | 6862 | 3.963713203 |
| SoftmaxCrossEntropyWithLogits | float32 | [100 10] | bwd | 25599 | 6898 | 3.711075674 |
| SoftmaxCrossEntropyWithLogits | float32 | [2000 3000] | fwd | 169915 | 200779 | 0.846278744 |
| SoftmaxCrossEntropyWithLogits | float32 | [2000 3000] | bwd | 171355 | 187411 | 0.914327334 |
| SoftmaxCrossEntropyWithLogits | float32 | [25 1000] | fwd | 32799 | 18115 | 1.810598951 |
| SoftmaxCrossEntropyWithLogits | float32 | [25 1000] | bwd | 33919 | 17173 | 1.975135387 |
| SoftmaxCrossEntropyWithLogits | float32 | [1000 100] | fwd | 33120 | 9884 | 3.350870093 |
| SoftmaxCrossEntropyWithLogits | float32 | [1000 100] | bwd | 34079 | 8498 | 4.010237703 |
BFP16

| op_name | dtype | size | direction | rocm_kernel_avg | kernel_duration | improvement over rocm |
|---|---|---|---|---|---|---|
| SoftmaxCrossEntropyWithLogits | bfloat16 | [20 30] | fwd | 36313 | 7822 | 4.642418819 |
| SoftmaxCrossEntropyWithLogits | bfloat16 | [20 30] | bwd | 63667 | 7093 | 8.976032708 |
| SoftmaxCrossEntropyWithLogits | bfloat16 | [5 10] | fwd | 38552 | 10507 | 3.669172932 |
| SoftmaxCrossEntropyWithLogits | bfloat16 | [5 10] | bwd | 39352 | 8746 | 4.49942831 |
| SoftmaxCrossEntropyWithLogits | bfloat16 | [2 5] | fwd | 40470 | 11004 | 3.677753544 |
| SoftmaxCrossEntropyWithLogits | bfloat16 | [2 5] | bwd | 39671 | 8622 | 4.601136627 |
| SoftmaxCrossEntropyWithLogits | bfloat16 | [25 300] | fwd | 32154 | 11111 | 2.893888939 |
| SoftmaxCrossEntropyWithLogits | bfloat16 | [25 300] | bwd | 64466 | 10346 | 6.231007153 |
| SoftmaxCrossEntropyWithLogits | bfloat16 | [25 100] | fwd | 35672 | 8444 | 4.224538134 |
| SoftmaxCrossEntropyWithLogits | bfloat16 | [25 100] | bwd | 41272 | 8320 | 4.960576923 |
| SoftmaxCrossEntropyWithLogits | bfloat16 | [100 20] | fwd | 31033 | 7324 | 4.237165483 |
| SoftmaxCrossEntropyWithLogits | bfloat16 | [100 20] | bwd | 43192 | 6933 | 6.2299149 |
| SoftmaxCrossEntropyWithLogits | bfloat16 | [100 10] | fwd | 34393 | 6222 | 5.527643844 |
| SoftmaxCrossEntropyWithLogits | bfloat16 | [100 10] | bwd | 44629 | 5955 | 7.494374475 |
| SoftmaxCrossEntropyWithLogits | bfloat16 | [2000 3000] | fwd | 160445 | 134291 | 1.194756164 |
| SoftmaxCrossEntropyWithLogits | bfloat16 | [2000 3000] | bwd | 232272 | 121900 | 1.905430681 |
| SoftmaxCrossEntropyWithLogits | bfloat16 | [25 1000] | fwd | 35992 | 18009 | 1.998556277 |
| SoftmaxCrossEntropyWithLogits | bfloat16 | [25 1000] | bwd | 66546 | 17120 | 3.88703271 |
| SoftmaxCrossEntropyWithLogits | bfloat16 | [1000 100] | fwd | 38072 | 9831 | 3.872647747 |
| SoftmaxCrossEntropyWithLogits | bfloat16 | [1000 100] | bwd | 54389 | 8427 | 6.454135517 |
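
For readers unfamiliar with the operation, here is a plain-Python reference sketch of what a softmax cross-entropy with logits forward and backward pass conventionally computes on a contiguous `[batch, classes]` input, matching the 2-D sizes benchmarked above. This is an assumption about the semantics based on the usual definition of the op; it is not the MIOpen kernel, and the function names and signatures below are hypothetical rather than the MIOpen API.

```python
import math


def sce_with_logits_forward(logits, labels):
    """Per-row loss: -sum_j labels[j] * log_softmax(logits)[j].

    Inputs are nested lists shaped [batch][classes]. Also returns the
    softmax probabilities so the backward pass can reuse them.
    """
    losses, probs = [], []
    for row, target in zip(logits, labels):
        row_max = max(row)  # subtract the max for numerical stability
        exps = [math.exp(x - row_max) for x in row]
        total = sum(exps)
        log_total = math.log(total)
        log_probs = [(x - row_max) - log_total for x in row]
        losses.append(-sum(t * lp for t, lp in zip(target, log_probs)))
        probs.append([e / total for e in exps])
    return losses, probs


def sce_with_logits_backward(probs, labels, grad_losses):
    """Gradient w.r.t. logits: g * (softmax * sum(labels) - labels).

    When each label row sums to 1 this reduces to g * (softmax - labels).
    """
    grads = []
    for p_row, target, g in zip(probs, labels, grad_losses):
        label_sum = sum(target)
        grads.append([g * (p * label_sum - t) for p, t in zip(p_row, target)])
    return grads


# Tiny example: one sample, two classes, one-hot label on class 0.
losses, probs = sce_with_logits_forward([[0.0, 0.0]], [[1.0, 0.0]])
grads = sce_with_logits_backward(probs, [[1.0, 0.0]], [1.0])
```

With equal logits the softmax is uniform, so the loss in this example is ln 2 and the gradient is `[-0.5, 0.5]`.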

hieule88, Mar 03 '25 10:03