
Implement SparseSoftmaxCrossEntropyWithLogits

Open hieule88 opened this issue 10 months ago • 0 comments

  • Added SparseSoftmaxCrossEntropyWithLogits forward and backward.

  • Added driver test and gtest for SparseSoftmaxCrossEntropyWithLogits.

  • New API is guarded by MIOPEN_BETA_API macro.

  • Average ROCm / MIOpen ratio over all cases:

  • SparseSoftmaxCrossEntropyWithLogits

| Type | Forward | Backward |
| --- | ---: | ---: |
| float16 | 1.84 | 3.29 |
| float32 | 1.68 | 3.09 |
| bfloat16 | 1.85 | 3.31 |
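For readers unfamiliar with the operator, here is a minimal CPU reference sketch of what the forward and backward passes compute, assuming the usual definition (as in the TensorFlow op of the same name): the per-row loss is `-log(softmax(logits)[label])` and the gradient with respect to the logits is `softmax(logits) - one_hot(label)`. The function names and list-based layout are hypothetical and chosen for illustration only; they are not the MIOpen API added in this PR.

```python
import math


def sparse_softmax_xent_forward(logits, labels):
    """Return (losses, backprop) for logits of shape [batch, classes].

    Hypothetical reference helper, not the MIOpen API.
    """
    losses = []
    backprop = []
    for row, label in zip(logits, labels):
        if not 0 <= label < len(row):
            raise ValueError(f"label {label} out of range for {len(row)} classes")
        # Subtract the row maximum for numerical stability.
        row_max = max(row)
        exps = [math.exp(x - row_max) for x in row]
        total = sum(exps)
        # -log(softmax[label]) = log(sum(exp(x - max))) - (x[label] - max)
        losses.append(math.log(total) - (row[label] - row_max))
        # d(loss)/d(logits) = softmax - one_hot(label)
        grad = [e / total for e in exps]
        grad[label] -= 1.0
        backprop.append(grad)
    return losses, backprop


def sparse_softmax_xent_backward(backprop, grad_loss):
    """Scale each row of backprop by the upstream gradient of its loss."""
    return [[g * v for v in row] for row, g in zip(backprop, grad_loss)]


logits = [[0.0, 0.0, 0.0], [1.0, 2.0, 3.0]]
labels = [0, 2]
losses, backprop = sparse_softmax_xent_forward(logits, labels)
grad_input = sparse_softmax_xent_backward(backprop, [1.0, 2.0])
```

A reference like this is the kind of thing a gtest could compare kernel output against, within a tolerance suited to each dtype.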

FWD - FP16

| op_name | dtype | size | direction | ROCm kernel avg | MIOpen kernel avg | ROCm / MIOpen |
| --- | --- | --- | --- | ---: | ---: | ---: |
| SparseSoftmaxCrossEntropyWithLogits | float16 | [128 4] | fwd | 12272 | 5546 | 2.212765957 |
| SparseSoftmaxCrossEntropyWithLogits | float16 | [100 20] | fwd | 15280 | 6133 | 2.491439752 |
| SparseSoftmaxCrossEntropyWithLogits | float16 | [100 10] | fwd | 16304 | 5742 | 2.83942877 |
| SparseSoftmaxCrossEntropyWithLogits | float16 | [25 1000] | fwd | 22944 | 13920 | 1.648275862 |
| SparseSoftmaxCrossEntropyWithLogits | float16 | [1000 100] | fwd | 18960 | 8373 | 2.264421354 |
| SparseSoftmaxCrossEntropyWithLogits | float16 | [237 80] | fwd | 16880 | 7058 | 2.391612355 |
| SparseSoftmaxCrossEntropyWithLogits | float16 | [255 80] | fwd | 16464 | 7360 | 2.236956522 |
| SparseSoftmaxCrossEntropyWithLogits | float16 | [258 80] | fwd | 16176 | 7342 | 2.203214383 |
| SparseSoftmaxCrossEntropyWithLogits | float16 | [261 80] | fwd | 15728 | 7324 | 2.147460404 |
| SparseSoftmaxCrossEntropyWithLogits | float16 | [267 80] | fwd | 16144 | 7147 | 2.258849867 |
| SparseSoftmaxCrossEntropyWithLogits | float16 | [768 80] | fwd | 17872 | 7520 | 2.376595745 |
| SparseSoftmaxCrossEntropyWithLogits | float16 | [800 237] | fwd | 18480 | 9262 | 1.995249406 |
| SparseSoftmaxCrossEntropyWithLogits | float16 | [800 585] | fwd | 23744 | 13991 | 1.697090987 |
| SparseSoftmaxCrossEntropyWithLogits | float16 | [800 645] | fwd | 24912 | 14524 | 1.715229964 |
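The last column of each table is the ROCm kernel average divided by the MIOpen kernel average, so values above 1 mean the MIOpen kernel average is lower. The sketch below recomputes that ratio for three rows taken from the FWD - FP16 table and averages them; the helper name and the choice of rows are illustrative assumptions, and because only a sample of rows is used the mean here is not expected to match the summary figures.

```python
def ratio(rocm_avg, miopen_avg):
    """ROCm / MIOpen ratio, as in the last table column (hypothetical helper)."""
    return rocm_avg / miopen_avg


# (size, ROCm kernel avg, MIOpen kernel avg) copied from the FWD - FP16 table.
rows = [
    ("[128 4]", 12272, 5546),
    ("[100 20]", 15280, 6133),
    ("[25 1000]", 22944, 13920),
]

ratios = [ratio(rocm, miopen) for _, rocm, miopen in rows]
mean_ratio = sum(ratios) / len(ratios)

for (size, _, _), r in zip(rows, ratios):
    print(f"{size}: {r:.9f}")
print(f"mean over sample: {mean_ratio:.2f}")
```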

FWD - FP32

| op_name | dtype | size | direction | ROCm kernel avg | MIOpen kernel avg | ROCm / MIOpen |
| --- | --- | --- | --- | ---: | ---: | ---: |
| SparseSoftmaxCrossEntropyWithLogits | float32 | [147 4] | fwd | 12432 | 5724 | 2.171907757 |
| SparseSoftmaxCrossEntropyWithLogits | float32 | [20 30] | fwd | 14576 | 7751 | 1.880531544 |
| SparseSoftmaxCrossEntropyWithLogits | float32 | [5 10] | fwd | 13200 | 10027 | 1.316445597 |
| SparseSoftmaxCrossEntropyWithLogits | float32 | [2 5] | fwd | 12128 | 10471 | 1.158246586 |
| SparseSoftmaxCrossEntropyWithLogits | float32 | [25 300] | fwd | 16560 | 9387 | 1.764141898 |
| SparseSoftmaxCrossEntropyWithLogits | float32 | [25 100] | fwd | 16176 | 7680 | 2.10625 |
| SparseSoftmaxCrossEntropyWithLogits | float32 | [100 20] | fwd | 13296 | 7005 | 1.898072805 |
| SparseSoftmaxCrossEntropyWithLogits | float32 | [100 10] | fwd | 13520 | 6080 | 2.223684211 |
| SparseSoftmaxCrossEntropyWithLogits | float32 | [25 1000] | fwd | 19360 | 14347 | 1.349411027 |
| SparseSoftmaxCrossEntropyWithLogits | float32 | [1000 100] | fwd | 16832 | 8462 | 1.989127866 |
| SparseSoftmaxCrossEntropyWithLogits | float32 | [237 80] | fwd | 14128 | 7129 | 1.981764623 |
| SparseSoftmaxCrossEntropyWithLogits | float32 | [663 80] | fwd | 14928 | 7662 | 1.948316366 |
| SparseSoftmaxCrossEntropyWithLogits | float32 | [800 237] | fwd | 16672 | 9333 | 1.786349512 |
| SparseSoftmaxCrossEntropyWithLogits | float32 | [800 285] | fwd | 16880 | 10755 | 1.569502557 |

FWD - BFP16

| op_name | dtype | size | direction | ROCm kernel avg | MIOpen kernel avg | ROCm / MIOpen |
| --- | --- | --- | --- | ---: | ---: | ---: |
| SparseSoftmaxCrossEntropyWithLogits | bfloat16 | [20 30] | fwd | 16192 | 7947 | 2.037498427 |
| SparseSoftmaxCrossEntropyWithLogits | bfloat16 | [5 10] | fwd | 15808 | 10311 | 1.533119969 |
| SparseSoftmaxCrossEntropyWithLogits | bfloat16 | [2 5] | fwd | 14448 | 10773 | 1.341130604 |
| SparseSoftmaxCrossEntropyWithLogits | bfloat16 | [25 300] | fwd | 20048 | 9458 | 2.119687037 |
| SparseSoftmaxCrossEntropyWithLogits | bfloat16 | [25 100] | fwd | 18976 | 7129 | 2.6618039 |
| SparseSoftmaxCrossEntropyWithLogits | bfloat16 | [100 20] | fwd | 15456 | 6169 | 2.505430378 |
| SparseSoftmaxCrossEntropyWithLogits | bfloat16 | [100 10] | fwd | 16176 | 5707 | 2.834413878 |
| SparseSoftmaxCrossEntropyWithLogits | bfloat16 | [25 1000] | fwd | 26976 | 14454 | 1.866334579 |
| SparseSoftmaxCrossEntropyWithLogits | bfloat16 | [1000 100] | fwd | 19936 | 8391 | 2.375878918 |
| SparseSoftmaxCrossEntropyWithLogits | bfloat16 | [237 80] | fwd | 17584 | 7093 | 2.479063866 |
| SparseSoftmaxCrossEntropyWithLogits | bfloat16 | [255 80] | fwd | 16992 | 7342 | 2.314355761 |
| SparseSoftmaxCrossEntropyWithLogits | bfloat16 | [800 372] | fwd | 20496 | 10969 | 1.868538609 |

BWD - FP16

| op_name | dtype | size | direction | ROCm kernel avg | MIOpen kernel avg | ROCm / MIOpen |
| --- | --- | --- | --- | ---: | ---: | ---: |
| SparseSoftmaxCrossEntropyWithLogits | float16 | [20 30] | bwd | 16496 | 3537 | 4.663839412 |
| SparseSoftmaxCrossEntropyWithLogits | float16 | [5 10] | bwd | 14848 | 3182 | 4.666247643 |
| SparseSoftmaxCrossEntropyWithLogits | float16 | [2 5] | bwd | 14320 | 3431 | 4.173710289 |
| SparseSoftmaxCrossEntropyWithLogits | float16 | [25 300] | bwd | 18608 | 5653 | 3.29170352 |
| SparseSoftmaxCrossEntropyWithLogits | float16 | [25 100] | bwd | 16912 | 4409 | 3.835790429 |
| SparseSoftmaxCrossEntropyWithLogits | float16 | [100 20] | bwd | 14000 | 4249 | 3.294892916 |
| SparseSoftmaxCrossEntropyWithLogits | float16 | [100 10] | bwd | 15184 | 3946 | 3.847947288 |
| SparseSoftmaxCrossEntropyWithLogits | float16 | [2000 3000] | bwd | 71695 | 32962 | 2.175080396 |
| SparseSoftmaxCrossEntropyWithLogits | float16 | [25 1000] | bwd | 27264 | 6364 | 4.284098052 |
| SparseSoftmaxCrossEntropyWithLogits | float16 | [1000 100] | bwd | 18528 | 5778 | 3.206645898 |
| SparseSoftmaxCrossEntropyWithLogits | float16 | [237 80] | bwd | 16512 | 5280 | 3.127272727 |
| SparseSoftmaxCrossEntropyWithLogits | float16 | [489 80] | bwd | 16464 | 5049 | 3.260843731 |
| SparseSoftmaxCrossEntropyWithLogits | float16 | [693 80] | bwd | 15968 | 4924 | 3.242891958 |
| SparseSoftmaxCrossEntropyWithLogits | float16 | [744 80] | bwd | 16192 | 4978 | 3.252711933 |
| SparseSoftmaxCrossEntropyWithLogits | float16 | [800 261] | bwd | 18704 | 5458 | 3.426896299 |

BWD - FP32

| op_name | dtype | size | direction | ROCm kernel avg | MIOpen kernel avg | ROCm / MIOpen |
| --- | --- | --- | --- | ---: | ---: | ---: |
| SparseSoftmaxCrossEntropyWithLogits | float32 | [20 30] | bwd | 14192 | 3982 | 3.564038172 |
| SparseSoftmaxCrossEntropyWithLogits | float32 | [5 10] | bwd | 13088 | 3058 | 4.279921517 |
| SparseSoftmaxCrossEntropyWithLogits | float32 | [2 5] | bwd | 13584 | 3324 | 4.086642599 |
| SparseSoftmaxCrossEntropyWithLogits | float32 | [25 300] | bwd | 16656 | 5600 | 2.974285714 |
| SparseSoftmaxCrossEntropyWithLogits | float32 | [25 100] | bwd | 15840 | 5227 | 3.030418978 |
| SparseSoftmaxCrossEntropyWithLogits | float32 | [100 20] | bwd | 14352 | 4995 | 2.873273273 |
| SparseSoftmaxCrossEntropyWithLogits | float32 | [100 10] | bwd | 14432 | 3929 | 3.673199287 |
| SparseSoftmaxCrossEntropyWithLogits | float32 | [2000 3000] | bwd | 106559 | 54900 | 1.940965392 |
| SparseSoftmaxCrossEntropyWithLogits | float32 | [25 1000] | bwd | 21808 | 6329 | 3.445726023 |
| SparseSoftmaxCrossEntropyWithLogits | float32 | [1000 100] | bwd | 16144 | 5315 | 3.037441204 |
| SparseSoftmaxCrossEntropyWithLogits | float32 | [363 80] | bwd | 13888 | 4782 | 2.904224174 |
| SparseSoftmaxCrossEntropyWithLogits | float32 | [534 80] | bwd | 14080 | 4711 | 2.988749735 |
| SparseSoftmaxCrossEntropyWithLogits | float32 | [744 80] | bwd | 14288 | 4728 | 3.021996616 |
| SparseSoftmaxCrossEntropyWithLogits | float32 | [800 255] | bwd | 16768 | 5333 | 3.144196512 |

BWD - BFP16

| op_name | dtype | size | direction | ROCm kernel avg | MIOpen kernel avg | ROCm / MIOpen |
| --- | --- | --- | --- | ---: | ---: | ---: |
| SparseSoftmaxCrossEntropyWithLogits | bfloat16 | [20 30] | bwd | 16672 | 3662 | 4.552703441 |
| SparseSoftmaxCrossEntropyWithLogits | bfloat16 | [5 10] | bwd | 14704 | 3200 | 4.595 |
| SparseSoftmaxCrossEntropyWithLogits | bfloat16 | [2 5] | bwd | 14672 | 3982 | 3.684580613 |
| SparseSoftmaxCrossEntropyWithLogits | bfloat16 | [25 300] | bwd | 19552 | 5475 | 3.571141553 |
| SparseSoftmaxCrossEntropyWithLogits | bfloat16 | [25 100] | bwd | 17472 | 4640 | 3.765517241 |
| SparseSoftmaxCrossEntropyWithLogits | bfloat16 | [100 20] | bwd | 14752 | 4231 | 3.486646183 |
| SparseSoftmaxCrossEntropyWithLogits | bfloat16 | [100 10] | bwd | 15488 | 3644 | 4.250274424 |
| SparseSoftmaxCrossEntropyWithLogits | bfloat16 | [2000 3000] | bwd | 72624 | 33566 | 2.163617947 |
| SparseSoftmaxCrossEntropyWithLogits | bfloat16 | [25 1000] | bwd | 27936 | 6364 | 4.389692018 |
| SparseSoftmaxCrossEntropyWithLogits | bfloat16 | [1000 100] | bwd | 18944 | 5795 | 3.269025022 |
| SparseSoftmaxCrossEntropyWithLogits | bfloat16 | [237 80] | bwd | 16400 | 5263 | 3.116093483 |
| SparseSoftmaxCrossEntropyWithLogits | bfloat16 | [489 80] | bwd | 16128 | 4995 | 3.228828829 |
| SparseSoftmaxCrossEntropyWithLogits | bfloat16 | [800 324] | bwd | 19184 | 5991 | 3.202136538 |

hieule88, Feb 18 '25 06:02