Implement L1Loss

Open cognaiger9 opened this issue 1 year ago • 0 comments

  • Add the L1Loss operation with forward kernels for reduced output (sum or mean reduction).
  • Add driver and gtest for kernels.
  • MIOpen performs better if:
    • Reduction mode is either sum or mean
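
For readers unfamiliar with the operation, here is a minimal reference sketch of what a forward L1Loss computes under each reduction mode. It follows the usual definition (elementwise absolute difference, then an optional sum or mean) and is written in plain Python for illustration only. It is not the MIOpen kernel, and the function name `l1_loss` and its parameters are hypothetical.

```python
from typing import List, Sequence, Union


def l1_loss(
    inputs: Sequence[float],
    targets: Sequence[float],
    reduction: str = "mean",
) -> Union[float, List[float]]:
    """Reference L1 loss over flat sequences (illustrative, not MIOpen's API)."""
    if len(inputs) != len(targets):
        raise ValueError("inputs and targets must have the same length")
    if not inputs:
        raise ValueError("inputs must not be empty")

    # Elementwise absolute difference |x_i - y_i|.
    diffs = [abs(x - y) for x, y in zip(inputs, targets)]

    if reduction == "none":
        # Unreduced: one loss value per element.
        return diffs
    if reduction == "sum":
        # Reduced: a single scalar, the sum of all differences.
        return sum(diffs)
    if reduction == "mean":
        # Reduced: a single scalar, the sum divided by the element count.
        return sum(diffs) / len(diffs)
    raise ValueError(f"unknown reduction: {reduction!r}")
```

The `sum` and `mean` modes collapse the whole tensor to one scalar, which is the reduced case this change targets and the only mode that appears in the benchmarks below.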

Average improvement over ROCm

| type | fwd |
|---|---|
| float16 | 1.92 |
| float32 | 1.93 |
| bfloat16 | 1.9 |

Detailed Benchmark

float16

| op_name | dtype | size | contiguous | reduction | direction | ROCm | MIOpen | Improvement |
|---|---|---|---|---|---|---|---|---|
| L1Loss | float16 | [7 4] | contiguous | sum | fwd | 124259 | 44677 | 2.78 |
| L1Loss | float16 | [7 4] | noncontiguous | sum | fwd | 145427 | 45850 | 3.17 |
| L1Loss | float16 | [18 4] | contiguous | sum | fwd | 72674 | 42117 | 1.73 |
| L1Loss | float16 | [28 4] | contiguous | sum | fwd | 120067 | 45157 | 2.66 |
| L1Loss | float16 | [28 4] | noncontiguous | sum | fwd | 59538 | 44730 | 1.33 |
| L1Loss | float16 | [34 4] | noncontiguous | sum | fwd | 105651 | 47433 | 2.23 |
| L1Loss | float16 | [54 4] | contiguous | sum | fwd | 64209 | 40695 | 1.58 |
| L1Loss | float16 | [72 4] | contiguous | sum | fwd | 108066 | 43752 | 2.47 |
| L1Loss | float16 | [72 4] | noncontiguous | sum | fwd | 50754 | 43059 | 1.18 |
| L1Loss | float16 | [98 4] | noncontiguous | sum | fwd | 123586 | 42455 | 2.91 |
| L1Loss | float16 | [106 4] | contiguous | sum | fwd | 56545 | 43325 | 1.31 |
| L1Loss | float16 | [135 4] | contiguous | sum | fwd | 119331 | 45050 | 2.65 |
| L1Loss | float16 | [190 4] | noncontiguous | sum | fwd | 111459 | 52446 | 2.13 |
| L1Loss | float16 | [249 128] | contiguous | sum | fwd | 100514 | 54828 | 1.83 |
| L1Loss | float16 | [349 222] | contiguous | sum | fwd | 58818 | 44392 | 1.32 |
| L1Loss | float16 | [349 222] | noncontiguous | sum | fwd | 77970 | 45352 | 1.72 |
| L1Loss | float16 | [451 128] | contiguous | sum | fwd | 58737 | 50312 | 1.17 |
| L1Loss | float16 | [451 128] | noncontiguous | sum | fwd | 62626 | 45352 | 1.38 |
| L1Loss | float16 | [603 546] | contiguous | sum | fwd | 75186 | 46934 | 1.60 |
| L1Loss | float16 | [603 546] | noncontiguous | sum | fwd | 75698 | 57193 | 1.32 |

float32

| op_name | dtype | size | contiguous | reduction | direction | ROCm | MIOpen | Improvement |
|---|---|---|---|---|---|---|---|---|
| L1Loss | float32 | [7 4] | contiguous | sum | fwd | 81298 | 51255 | 1.59 |
| L1Loss | float32 | [7 4] | noncontiguous | sum | fwd | 57249 | 44713 | 1.28 |
| L1Loss | float32 | [18 4] | contiguous | sum | fwd | 104194 | 45122 | 2.31 |
| L1Loss | float32 | [28 4] | contiguous | sum | fwd | 55697 | 46224 | 1.20 |
| L1Loss | float32 | [28 4] | noncontiguous | sum | fwd | 118723 | 44161 | 2.69 |
| L1Loss | float32 | [34 4] | noncontiguous | sum | fwd | 58033 | 46650 | 1.24 |
| L1Loss | float32 | [54 4] | contiguous | sum | fwd | 123811 | 44001 | 2.81 |
| L1Loss | float32 | [72 4] | contiguous | sum | fwd | 60945 | 43308 | 1.41 |
| L1Loss | float32 | [72 4] | noncontiguous | sum | fwd | 113218 | 43735 | 2.59 |
| L1Loss | float32 | [98 4] | noncontiguous | sum | fwd | 73282 | 39147 | 1.87 |
| L1Loss | float32 | [106 4] | contiguous | sum | fwd | 110131 | 47041 | 2.34 |
| L1Loss | float32 | [135 4] | noncontiguous | sum | fwd | 114659 | 43130 | 2.66 |
| L1Loss | float32 | [190 4] | noncontiguous | sum | fwd | 78946 | 46810 | 1.69 |
| L1Loss | float32 | [207 4] | contiguous | sum | fwd | 109475 | 41245 | 2.65 |
| L1Loss | float32 | [207 4] | noncontiguous | sum | fwd | 45905 | 43219 | 1.06 |
| L1Loss | float32 | [249 128] | noncontiguous | sum | fwd | 133555 | 42952 | 3.11 |
| L1Loss | float32 | [349 222] | contiguous | sum | fwd | 53745 | 44836 | 1.20 |
| L1Loss | float32 | [451 128] | contiguous | sum | fwd | 119347 | 44676 | 2.67 |
| L1Loss | float32 | [451 128] | noncontiguous | sum | fwd | 58114 | 44375 | 1.31 |
| L1Loss | float32 | [603 546] | contiguous | sum | fwd | 64529 | 45992 | 1.40 |
| L1Loss | float32 | [603 546] | noncontiguous | sum | fwd | 75073 | 55557 | 1.35 |

bfloat16

| op_name | dtype | size | contiguous | reduction | direction | ROCm | MIOpen | Improvement |
|---|---|---|---|---|---|---|---|---|
| L1Loss | bfloat16 | [7 4] | contiguous | sum | fwd | 52609 | 45584 | 1.15 |
| L1Loss | bfloat16 | [18 4] | contiguous | sum | fwd | 112019 | 40624 | 2.76 |
| L1Loss | bfloat16 | [18 4] | noncontiguous | sum | fwd | 113763 | 48659 | 2.34 |
| L1Loss | bfloat16 | [28 4] | noncontiguous | sum | fwd | 97154 | 46846 | 2.07 |
| L1Loss | bfloat16 | [34 4] | contiguous | sum | fwd | 85330 | 43824 | 1.95 |
| L1Loss | bfloat16 | [54 4] | contiguous | sum | fwd | 89058 | 44944 | 1.98 |
| L1Loss | bfloat16 | [54 4] | noncontiguous | sum | fwd | 99987 | 44801 | 2.23 |
| L1Loss | bfloat16 | [72 4] | noncontiguous | sum | fwd | 103539 | 44108 | 2.35 |
| L1Loss | bfloat16 | [98 4] | contiguous | sum | fwd | 79794 | 43877 | 1.82 |
| L1Loss | bfloat16 | [98 4] | noncontiguous | sum | fwd | 47489 | 42686 | 1.11 |
| L1Loss | bfloat16 | [106 4] | contiguous | sum | fwd | 128467 | 44979 | 2.86 |
| L1Loss | bfloat16 | [106 4] | noncontiguous | sum | fwd | 97250 | 45086 | 2.16 |
| L1Loss | bfloat16 | [135 4] | noncontiguous | sum | fwd | 109411 | 42953 | 2.55 |
| L1Loss | bfloat16 | [190 4] | contiguous | sum | fwd | 74913 | 47112 | 1.59 |
| L1Loss | bfloat16 | [207 4] | contiguous | sum | fwd | 116563 | 45939 | 2.54 |
| L1Loss | bfloat16 | [207 4] | noncontiguous | sum | fwd | 84306 | 45317 | 1.86 |
| L1Loss | bfloat16 | [349 222] | contiguous | sum | fwd | 79554 | 52535 | 1.51 |
| L1Loss | bfloat16 | [349 222] | noncontiguous | sum | fwd | 60081 | 44926 | 1.34 |
| L1Loss | bfloat16 | [451 128] | contiguous | sum | fwd | 58866 | 44392 | 1.33 |
| L1Loss | bfloat16 | [451 128] | noncontiguous | sum | fwd | 96659 | 44943 | 2.15 |
| L1Loss | bfloat16 | [603 546] | contiguous | sum | fwd | 78369 | 54686 | 1.43 |
| L1Loss | bfloat16 | [603 546] | noncontiguous | sum | fwd | 74386 | 54117 | 1.37 |
| L1Loss | bfloat16 | [1024 1024] | contiguous | sum | fwd | 83682 | 70349 | 1.19 |
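
The Improvement column in the tables above is consistent with the ROCm measurement divided by the MIOpen measurement, rounded to two decimals, and the per-type averages with the arithmetic mean of those ratios. The sketch below reproduces that arithmetic for the first three float16 rows. It assumes both columns are execution times in the same unit (the unit is not stated here), and the helper names are hypothetical.

```python
from statistics import mean
from typing import List, Tuple


def improvement(rocm_time: float, miopen_time: float) -> float:
    """Speedup ratio: how many times faster MIOpen is than the ROCm baseline."""
    return rocm_time / miopen_time


# (ROCm, MIOpen) pairs copied from the first three float16 rows.
rows: List[Tuple[int, int]] = [
    (124259, 44677),
    (145427, 45850),
    (72674, 42117),
]

# Per-row ratios rounded to two decimals, as in the Improvement column.
ratios = [round(improvement(rocm, miopen), 2) for rocm, miopen in rows]

# Average improvement over these rows only; the summary figures average
# the ratios of every row for a given dtype.
average = mean(ratios)

print(ratios)
print(round(average, 2))
```

A ratio above 1 means the MIOpen kernel took less time than the ROCm baseline for that configuration.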

cognaiger9 Nov 22 '24 03:11