
Implement IndexSelect

Open cognaiger9 opened this issue 10 months ago • 0 comments

  • Add IndexSelect operation with forward and backward kernels.
  • Add driver and gtest for kernels.
  • MIOpen performs better than ROCm if:
    • The number of output elements is less than 100,000
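For readers unfamiliar with the operation, here is a minimal pure-Python reference sketch. It assumes IndexSelect follows the usual index_select semantics (as in PyTorch's `torch.index_select`): the forward pass gathers slices along `dim`, and the backward pass scatter-adds gradients back, accumulating on repeated indices. The function names and the flat row-major layout are illustrative assumptions, not the MIOpen kernels or API.

```python
from itertools import product
from math import prod


def _row_major_strides(shape):
    # Stride of each axis for a contiguous row-major tensor.
    return [prod(shape[i + 1:]) for i in range(len(shape))]


def index_select_forward(x, shape, dim, indices):
    """Gather slices of the flat row-major list `x` along `dim` (hypothetical helper)."""
    out_shape = list(shape)
    out_shape[dim] = len(indices)
    strides = _row_major_strides(shape)
    out = []
    for coord in product(*(range(n) for n in out_shape)):
        src = list(coord)
        src[dim] = indices[coord[dim]]  # redirect the selected axis
        out.append(x[sum(c * s for c, s in zip(src, strides))])
    return out, out_shape


def index_select_backward(grad_out, shape, dim, indices):
    """Scatter-add `grad_out` into a zero tensor of the input shape (hypothetical helper)."""
    out_shape = list(shape)
    out_shape[dim] = len(indices)
    strides = _row_major_strides(shape)
    grad_in = [0.0] * prod(shape)
    for flat, coord in enumerate(product(*(range(n) for n in out_shape))):
        dst = list(coord)
        dst[dim] = indices[coord[dim]]
        # Repeated indices accumulate into the same input element.
        grad_in[sum(c * s for c, s in zip(dst, strides))] += grad_out[flat]
    return grad_in


x = list(range(6))  # a 2x3 tensor: [[0, 1, 2], [3, 4, 5]]
y, y_shape = index_select_forward(x, [2, 3], 1, [2, 0, 2])
print(y)  # [2, 0, 2, 5, 3, 5]
print(index_select_backward([1.0] * 6, [2, 3], 1, [2, 0, 2]))
# [1.0, 0.0, 2.0, 1.0, 0.0, 2.0]

# Output element count for one benchmarked case: input [100 50 20], 10 indices, dim 0.
_, big_shape = index_select_forward([0.0] * 100000, [100, 50, 20], 0, list(range(10)))
print(prod(big_shape))  # 10000
```

The last call shows how the output element count in the condition above can be derived: the output keeps the input shape except that `dim` is replaced by the number of indices.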

Average improvement over ROCm

| type | fwd | bwd |
| --- | --- | --- |
| float16 | 1.6 | 1.94 |
| float32 | 1.67 | 1.55 |
| bfloat16 | 1.63 | 1.76 |
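The issue does not state how these figures are derived. The numbers are consistent with Improvement being the ROCm timing divided by the MIOpen timing, and with each average being the arithmetic mean of the per-row ratios. A small sketch under that assumption, using the (ROCm, MIOpen) pairs from the float16 (forward) table:

```python
from statistics import mean

# (ROCm, MIOpen) timings copied from the float16 (forward) rows of the detailed benchmark.
float16_fwd = [
    (12896, 10737), (13168, 8124), (19728, 9066), (16320, 8675),
    (16832, 9191), (10576, 8871), (11504, 9048), (13056, 9368),
    (12288, 8337), (18160, 9990), (15456, 8213), (15824, 9031),
    (11056, 8746), (18096, 10364),
]


def improvement(rocm, miopen):
    # Assumed definition: values above 1 mean MIOpen took less time than ROCm.
    return rocm / miopen


ratios = [improvement(rocm, miopen) for rocm, miopen in float16_fwd]
print(round(ratios[0], 2))    # 1.2
print(round(mean(ratios), 1)) # 1.6
```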

Detailed benchmark

float16 (forward)

| op_name | dtype | input | indices | cont | dim | direction | ROCm | MIOpen | Improvement |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| IndexSelect | float16 | [16 16 16] | [5] | noncont | 0 | fwd | 12896 | 10737 | 1.20 |
| IndexSelect | float16 | [16 32 32] | [16] | cont | 0 | fwd | 13168 | 8124 | 1.62 |
| IndexSelect | float16 | [16 32 32] | [16] | noncont | 0 | fwd | 19728 | 9066 | 2.18 |
| IndexSelect | float16 | [16 32 32] | [16] | cont | 1 | fwd | 16320 | 8675 | 1.88 |
| IndexSelect | float16 | [16 32 32] | [16] | noncont | 1 | fwd | 16832 | 9191 | 1.83 |
| IndexSelect | float16 | [32 32 32] | [5] | cont | 1 | fwd | 10576 | 8871 | 1.19 |
| IndexSelect | float16 | [32 32 32] | [5] | noncont | 1 | fwd | 11504 | 9048 | 1.27 |
| IndexSelect | float16 | [32 32 32] | [5] | noncont | 2 | fwd | 13056 | 9368 | 1.39 |
| IndexSelect | float16 | [100 50 20] | [10] | cont | 0 | fwd | 12288 | 8337 | 1.47 |
| IndexSelect | float16 | [100 50 20] | [10] | noncont | 0 | fwd | 18160 | 9990 | 1.82 |
| IndexSelect | float16 | [100 50 20] | [10] | cont | 1 | fwd | 15456 | 8213 | 1.88 |
| IndexSelect | float16 | [100 50 20] | [10] | noncont | 1 | fwd | 15824 | 9031 | 1.75 |
| IndexSelect | float16 | [100 50 20] | [10] | cont | 2 | fwd | 11056 | 8746 | 1.26 |
| IndexSelect | float16 | [100 50 20] | [10] | noncont | 2 | fwd | 18096 | 10364 | 1.75 |

float32 (forward)

| op_name | dtype | input | indices | cont | dim | direction | ROCm | MIOpen | Improvement |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| IndexSelect | float32 | [16 16 16] | [5] | noncont | 0 | fwd | 12720 | 9439 | 1.35 |
| IndexSelect | float32 | [16 16 16] | [5] | noncont | 2 | fwd | 12176 | 9457 | 1.29 |
| IndexSelect | float32 | [16 32 32] | [16] | cont | 0 | fwd | 13152 | 8159 | 1.61 |
| IndexSelect | float32 | [16 32 32] | [16] | noncont | 0 | fwd | 19408 | 9155 | 2.12 |
| IndexSelect | float32 | [16 32 32] | [16] | cont | 1 | fwd | 17120 | 8408 | 2.04 |
| IndexSelect | float32 | [16 32 32] | [16] | noncont | 1 | fwd | 17728 | 8746 | 2.03 |
| IndexSelect | float32 | [32 32 32] | [5] | cont | 2 | fwd | 10064 | 8871 | 1.13 |
| IndexSelect | float32 | [32 32 32] | [5] | noncont | 2 | fwd | 13216 | 9262 | 1.43 |
| IndexSelect | float32 | [100 50 20] | [10] | cont | 0 | fwd | 12416 | 8462 | 1.47 |
| IndexSelect | float32 | [100 50 20] | [10] | noncont | 0 | fwd | 19136 | 9155 | 2.09 |
| IndexSelect | float32 | [100 50 20] | [10] | cont | 1 | fwd | 15887 | 8284 | 1.92 |
| IndexSelect | float32 | [100 50 20] | [10] | noncont | 1 | fwd | 16016 | 9546 | 1.68 |
| IndexSelect | float32 | [100 50 20] | [10] | cont | 2 | fwd | 12224 | 8835 | 1.38 |
| IndexSelect | float32 | [100 50 20] | [10] | noncont | 2 | fwd | 18704 | 9884 | 1.89 |

bfloat16 (forward)

| op_name | dtype | input | indices | cont | dim | direction | ROCm | MIOpen | Improvement |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| IndexSelect | bfloat16 | [16 16 16] | [5] | noncont | 0 | fwd | 12864 | 10364 | 1.24 |
| IndexSelect | bfloat16 | [16 32 32] | [16] | cont | 0 | fwd | 12976 | 8266 | 1.57 |
| IndexSelect | bfloat16 | [16 32 32] | [16] | noncont | 0 | fwd | 19232 | 9031 | 2.13 |
| IndexSelect | bfloat16 | [16 32 32] | [16] | cont | 1 | fwd | 16576 | 7768 | 2.13 |
| IndexSelect | bfloat16 | [16 32 32] | [16] | noncont | 1 | fwd | 16608 | 8942 | 1.86 |
| IndexSelect | bfloat16 | [32 32 32] | [5] | cont | 0 | fwd | 9040 | 9315 | 0.97 |
| IndexSelect | bfloat16 | [32 32 32] | [5] | noncont | 0 | fwd | 12208 | 9244 | 1.32 |
| IndexSelect | bfloat16 | [100 50 20] | [10] | cont | 0 | fwd | 12048 | 8764 | 1.37 |
| IndexSelect | bfloat16 | [100 50 20] | [10] | noncont | 0 | fwd | 17856 | 9386 | 1.90 |
| IndexSelect | bfloat16 | [100 50 20] | [10] | cont | 1 | fwd | 15536 | 8231 | 1.89 |
| IndexSelect | bfloat16 | [100 50 20] | [10] | noncont | 1 | fwd | 15696 | 8906 | 1.76 |
| IndexSelect | bfloat16 | [100 50 20] | [10] | cont | 2 | fwd | 11056 | 8728 | 1.27 |
| IndexSelect | bfloat16 | [100 50 20] | [10] | noncont | 2 | fwd | 18208 | 10239 | 1.78 |

float16 (backward)

| op_name | dtype | input | indices | cont | dim | direction | ROCm | MIOpen | Improvement |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| IndexSelect | float16 | [16 16 16] | [5] | cont | 0 | bwd | 66127 | 23928 | 2.76 |
| IndexSelect | float16 | [16 16 16] | [5] | noncont | 0 | bwd | 73551 | 26595 | 2.77 |
| IndexSelect | float16 | [16 32 32] | [16] | cont | 0 | bwd | 42000 | 25653 | 1.64 |
| IndexSelect | float16 | [16 32 32] | [16] | noncont | 0 | bwd | 57120 | 26221 | 2.18 |
| IndexSelect | float16 | [16 32 32] | [16] | cont | 2 | bwd | 39765 | 27057 | 1.47 |
| IndexSelect | float16 | [16 32 32] | [16] | noncont | 2 | bwd | 56479 | 26151 | 2.16 |
| IndexSelect | float16 | [32 32 32] | [5] | cont | 0 | bwd | 30656 | 23644 | 1.30 |
| IndexSelect | float16 | [32 32 32] | [5] | noncont | 0 | bwd | 40944 | 26186 | 1.56 |
| IndexSelect | float16 | [100 50 20] | [10] | cont | 0 | bwd | 37056 | 25475 | 1.45 |
| IndexSelect | float16 | [100 50 20] | [10] | noncont | 0 | bwd | 62303 | 27324 | 2.28 |
| IndexSelect | float16 | [100 50 20] | [10] | cont | 1 | bwd | 38160 | 23110 | 1.65 |
| IndexSelect | float16 | [100 50 20] | [10] | noncont | 1 | bwd | 56863 | 23502 | 2.42 |
| IndexSelect | float16 | [100 50 20] | [10] | noncont | 2 | bwd | 50831 | 32497 | 1.56 |

float32 (backward)

| op_name | dtype | input | indices | cont | dim | direction | ROCm | MIOpen | Improvement |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| IndexSelect | float32 | [16 16 16] | [5] | cont | 0 | bwd | 33808 | 25333 | 1.33 |
| IndexSelect | float32 | [16 16 16] | [5] | noncont | 0 | bwd | 39168 | 24213 | 1.62 |
| IndexSelect | float32 | [16 32 32] | [16] | cont | 0 | bwd | 28800 | 23555 | 1.22 |
| IndexSelect | float32 | [16 32 32] | [16] | noncont | 0 | bwd | 48048 | 25670 | 1.87 |
| IndexSelect | float32 | [16 32 32] | [16] | cont | 1 | bwd | 34592 | 26275 | 1.32 |
| IndexSelect | float32 | [16 32 32] | [16] | noncont | 1 | bwd | 44592 | 27377 | 1.63 |
| IndexSelect | float32 | [32 32 32] | [5] | noncont | 0 | bwd | 55616 | 24213 | 2.30 |
| IndexSelect | float32 | [100 50 20] | [10] | noncont | 0 | bwd | 36624 | 25795 | 1.42 |
| IndexSelect | float32 | [100 50 20] | [10] | noncont | 1 | bwd | 43904 | 27377 | 1.60 |
| IndexSelect | float32 | [100 50 20] | [10] | noncont | 2 | bwd | 36464 | 29884 | 1.22 |

bfloat16 (backward)

| op_name | dtype | input | indices | cont | dim | direction | ROCm | MIOpen | Improvement |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| IndexSelect | bfloat16 | [16 16 16] | [5] | cont | 0 | bwd | 55472 | 30346 | 1.83 |
| IndexSelect | bfloat16 | [16 16 16] | [5] | noncont | 0 | bwd | 51999 | 31857 | 1.63 |
| IndexSelect | bfloat16 | [16 32 32] | [16] | cont | 0 | bwd | 49888 | 24942 | 2.00 |
| IndexSelect | bfloat16 | [16 32 32] | [16] | noncont | 0 | bwd | 55712 | 25119 | 2.22 |
| IndexSelect | bfloat16 | [16 32 32] | [16] | cont | 1 | bwd | 50352 | 28515 | 1.77 |
| IndexSelect | bfloat16 | [16 32 32] | [16] | noncont | 1 | bwd | 58975 | 27235 | 2.17 |
| IndexSelect | bfloat16 | [32 32 32] | [5] | cont | 1 | bwd | 33392 | 24835 | 1.34 |
| IndexSelect | bfloat16 | [32 32 32] | [5] | noncont | 1 | bwd | 41488 | 25280 | 1.64 |
| IndexSelect | bfloat16 | [100 50 20] | [10] | cont | 0 | bwd | 40416 | 26257 | 1.54 |
| IndexSelect | bfloat16 | [100 50 20] | [10] | noncont | 0 | bwd | 43488 | 24373 | 1.78 |
| IndexSelect | bfloat16 | [100 50 20] | [10] | cont | 1 | bwd | 39376 | 26950 | 1.46 |
| IndexSelect | bfloat16 | [100 50 20] | [10] | noncont | 1 | bwd | 51776 | 25671 | 2.02 |
| IndexSelect | bfloat16 | [100 50 20] | [10] | noncont | 2 | bwd | 42688 | 27661 | 1.54 |

cognaiger9, Mar 10 '25 10:03