
Shared memory failure on softmax

Open rsuderman opened this issue 6 months ago • 3 comments

When compiling the following IR: https://gist.github.com/rsuderman/c2ff931ca5ddad20f061f31f7a8847a2

We see a shared memory failure: https://gist.github.com/rsuderman/318fb1db5735d9d311060dfeebdf2cfd

rsuderman, Jun 16 '25 22:06

@pashu123 can you take a look? cc @Groverkss

MaheshRavishankar, Jun 16 '25 22:06

Using the command iree-compile test_softmax.mlir --iree-hip-target=gfx942 -o=abc.vmfb -iree-opt-level=O3 --iree-hal-target-device=hip, the problematic dispatch is https://gist.github.com/pashu123/74778fcd0526039861913d503e5b8e84

I see a gather followed by a softmax.

        %13 = linalg.generic {indexing_maps = [affine_map<(d0, d1) -> (d0)>, affine_map<(d0, d1) -> (d0)>, affine_map<(d0, d1) -> (d0, d1)>], iterator_types = ["parallel", "parallel"]} ins(%cst, %10 : tensor<4xi64>, tensor<4xi64>) outs(%12 : tensor<4x128256xf32>) {
        ^bb0(%in: i64, %in_0: i64, %out: f32):
          %16 = arith.subi %in_0, %c1_i64 : i64
          %17 = linalg.index 1 : index
          %18 = arith.cmpi slt, %in, %c0_i64 : i64
          %19 = arith.addi %in, %c4_i64 : i64
          %20 = arith.select %18, %19, %in : i64
          %21 = arith.index_cast %20 : i64 to index
          %22 = arith.cmpi slt, %16, %c0_i64 : i64
          %23 = arith.index_castui %7 : index to i64
          %24 = arith.addi %16, %23 : i64
          %25 = arith.select %22, %24, %16 : i64
          %26 = arith.index_cast %25 : i64 to index
          %extracted = tensor.extract %9[%21, %26, %17] : tensor<4x?x128256xf16>
          %27 = arith.extf %extracted : f16 to f32
          linalg.yield %27 : f32
        } -> tensor<4x128256xf32>
        %14 = linalg.softmax dimension(1) ins(%13 : tensor<4x128256xf32>) outs(%12 : tensor<4x128256xf32>) -> tensor<4x128256xf32>
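
For readers less familiar with the IR, here is a rough Python sketch of what the two ops above appear to compute. It is an illustration only, not code from IREE: the function names are hypothetical, plain nested lists stand in for tensors, and it assumes that %7 is the dynamic size of the second dimension of %9. Each cmpi/addi/select triple wraps a negative index around the dimension size, the gathered row is widened from f16 to f32, and the softmax is taken along the last dimension in its numerically stable form.

```python
import math


def wrap_index(i, size):
    """Wrap a negative index, mirroring the cmpi/addi/select triple in the IR."""
    return i + size if i < 0 else i


def gather_rows(source, batch_idx, positions):
    """Pick one row per output row from source[batch][seq][vocab].

    batch_idx plays the role of %cst and positions the role of %10.
    """
    rows = []
    for b, p in zip(batch_idx, positions):
        bi = wrap_index(b, len(source))          # wrapped by 4 in the IR
        si = wrap_index(p - 1, len(source[bi]))  # %in_0 - 1, wrapped by %7 (assumed dim size)
        rows.append([float(x) for x in source[bi][si]])  # stands in for arith.extf
    return rows


def softmax(row):
    """Numerically stable softmax over a single row."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def gather_softmax(source, batch_idx, positions):
    """Gather followed by a row-wise softmax, as in the dispatch."""
    return [softmax(r) for r in gather_rows(source, batch_idx, positions)]
```

In the real dispatch each row has 128256 f32 elements, so this sketch only conveys the data flow, not how the compiler tiles or distributes the work.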

pashu123, Jun 17 '25 08:06

Raised a patch with the fix here: https://github.com/iree-org/iree/pull/21117

pashu123, Jun 17 '25 15:06