
Dimension mismatch after the bubble-up-expand-shapes pass

[Open] YashDeshpande25 opened this issue 6 months ago · 5 comments

In the MakeSingleDispatchPipeline we have tensor.pad + convs, and in the presence of unit dims we drop the unit dims. But when we bubble up expand shapes, we actually sink a collapse_shape through the pad, and the collapse then sits between the pad and the conv, which is not supported in codegen as a single dispatch.

Link to the full IR dump

Link to the gist containing the complete IR referenced here.

For this input IR:

  %cst = arith.constant 0.000000e+00 : bf16
  %cst_0 = arith.constant 0.000000e+00 : f32
  %0 = hal.tensor.import wait(%arg2) => %arg0 : !hal.buffer_view -> tensor<16x1x30x576xbf16>
  %1 = hal.tensor.import wait(%arg2) => %arg1 : !hal.buffer_view -> tensor<576x1x3x576xbf16>
  %collapsed = tensor.collapse_shape %0 [[0, 1], [2], [3]] : tensor<16x1x30x576xbf16> into tensor<16x30x576xbf16>
  %padded = tensor.pad %collapsed low[0, 1, 0] high[0, 1, 0] {
  ^bb0(%arg4: index, %arg5: index, %arg6: index):
    tensor.yield %cst : bf16
  } : tensor<16x30x576xbf16> to tensor<16x32x576xbf16>
  %collapsed_1 = tensor.collapse_shape %1 [[0, 1], [2], [3]] : tensor<576x1x3x576xbf16> into tensor<576x3x576xbf16>
  %2 = tensor.empty() : tensor<16x30x576xf32>
  %3 = linalg.fill ins(%cst_0 : f32) outs(%2 : tensor<16x30x576xf32>) -> tensor<16x30x576xf32>
  %4 = linalg.generic {
      indexing_maps = [affine_map<(d0, d1, d2, d3, d4) -> (d0, d1 + d3, d4)>,
                       affine_map<(d0, d1, d2, d3, d4) -> (d2, d3, d4)>,
                       affine_map<(d0, d1, d2, d3, d4) -> (d0, d1, d2)>],
      iterator_types = ["parallel", "parallel", "parallel", "reduction", "reduction"]}
      ins(%padded, %collapsed_1 : tensor<16x32x576xbf16>, tensor<576x3x576xbf16>)
      outs(%3 : tensor<16x30x576xf32>)
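The region of the generic is elided in the dump above; for a bf16 x bf16 -> f32 convolution it would typically be an extend-multiply-accumulate body along these lines (a sketch only, the exact SSA names are illustrative):

      ^bb0(%in: bf16, %in_2: bf16, %out: f32):
        // widen both bf16 operands to f32, multiply, and accumulate into the f32 output
        %5 = arith.extf %in : bf16 to f32
        %6 = arith.extf %in_2 : bf16 to f32
        %7 = arith.mulf %5, %6 : f32
        %8 = arith.addf %out, %7 : f32
        linalg.yield %8 : f32
      } -> tensor<16x30x576xf32>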

using the following command:

iree-opt --pass-pipeline="builtin.module(func.func(iree-dispatch-creation-bubble-up-expand-shapes),canonicalize,cse)" test.mlir

Resultant IR

    %padded = tensor.pad %0 low[0, 0, 1, 0] high[0, 0, 1, 0] {
    ^bb0(%arg4: index, %arg5: index, %arg6: index, %arg7: index):
      tensor.yield %cst : bf16
    } : tensor<16x1x30x576xbf16> to tensor<16x1x32x576xbf16>
    %collapsed = tensor.collapse_shape %padded [[0, 1], [2], [3]] : tensor<16x1x32x576xbf16> into tensor<16x32x576xbf16>
    %collapsed_1 = tensor.collapse_shape %1 [[0, 1], [2], [3]] : tensor<576x1x3x576xbf16> into tensor<576x3x576xbf16>
    %2 = tensor.empty() : tensor<16x30x576xf32>
    %3 = linalg.fill ins(%cst_0 : f32) outs(%2 : tensor<16x30x576xf32>) -> tensor<16x30x576xf32>
    %4 = linalg.generic {
        indexing_maps = [#map, #map1, #map2],
        iterator_types = ["parallel", "parallel", "parallel", "reduction", "reduction"]}
        ins(%collapsed, %collapsed_1 : tensor<16x32x576xbf16>, tensor<576x3x576xbf16>)
        outs(%3 : tensor<16x30x576xf32>) {
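For reference, the #map aliases printed above presumably expand to the same indexing maps as in the input IR, since the pass only reshapes the padded operand:

    #map = affine_map<(d0, d1, d2, d3, d4) -> (d0, d1 + d3, d4)>
    #map1 = affine_map<(d0, d1, d2, d3, d4) -> (d2, d3, d4)>
    #map2 = affine_map<(d0, d1, d2, d3, d4) -> (d0, d1, d2)>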

In the output IR after the pass, %padded keeps the extra unit dimension, and the collapse_shape now sits between the pad and the conv.

YashDeshpande25 · Jun 06 '25 20:06

@MaheshRavishankar and @IanWood1 do you guys have thoughts on how to solve this? I was thinking we could have an option to not sink collapse_shape through pad and use it in the MakeSingleDispatchPipeline, or should some other pattern be taking care of this?

nirvedhmeshram · Jun 06 '25 21:06

I see, the reshape can't propagate past the conv due to its non-projected-permutation indexing maps.
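Concretely, from the maps in the IR above: the filter and output maps are projected permutations, but the image map is not, because the conv adds two loop dims:

    affine_map<(d0, d1, d2, d3, d4) -> (d2, d3, d4)>        // filter: projected permutation
    affine_map<(d0, d1, d2, d3, d4) -> (d0, d1, d2)>        // output: projected permutation
    affine_map<(d0, d1, d2, d3, d4) -> (d0, d1 + d3, d4)>   // image: d1 + d3 is not a projected permutation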

I was thinking we could have an option to not sink collapseshape through pad

I think that makes sense. But convs in general might have a problem with reshape propagation.

Also, I had a PR https://github.com/iree-org/iree/pull/20108 to not propagate these when they are on the edges. I think it just needs a re-review from Mahesh.

IanWood1 · Jun 06 '25 21:06

I thought #20108 had API dependency concerns from Ben, so we decided not to go through with it? Edit: oh, I see Ben has approved it.

nirvedhmeshram · Jun 06 '25 22:06

I'll review this on Monday. We should be able to land it overall; I think I just need a bit more clarity on all the checks.

MaheshRavishankar · Jun 06 '25 23:06

Sounds good! Is it still useful to also have the option to not sink through the pad, since we might not always have it as an edge op, e.g. in backward convolution cases?

nirvedhmeshram · Jun 07 '25 00:06

Confirmed that after the PR from Ian this issue isn't happening.

nirvedhmeshram · Jun 30 '25 16:06