heir icon indicating copy to clipboard operation
heir copied to clipboard

test: secret_inserts.mlir is flakey

Open asraa opened this issue 1 year ago • 5 comments

At head in CI I see that tests/convert_secret_insert_to_static_insert/secret_inserts.mlir flakily fails (maybe 2/10 runs).

Here's the two outputs for the @insert_and_sum file:

  func.func @insert_and_sum(%arg0: !secret.secret<tensor<32xi16>>, %arg1: !secret.secret<index>) -> !secret.secret<tensor<32xi16>> {
    %c0_i16 = arith.constant 0 : i16
    %0 = secret.generic ins(%arg0, %arg1 : !secret.secret<tensor<32xi16>>, !secret.secret<index>) {
    ^bb0(%arg2: tensor<32xi16>, %arg3: index):
      %1:2 = affine.for %arg4 = 0 to 32 iter_args(%arg5 = %c0_i16, %arg6 = %arg2) -> (i16, tensor<32xi16>) {
        %extracted = tensor.extract %arg6[%arg4] : tensor<32xi16>
        %3 = arith.addi %arg5, %extracted : i16
        %4 = affine.for %arg7 = 0 to 32 iter_args(%arg8 = %arg6) -> (tensor<32xi16>) {
          %5 = arith.cmpi eq, %arg7, %arg3 : index
          %6 = affine.for %arg9 = 0 to 32 iter_args(%arg10 = %arg8) -> (tensor<32xi16>) {
            %8 = arith.cmpi eq, %arg9, %arg7 : index
            %inserted = tensor.insert %3 into %arg10[%arg9] : tensor<32xi16>
            %9 = scf.if %8 -> (tensor<32xi16>) {
              scf.yield %inserted : tensor<32xi16>
            } else {
              scf.yield %arg10 : tensor<32xi16>
            }
            affine.yield %9 : tensor<32xi16>
          }
          %7 = scf.if %5 -> (tensor<32xi16>) {
            scf.yield %6 : tensor<32xi16>
          } else {
            scf.yield %arg8 : tensor<32xi16>
          }
          affine.yield %7 : tensor<32xi16>
        }
        affine.yield %3, %4 : i16, tensor<32xi16>
      }
      %2 = affine.for %arg4 = 0 to 32 iter_args(%arg5 = %1#1) -> (tensor<32xi16>) {
        %3 = arith.cmpi eq, %arg4, %arg3 : index
        %inserted = tensor.insert %1#0 into %arg5[%arg4] : tensor<32xi16>
        %4 = scf.if %3 -> (tensor<32xi16>) {
          scf.yield %inserted : tensor<32xi16>
        } else {
          scf.yield %arg5 : tensor<32xi16>
        }
        affine.yield %4 : tensor<32xi16>
      }
      secret.yield %2 : tensor<32xi16>
    } -> !secret.secret<tensor<32xi16>>
    return %0 : !secret.secret<tensor<32xi16>>
  }

or

  func.func @insert_and_sum(%arg0: !secret.secret<tensor<32xi16>>, %arg1: !secret.secret<index>) -> !secret.secret<tensor<32xi16>> {
    %c0_i16 = arith.constant 0 : i16
    %0 = secret.generic ins(%arg0, %arg1 : !secret.secret<tensor<32xi16>>, !secret.secret<index>) {
    ^bb0(%arg2: tensor<32xi16>, %arg3: index):
      %1:2 = affine.for %arg4 = 0 to 32 iter_args(%arg5 = %c0_i16, %arg6 = %arg2) -> (i16, tensor<32xi16>) {
        %extracted = tensor.extract %arg6[%arg4] : tensor<32xi16>
        %3 = arith.addi %arg5, %extracted : i16
        %4 = affine.for %arg7 = 0 to 32 iter_args(%arg8 = %arg6) -> (tensor<32xi16>) {
          %5 = arith.cmpi eq, %arg7, %arg3 : index
          %inserted = tensor.insert %3 into %arg8[%arg7] : tensor<32xi16>
          %6 = scf.if %5 -> (tensor<32xi16>) {
            scf.yield %inserted : tensor<32xi16>
          } else {
            scf.yield %arg8 : tensor<32xi16>
          }
          affine.yield %6 : tensor<32xi16>
        }
        affine.yield %3, %4 : i16, tensor<32xi16>
      }
      %2 = affine.for %arg4 = 0 to 32 iter_args(%arg5 = %1#1) -> (tensor<32xi16>) {
        %3 = arith.cmpi eq, %arg4, %arg3 : index
        %inserted = tensor.insert %1#0 into %arg5[%arg4] : tensor<32xi16>
        %4 = scf.if %3 -> (tensor<32xi16>) {
          scf.yield %inserted : tensor<32xi16>
        } else {
          scf.yield %arg5 : tensor<32xi16>
        }
        affine.yield %4 : tensor<32xi16>
      }
      secret.yield %2 : tensor<32xi16>
    } -> !secret.secret<tensor<32xi16>>
    return %0 : !secret.secret<tensor<32xi16>>
  }

asraa avatar Sep 25 '24 15:09 asraa

When I remove the first function of that test, the test is no longer flakey!

asraa avatar Sep 25 '24 16:09 asraa

I'm guessing there's a secretness lattice issue then - somehow it's probably recalculating the secretness analysis when the other function is present to make the secretness of the static index var that's looped over back to secret

I found evidence by this by printing out when the secretness of the for op's induction variable was and wasn't secret, in the incorrect cases, it applied the transform again because the induction var was somehow secret.

asraa avatar Sep 25 '24 16:09 asraa

@asraa Do you have a fix in the works for this, or should I start looking at this?

AlexanderViand-Intel avatar Oct 02 '24 00:10 AlexanderViand-Intel

@asraa Do you have a fix in the works for this, or should I start looking at this?

My fix was to move this https://github.com/google/heir/blob/d58ccf8dde341803f69d2eef8fb796a19cba829e/lib/Transforms/ConvertSecretInsertToStaticInsert/ConvertSecretInsertToStaticInsert.cpp#L80 to the end of the pattern, thinking that the propogation for the later lines was perhaps interfering, but honestly, at some point I was unable to reproduce the flake without any fixes.

If you can reliably repro the flake, please check! I ran it for like 500 times and couldn't get a failure anymore.

asraa avatar Oct 02 '24 15:10 asraa

This issue has 1 outstanding TODOs:

This comment was autogenerated by todo-backlinks

github-actions[bot] avatar Oct 07 '24 14:10 github-actions[bot]