ScratchABlock icon indicating copy to clipboard operation
ScratchABlock copied to clipboard

"while" structring loses induction statements, was: "if" structuring doesn't work properly in tricky CFGs

Open maximumspatium opened this issue 6 years ago • 6 comments

SAB fails to decompile the following real-world code:

sub_201100:
$cr0.eq = (i32)$r4 == 0
$cr0.lt = (i32)$r4 < 0
$cr0.gt = (i32)$r4 > 0
$r11 = 0
if ($cr0.eq) goto loc_20114C

loc_20110C:
$ctr = $r4
$r12 = $r3 - 1
goto loc_201128

loc_201118:
if ($cr0.gt) goto loc_201124

loc_20111C:
$r0 = $r9 + 0x1
$r10 = $r0 & 0xff

loc_201124:
$r11 = $r10

loc_201128:
$ea = ($r12 + 0x1)
$r3 = *(u8*)$ea
$r12 = $ea
$r9 = $r3 + $r11
$r10 = $r9 & 0xff
$cr0.eq = (i32)$r10 == (i32)$r11
$cr0.lt = (i32)$r10 < (i32)$r11
$cr0.gt = (i32)$r10 > (i32)$r11
$ctr -= 1
if ($ctr != 0) goto loc_201118

loc_20113C:
if ($cr0.gt) goto loc_201148

loc_201140:
$r4 = $r9 + 0x1
$r10 = $r4 & 0xff

loc_201148:
$r11 = $r10

loc_20114C:
$r3 = $r11
return

It returns

AttributeError: 'REG' object has no attribute 'op'

After some debugging, I was able to spot the place of the failure: match_if is unable to negate the following condition:

edge('13','20') {
    'cond': COND($cr0.gt_0)
}

My bet is that $cr0.gt cannot be substituted because its value changes across basic blocks. So it's left as is. Finally, match_if() fails because it expects COND to be an expression, not REG.

Any idea how to fix that?

maximumspatium avatar Jun 06 '18 23:06 maximumspatium

Thanks for the report. I'm travelling most of this month, so not sure how soon I'll be able to look into details of this. So, usual questions would: a) complete stack trace; b) did you try to make a minimal reproduction testcase.

AttributeError: 'REG' object has no attribute 'op' if ($cr0.gt) goto loc_201148 https://github.com/pfalcon/ScratchABlock/blob/a1b5cce7e6b6269b8061195b13c678e50a4b19fc/decomp.py#L117-L118

Taking literally what you provide, it would mean that neg() doesn't know how to negate a REG. You can quickly check if that's true by:

if ($cr0.gt != 0) goto loc_201148

If that works as expected, then well, we need to teach neg() to be able to negate REGs like that. (It very well could be, because Xtensa I worked with is a classical RISC without flag register, so all conditions are indeed of the form $foo == bar or $foo != bar, etc.)

pfalcon avatar Jun 07 '18 21:06 pfalcon

if ($cr0.gt != 0) goto loc_201148

I commented the above mentioned line out but I still cannot get the code correctly decompiled. Give me a day or two to figure out what's going wrong and to prepare a test case.

maximumspatium avatar Jul 09 '18 12:07 maximumspatium

Give me a day or two to figure out what's going wrong and to prepare a test case.

Sounds good. I'm also overbusy with real life and didn't have a chance to look at it, but slowly coming back to hacking now (looking at other projects so far), so hope to get to it in few weeks too.

pfalcon avatar Jul 13 '18 20:07 pfalcon

Sorry for the delay with looking into this. So, fixing the crash for trivial: 9e2e8861a62a9c1357a7247d9ee7bd7f0e60e748

But then structuring pass gives out rather nonsense. It detects while loop, even though to naked eye it's do-while with goto inside (so, maybe it's indeed optimized while, and SAB is smart!), but loses control variable of the loop, $ctr.

So, I'm not closing this, but changing title, will be looking further.

pfalcon avatar Nov 05 '18 19:11 pfalcon

Oh, forgot suggestions:

  1. Having the example reproduction steps (command line, ...) can only help.
  2. Please-please add "addresses" to your inputs, this can be just line numbers/whatever, just make them explicit. Otherwise cross-referencing results of passes and original code is a pain.

pfalcon avatar Nov 05 '18 20:11 pfalcon

And to capture the current output I get:

.ENTRY:
// $r3 = $r3_0 // {'uses': ['10']} (dead)
// $r4 = $r4_0 // {'uses': ['2', '3', '4', '9']} (dead)
// $cr0.eq = (i32)$r4_0 == 0 // {'uses': ['6']} (dead)
// $cr0.lt = (i32)$r4_0 < 0 // {'uses': []} (dead)
// $cr0.gt = (i32)$r4_0 > 0 // {'uses': []} (dead)
$r11 = 0 // {'uses': ['27', '29', '30', '31', '46']}
if ((i32)$r4_0 != 0) {
  $ctr = $r4_0 // {'uses': ['32']}
  $r12 = $r3_0 - 1 // {'uses': ['24']}
  while ($ctr != 0) {
    if ($cr0.gt == 0) {
      // $r0 = $r3 + $r11 + 0x1 // {'uses': ['18']} (dead)
      $r10 = ($r3 + $r11 + 0x1) & 0xff // {'uses': ['21']}
    }
    $r11 = $r10 // {'uses': ['27', '29', '30', '31']}
  }
  if ($cr0.gt == 0) {
    // $r4 = $r3 + $r11 + 0x1 // {'uses': ['40']} (dead)
    $r10 = ($r3 + $r11 + 0x1) & 0xff // {'uses': ['43']}
  }
  $r11 = $r10 // {'uses': ['46']}
}
// $r3 = $r11 // {'uses': []} (dead)

pfalcon avatar Nov 05 '18 20:11 pfalcon