ghidra icon indicating copy to clipboard operation
ghidra copied to clipboard

Don't change conditions to equivalent expressions during decompilation

Open pglizniewicz opened this issue 6 years ago • 10 comments

Changing conditions to equivalent ones generates a decompiler output which is harder to understand. For example this:

                         LAB_6000207c
    6000207c 83 f9 08        CMP        ECX,0x8
    6000207f 7f 0f           JG         LAB_60002090
    60002081 83 fa 08        CMP        EDX,0x8
    60002084 7e 05           JLE        LAB_6000208b
    60002086 b8 10 27        MOV        result,0x2710
             00 00
                         LAB_6000208b
    6000208b 83 f9 08        CMP        ECX,0x8
    6000208e 7e 08           JLE        LAB_60002098
                         LAB_60002090
    60002090 83 fa 08        CMP        EDX,0x8
    60002093 7f 03           JG         LAB_60002098
    60002095 c1 e0 02        SHL        result,0x2

was decompiled to:

if (iVar1 < 9) {
  if (8 < bpp) {
    result = 10000;
  }
  if (iVar1 < 9) {
    return result;
  }
}
if (bpp < 9) {
  result = result << 2;
}

}

In assembly, the variable was compared to the constant using less-equal, but in the decompile it's compared to the constant+1 using less. It's correct, but the resulting decompile is harder to understand. This code dealt with setting bit-depth of the screen and while a 8-bit bit-depth was meaningful, 9-bit bit-depth feels out of place in this context.

Ghidra should either use the same operation as in assembly or allow us to choose it via something like the "Convert" option for constants in the "Listing" view.

  • Ghidra Version: 9.0.2

pglizniewicz avatar May 03 '19 12:05 pglizniewicz

As another point on clarity/consistency, similar compare instructions decompile into expressions with differing ordering of their operands. In the above example, the first and last CMP EDX have swapped operands in the decompile - one puts the constant on the left of the expression, the other on the right.

There are some that believe any constant of an "if" expression should be on the left, as it helps catch accidental use of assignment instead of equality. But either way, the decompiled output above is inconsistent.

(Figured it might be better to camp onto this bug instead of essentially duplicating most of it.)

jtwine avatar May 04 '19 18:05 jtwine

Here is a similar issue. Consider the following: typedef struct { unsigned int n; unsigned int **r; }S; extern void init(unsigned int *x); void foo(S *p) { int i; for(i=0; i<p->n; i++) init(p->r[i]); } On Ubuntu 16.04 X86 64 bit, gcc 5.4 -O3 turns the loop into (this from objdump): 18: 49 8b 44 24 08 mov 0x8(%r12),%rax 1d: 83 c3 01 add $0x1,%ebx 20: 48 8b 3c 28 mov (%rax,%rbp,1),%rdi 24: 48 83 c5 08 add $0x8,%rbp 28: e8 00 00 00 00 callq 2d <foo+0x2d> 2d: 41 39 1c 24 cmp %ebx,(%r12) 31: 77 e5 ja 18 <foo+0x18> (The compiler turns the loop into if(p->n!=0)do{...}while(i<p->n);) The Ghidra 9.0.1 decompiler turns the condition of the test/branch at the bottom of the loop into uVar2 <= *puParm1 && *puParm1 != uVar2 (I've dropped the casts.) This amounts to turning the loop condition into i<=p->n && p->n != i. (Here uVar2 is i and puParam1 is p.)

I had to stare at this for a while to convince myself that it was correct.

  • Ghidra 9.0.1

aamcintosh avatar May 22 '19 21:05 aamcintosh

I think a better way to handle this is to allow interactively inverting the if conditions.

MrSapps avatar Jun 11 '19 11:06 MrSapps

There are some that believe any constant of an "if" expression should be on the left, as it helps catch accidental use of assignment instead of equality. But either way, the decompiled output above is inconsistent.

I think this should never be done, any modern compiler flags this as an error. Yoda conditions are less readable. Also since this is decompiler output it won't get it wrong anyway...

MrSapps avatar Jun 14 '19 14:06 MrSapps

I agree, please change -1 < voltage into voltage > -1

gyohng avatar May 14 '24 00:05 gyohng

I agree that this is still an issue. (Ghidra 11.1)

tahg avatar Sep 10 '24 14:09 tahg

Note that the "constant appears on the left side of comparisons" part of this issue will be fixed if/when #4339 is merged

LukeSerne avatar Jan 26 '25 15:01 LukeSerne

As of 11.3 the problem of the decompiler changing comparisons with a precise into, e.g., "if(constant - 1 < variable)" instead of "if (constant <= variable)" or the flipped formats with > and >= still exists.

The decompiler IMHO should never ever show an artificially modified constant in the comparison.

The modification causes a real display problem for enum types. You get the wrong enum displayed. And as the decompiler can only decode bitfields when using enum types, you even go from a 2^n constant to 2^n-1 in the source, which then gets decoded into a wild enum bit mix that has nothing to do with the proper 2^n bit.

Is there a way to fix this to never ever do the "-1" or "+1" and properly use the <= or >= at the right time instead of just < or >? In other words, preserve the constant value and adjust the comparison operation shown instead?

hwrobel avatar Mar 03 '25 20:03 hwrobel