binaryninja-api icon indicating copy to clipboard operation
binaryninja-api copied to clipboard

Issue with soundness of HLIL control flow structuring

Open plafosse opened this issue 1 year ago • 2 comments

HLIL can produce unsound control flow structuring in some conditions.

Consider this MLIL code:

orig_state_exec.zip

In MLIL everything looks correct: image

Consider the case when 'i = 0the path meansiis assigned to5and then ultimately goes to instruction21`

Now in HLIL: image The control flow is a series of if statements rather than if-else statements. In the case of i == 0 it meets the first condition and sets i = 5 and then can also satisfy the second condition too incorrectly setting var_20 = 1

Special Thanks to: Zao Yang and Stefan Nagy for their research in Decompiler Fuzzing for reporting this issue.

plafosse avatar Mar 20 '24 17:03 plafosse

This is exposing a fundamental issue in control flow restructuring. We have desire to revisit the control flow restructuring algorithm to significantly reduce the number of conditions that get out of hand and become unreadable, which is also an issue with the current algorithm. This bug will probably have to be fixed alongside a refactor of the control flow restructuring itself.

D0ntPanic avatar Apr 10 '24 19:04 D0ntPanic

A user reported a similar case:

  • BN version: 4.0.4958 Windows
  • Compiler: MSVC 2023 x64

Compile the following C code using cl /c /O2 test.c and disassemble the resulting binary in Binary Ninja:

bool test(int *input)
{
    bool result;
    if(input[0] == 1) {
        result = true;
        input[0] = 2;
        if(input[1] == 1)
            goto other;
    } else {
other:
        result = false;
    }
    return result;
}

This results in the following High Level IL output:

 int64_t test(int32_t* arg1) {
1    int64_t rax;
2    if (*arg1 == 1)
3        rax = 1;
4        *arg1 = 2;
5    if (*arg1 != 1 || (*arg1 == 1 && arg1[1] == 1))
6        rax.b = 0;
7    return rax;
 }

However, this output is not actually correct: if the function's input is { 1, 0 }, at line 4 in the HLIL, arg1[0] will be set to 2. Then, at line 5, the condition in the if statement evaluates to true (since arg1[0] now evaluates to 2), which in turn leads to line 6 being executed, and the whole function returning false. This differs from the actual behavior of the program (which should be returning true).

I see the fundamental issue is the body of the if statement is modifying the variables used in the condition, so it alters the result

xusheng6 avatar Apr 30 '24 14:04 xusheng6

Fixed in 4.1.5547

D0ntPanic avatar Jun 19 '24 17:06 D0ntPanic