reko icon indicating copy to clipboard operation
reko copied to clipboard

C-Output: return expressions directly

Open rfalke opened this issue 5 years ago • 5 comments

Subject: https://github.com/rfalke/decompiler-subjects/tree/master/from_holdec/dmi/exp/ia32_elf

Output

word32 advanced_1_boolean_minization(uint32 dwArg04, ui32 dwArg08, word32 dwArg0C)
{
	word32 eax_25;
	if (dwArg04 == 0x00 && (dwArg08 == 0x00 && dwArg0C != 0x00) || (dwArg04 == 0x00 && (dwArg08 != 0x00 && dwArg0C != 0x00) || dwArg04 != 0x00 && dwArg08 == 0x00))
		eax_25 = 0x01;
	else
		eax_25 = 0x00;
	return eax_25;
}

Expected:

word32 advanced_1_boolean_minization(uint32 dwArg04, ui32 dwArg08, word32 dwArg0C)
{
	return (dwArg04 == 0x00 && (dwArg08 == 0x00 && dwArg0C != 0x00) || (dwArg04 == 0x00 && (dwArg08 != 0x00 && dwArg0C != 0x00) || dwArg04 != 0x00 && dwArg08 == 0x00));
}

Maybe it is also an improvement to replace "0x00" with "0" to reduce the output size.

rfalke avatar Jun 30 '20 08:06 rfalke

I prefer

word32 advanced_1_boolean_minization(uint32 dwArg04, ui32 dwArg08, word32 dwArg0C)
{
	if (dwArg04 == 0x00 && (dwArg08 == 0x00 && dwArg0C != 0x00) || (dwArg04 == 0x00 && (dwArg08 != 0x00 && dwArg0C != 0x00) || dwArg04 != 0x00 && dwArg08 == 0x00))
		return 1;
	else
		return 0;
}

in this case.

ptomin avatar Jul 05 '20 17:07 ptomin

Because this case has a larger expression or in general?

So what about a simpler case?

Current output:

word32 intermediate_1_ternary_blocks(Eq_11 dwArg04)
{
        word32 eax_10;
        if (dwArg04 == 0x2A)
                eax_10 = 0x00;
        else
                eax_10 = 0x01;
        return eax_10;
}

I would prefer:

word32 intermediate_1_ternary_blocks(Eq_11 dwArg04)
{
        return dwArg04 != 0x2A;
}

or

word32 intermediate_1_ternary_blocks(Eq_11 dwArg04)
{
        return dwArg04 != 0x2A ? 1 : 0;
}

or

word32 intermediate_1_ternary_blocks(Eq_11 dwArg04)
{
        return _bool_to_int(dwArg04 != 0x2A);
}

The main improvement IMHO is that you can see that both branches of the if-else assign to the same variable. There may also be a language which allows something like the following which also would capture this (but it is still longer than the others above and leaves the well known C paths):

word32 intermediate_1_ternary_blocks(Eq_11 dwArg04)
{
        word32 eax_10 = (if (dwArg04 == 0x2A) 0x00 else 0x01);
        return eax_10;
}

rfalke avatar Jul 07 '20 06:07 rfalke

I think decompiled code should be as close to original as possible. So I would prefer

word32 intermediate_1_ternary_blocks(Eq_11 dwArg04)
{
        if (dwArg04 == 0x2A)
                retutn 0;
        else
                return 1;
}

if there is conditional jump in assembly code and

word32 intermediate_1_ternary_blocks(Eq_11 dwArg04)
{
        return dwArg04 != 0x2A;
}

if there is no jumps.

ptomin avatar Jul 09 '20 17:07 ptomin

So for you "original" means the assembler. I would argue that different compilers translate the same source code to different assembler instructions and I would like to read an output code which doesn't have this variants. This would make the output easier to read. So something like "same semantics should result in the same output code".

rfalke avatar Jul 10 '20 09:07 rfalke

Issue is still in version 0.11.4.0-931ca7d.

rfalke avatar Aug 26 '23 21:08 rfalke