retdec
[bin2llvmir] AARCH64 C++ Class constructor generates @__decompiler_undefined_function_0
When running retdec-decompiler.py --no-memory-limit --backend-no-opts on an ELF binary built for AARCH64/FreeBSD, I get several calls to @__decompiler_undefined_function_0 injected at the beginning of class member functions.
File format : ELF
File class : 64-bit
File type : DLL
Architecture : ARM AARCH64
Endianness : Little endian
Entry point address : 0
Entry point offset : 0x788
Entry point section name : .text
Entry point section index: 1
Bytes on entry point : 00000000080000004d4f443030bdc600f85fc70038d2311178e8a90054b9ad00f85fc700f30f1ef8fd7b01a9fd430091ffc3
Detected tool : gc (compiler), 33 from 191 significant nibbles (17.2775%)
Here is a disassembled C++ class constructor :
movn x8, #0xffff, lsl #16
movk x8, #0x5c0
stp x8, xzr, [x0]
str wzr, [x0, #0x10]
ret
This seems to perform some member initialization: x0 is supposed to hold the object address, hence the stores relative to it. But in the LLVM IR I get:
%0 = call i64 @__decompiler_undefined_function_0()
%1 = inttoptr i64 %0 to i64*, !insn.addr !482175
store i64 -4294965824, i64* %1, align 8, !insn.addr !482175
%2 = add i64 %0, 8, !insn.addr !482175
%3 = inttoptr i64 %2 to i64*, !insn.addr !482175
store i64 0, i64* %3, align 8, !insn.addr !482175
%4 = add i64 %0, 16, !insn.addr !482176
%5 = inttoptr i64 %4 to i32*, !insn.addr !482176
store i32 0, i32* %5, align 4, !insn.addr !482176
ret i64 %0, !insn.addr !482177
Which gets translated to C as:
int64_t result; // 0x48465c
// 0x48465c
*(int64_t *)result = -0xfffffa40;
*(int64_t *)(result + 8) = 0;
*(int32_t *)(result + 16) = 0;
return result;
It seems to me that retdec can't detect the reference to the object and uses absolute offsets instead. I'm quite new to the project, so I'm not sure where to start (and I could have missed something in the setup 😉). What do you reckon?
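As a side check, the stored constant itself looks right: evaluating the movn/movk pair by hand gives the same value as the `store i64 -4294965824` in the IR and the `-0xfffffa40` in the C output. The sketch below is only an illustration of that arithmetic; the helper names `movn` and `movk` are made up here and are not part of retdec.

```cpp
#include <cstdint>

// MOVN Xd, #imm16, lsl #shift: Xd = NOT(imm16 << shift).
constexpr std::uint64_t movn(std::uint64_t imm16, unsigned shift) {
    return ~(imm16 << shift);
}

// MOVK Xd, #imm16, lsl #shift: overwrite one 16-bit field of Xd,
// leaving the other bits unchanged.
constexpr std::uint64_t movk(std::uint64_t reg, std::uint64_t imm16, unsigned shift) {
    return (reg & ~(std::uint64_t{0xFFFF} << shift)) | (imm16 << shift);
}

// movn x8, #0xffff, lsl #16 ; movk x8, #0x5c0
constexpr std::uint64_t x8 = movk(movn(0xFFFF, 16), 0x5C0, 0);

// Same bit pattern, read as a signed 64-bit integer.
constexpr std::int64_t x8_signed = static_cast<std::int64_t>(x8);

static_assert(x8 == 0xFFFFFFFF000005C0ULL, "raw register value");
static_assert(x8_signed == -4294965824LL, "matches the i64 store in the IR");
```

So the only part that looks wrong in the output is the base address, which should come from x0 rather than from the undefined function.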
Similar to #750.
__decompiler_undefined_function_<N> will be properly removed when I merge a branch I'm working on.
As for the other stuff: are these functions the only problem, or do you think there is some discrepancy between the produced C and the original ASM? If so, can you provide an input binary so I can have a closer look?
Actually, I don't believe the problem lies in llvmir2hll but rather in bin2llvmir. But you're right, there seems to be more than one issue. Here is a reproduction with a simpler binary: undefined_function.zip, built from this source code:
class A {
public:
int _a1;
int _a2;
A() __attribute__ ((noinline)) {
_a1=1;
_a2=2;
};
int get() __attribute__ ((noinline)) {
return _a1;
}
};
int main() {
A a;
return a.get();
}
clang++ -g -fPIE -fexceptions -fuse-ld=lld -O3 -mtune=cortex-a53 -target aarch64-none-linux-gnu -nostdlib -nostdlibinc -Wno-unused-command-line-argument -std=c++17 -nodefaultlibs -nostdinc++ -o undefined_function undefined_function.cpp
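For comparison, here is a hand-written sketch of what an accurate lifting of this test case should express once the object pointer is treated as an explicit first argument (passed in x0). This is not retdec output; `A_raw`, `A_ctor`, `A_get` and `lifted_main` are hypothetical names used only for illustration.

```cpp
#include <cstdint>

// Hypothetical flat view of class A: two 32-bit ints at offsets 0 and 4.
struct A_raw {
    std::int32_t a1;
    std::int32_t a2;
};

// A::A() with `this` made explicit: the stores go through the argument,
// not through the result of an undefined function.
void A_ctor(A_raw *self) {
    self->a1 = 1;
    self->a2 = 2;
}

// A::get() with `this` made explicit: a single load from offset 0.
std::int32_t A_get(const A_raw *self) {
    return self->a1;
}

// Equivalent of main(): construct on the stack, then read back _a1.
int lifted_main() {
    A_raw a;
    A_ctor(&a);
    return A_get(&a);
}
```

Any lifted IR in which A::get loses its parameter or its load cannot be equivalent to this.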
If you look at the lifted IR for A::get, you'll see that it's not accurate. By playing around with the bin2llvmir parameters, I could get it to work by deactivating the following options:
-value-protect, was introducing undefined_functions
-inst-opt-rda, was removing function parameters
-constants, was skipping a load
Worked perfectly, thanks a lot!
I had to go to bin/retdec-config.py to edit the setting.