mcsema
C++ export variables
Problem: When I use mcsema to re-compile C++ programs, the re-compiled binary crashes at the start of the ‘main’ function. I found that you move export variables from the ‘.bss’ segment to another segment and initialize them there, but the re-compiled program still refers to the original variables in the ‘.bss’ segment. It crashes because that variable is never initialized. I think we can fix it by making the re-compiled program refer to the real export variable. Do you have any suggestions?
In the new binary, it crashes at 0x42D4D3 while reading 0x692970, which has not been initialized. In the original binary, the corresponding address is 0x608570, and it is initialized correctly.
And this is the bitcode file: bitcode_file.zip
I found that the reference to 0x692970 should instead be 0x692320 + 240 (std::cout + 240). There are too many references to 0x692970 for me to patch by hand.
I tried changing get_cfg.py. I added a struct with the true size of each export variable and attached it to that variable, hoping the export variables would then be recognized automatically. I got that working, but the re-compiled binary still crashes as before.
I have three other ideas for solving this problem. First, every time I come across an address, I could check whether it falls within an export variable struct such as ‘cout’. Second, could you merge external and internal variables, so that the export variable still points into the ‘bss’ segment and gets initialized correctly? Third, is it possible to re-compile just one function rather than the whole file?
So the export issue you're talking about has been a thing that we've encountered before. I saw @pgoodman already tagged @kumarak on this; maybe they can remember what our resolution was last time. I know we had this problem with something as simple as a C++-based hello world but somehow it got resolved.
Speaking of hello world, does a simple std::cout << "Hello World\n"; lift for you?
What CFG recovery frontend are you using? IDA?
As for the size of export variables: That was also something we've encountered in the past and I thought fixed, but clearly there are still issues.
The issue is with the reference to std::cout. IDA is not able to correctly resolve the reference to 0x608570 (std::cout + 0xF0). Instead, it is lifted as part of a bss segment variable. That bss segment variable is also not lazily initialized, which is causing the crash.
04018C6 sub rsp, 28h
04018CA mov rdi, cs:qword_608570
04018D1 mov rax, [rdi]
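The address arithmetic in this thread can be checked with a short Python sketch. The constants are the ones quoted in the comments above; the variable names are made up for illustration, and the original std::cout base is derived from the statement that 0x608570 is std::cout + 0xF0.

```python
# Addresses quoted in this thread; the names are illustrative only.
ORIG_REF = 0x608570       # address read by `mov rdi, cs:qword_608570` in the original binary
COUT_OFFSET = 0xF0        # 240: offset of that address inside std::cout
NEW_COUT_BASE = 0x692320  # std::cout in the re-compiled binary
NEW_REF = 0x692970        # address the re-compiled binary actually reads

# Derived: where std::cout starts in the original binary.
orig_cout_base = ORIG_REF - COUT_OFFSET

# Where the re-compiled code should read if the reference followed std::cout.
expected_new_ref = NEW_COUT_BASE + COUT_OFFSET

print(hex(orig_cout_base))          # 0x608480
print(hex(expected_new_ref))        # 0x692410
print(NEW_REF == expected_new_ref)  # False: the lifted reference points elsewhere
```

The mismatch on the last line is the symptom described above: the reference was relocated along with the rest of bss instead of following std::cout.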
I use IDA 7.2.
Yes, this is exactly what happened. I have three other ideas for solving this problem. First, every time I come across an address, I could check whether it falls within an export variable struct such as ‘cout’. Second, could you merge external and internal variables, so that the export variable still points into the ‘bss’ segment and gets initialized correctly? Third, is it possible to re-compile just one function rather than the whole file? I am still working through the source code of get_cfg.py.
@bsauce, a better solution would be to resolve each address to the export variable it belongs to as you encounter it. Keeping multiple copies of the same variable (initializing bss with the exported variable) and accessing them differently might go wrong if they are referenced indirectly and through different offsets.
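Resolving an address to the export variable that contains it could look roughly like the following Python sketch. This is not code from get_cfg.py: the ExportVar type, the build_index and resolve helpers, and the 0x110 size given for std::cout are all assumptions made for illustration, and real sizes would have to come from the binary's symbol table.

```python
import bisect
from typing import List, NamedTuple, Optional, Tuple

class ExportVar(NamedTuple):
    """Hypothetical record for one exported variable."""
    name: str
    start: int  # address of the variable in the original binary
    size: int   # size in bytes (assumed here, normally read from the symbol table)

def build_index(exports: List[ExportVar]) -> Tuple[List[int], List[ExportVar]]:
    """Sort the variables by start address so lookups can use binary search."""
    ordered = sorted(exports, key=lambda v: v.start)
    return [v.start for v in ordered], ordered

def resolve(index: Tuple[List[int], List[ExportVar]], addr: int) -> Optional[Tuple[str, int]]:
    """Return (name, offset) if addr lies inside an export variable, else None."""
    starts, ordered = index
    i = bisect.bisect_right(starts, addr) - 1  # last variable starting at or before addr
    if i < 0:
        return None
    var = ordered[i]
    offset = addr - var.start
    if offset >= var.size:
        return None  # addr is past the end of the nearest variable
    return var.name, offset

# 0x608480 is derived from "0x608570 is std::cout + 0xF0"; 0x110 is an assumed size.
index = build_index([ExportVar("std::cout", 0x608480, 0x110)])
print(resolve(index, 0x608570))  # ('std::cout', 240)
print(resolve(index, 0x608590))  # None
```

With a lookup like this, a reference to 0x608570 could be emitted as std::cout + 0xF0 instead of as a separate bss variable, so only one copy of the variable would exist.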