C++ export variables

Open bsauce opened this issue 5 years ago • 7 comments

Problem: When I use mcsema to recompile C++ programs, the recompiled binary crashes at the start of the ‘main’ function. I found that mcsema moves exported variables out of the ‘.bss’ segment into another segment and initializes them there, but the lifted program still refers to the original variables in the ‘.bss’ segment. It crashes because those variables are never initialized. I think we could fix this by making the recompiled program refer to the real exported variables. Do you have any suggestions?

bsauce avatar Jul 26 '19 04:07 bsauce

crashed_binary.zip

In the new binary, it crashes at 0x42D4D3 while reading 0x692970, which has not been initialized. In the original binary, the corresponding address is 0x608570, and it is initialized correctly.

And this is the bitcode file: bitcode_file.zip

I found that the reference to 0x692970 should instead be 0x692320 + 240 (std::cout + 240). There are too many references to 0x692970 for me to patch them by hand.
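
The arithmetic can be checked with a short Python sketch. The addresses come from this report; the helper name rebase is hypothetical and not part of mcsema, and the original start of std::cout is inferred by subtracting the offset from 0x608570 rather than stated anywhere above.

```python
# Addresses taken from this report; everything else is illustrative.
ORIG_REF = 0x608570        # address read in the original binary
COUT_OFFSET = 240          # offset into std::cout (0xF0)
NEW_COUT = 0x692320        # std::cout in the recompiled binary
NEW_BAD_REF = 0x692970     # address the recompiled binary actually reads

# Inferred, not stated in the report: where std::cout starts in the original.
ORIG_COUT = ORIG_REF - COUT_OFFSET


def rebase(addr, old_base, new_base):
    """Hypothetical helper: keep the offset from old_base, apply it to new_base."""
    return new_base + (addr - old_base)


expected_ref = rebase(ORIG_REF, ORIG_COUT, NEW_COUT)
print(hex(ORIG_COUT))               # 0x608480
print(hex(expected_ref))            # 0x692410
print(expected_ref == NEW_BAD_REF)  # False
```

The mismatch in the last line is the symptom described here: the lifted code reads a relocated ‘.bss’ slot instead of an offset into the exported std::cout.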

bsauce avatar Jul 26 '19 04:07 bsauce

I tried changing get_cfg.py. I added structs with the true sizes of the exported variables and attached them to those variables, hoping the exported variables would then be recognized automatically. I got that working, but the recompiled binary still crashes as before.

get_cfg_new.zip

I have three other ideas for solving this problem. First, every time I come across an address, I could check whether it falls within an exported variable's struct, such as ‘cout’. Second, could you merge the external variable and the internal variable, so that the exported variable still points into the ‘bss’ segment and gets initialized correctly? Third, is it possible to recompile just a single function rather than the whole file?
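
A minimal sketch of the first idea, written as standalone Python rather than against the real get_cfg.py. The table EXPORTED_VARS and the function resolve_export are hypothetical names; the start address of std::cout is inferred from this thread (0x608570 - 0xF0), and the size 0x110 is only an assumption for illustration that would need to come from the symbol table.

```python
import bisect

# Hypothetical table of exported variables: (start address, size, name).
# Start address inferred from this thread; size is an assumed placeholder.
EXPORTED_VARS = sorted([
    (0x608480, 0x110, "std::cout"),
])
_STARTS = [start for start, _, _ in EXPORTED_VARS]


def resolve_export(addr):
    """Return (name, offset) if addr lies inside an exported variable, else None."""
    # Find the last variable whose start address is <= addr.
    i = bisect.bisect_right(_STARTS, addr) - 1
    if i < 0:
        return None
    start, size, name = EXPORTED_VARS[i]
    if addr < start + size:
        return name, addr - start
    return None


print(resolve_export(0x608570))  # ('std::cout', 240)
print(resolve_export(0x608590))  # None (one byte past the assumed end)
```

With a lookup like this, a reference such as 0x608570 could be emitted as std::cout + 240 instead of as a separate ‘bss’ variable.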

bsauce avatar Jul 30 '19 03:07 bsauce

So the export issue you're talking about has been a thing that we've encountered before. I saw @pgoodman already tagged @kumarak on this; maybe they can remember what our resolution was last time. I know we had this problem with something as simple as a C++-based hello world but somehow it got resolved.

Speaking of hello world, does a simple std::cout << "Hello World\n"; lift for you?

What CFG recovery frontend are you using? IDA?

As for the size of export variables: That was also something we've encountered in the past and I thought fixed, but clearly there are still issues.

artemdinaburg avatar Jul 30 '19 14:07 artemdinaburg

The issue is with the reference to std::cout. IDA is not able to correctly resolve the reference to 0x608570 (std::cout + 0xF0). Instead, it is lifted as part of a bss segment variable. That bss segment variable is also not lazily initialized, which is causing the crash.

04018C6                 sub     rsp, 28h
04018CA                 mov     rdi, cs:qword_608570
04018D1                 mov     rax, [rdi]

kumarak avatar Jul 30 '19 16:07 kumarak

I use IDA 7.2.

bsauce avatar Jul 31 '19 01:07 bsauce

Yes, that is exactly what happened. I have three other ideas for solving this problem. First, every time I come across an address, I could check whether it falls within an exported variable's struct, such as ‘cout’. Second, could you merge the external variable and the internal variable, so that the exported variable still points into the ‘bss’ segment and gets initialized correctly? Third, is it possible to recompile just a single function rather than the whole file? I am still working through the source code of "get_cfg.py".

bsauce avatar Jul 31 '19 01:07 bsauce

@bsauce, a better solution would be to resolve each address to the exported variable as you encounter it. Having multiple copies of the same variable (initializing bss with the exported variable) and accessing them differently might go wrong if they are referenced indirectly and through different offsets.
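
To illustrate the risk with two copies, here is a toy Python model, not mcsema code. The flat MEM buffer and the EXPORTED and BSS_COPY offsets are made up for the example; it only shows that a one-time copy stops tracking the real variable once either side is written.

```python
# Toy model: a flat "memory" holding two copies of the same 8-byte variable.
MEM = bytearray(32)
EXPORTED = 0    # copy the runtime initializes (the real exported variable)
BSS_COPY = 16   # copy the lifted code keeps referencing


def write_u64(addr, value):
    """Store an unsigned 64-bit little-endian value at addr."""
    MEM[addr:addr + 8] = value.to_bytes(8, "little")


def read_u64(addr):
    """Load an unsigned 64-bit little-endian value from addr."""
    return int.from_bytes(MEM[addr:addr + 8], "little")


# Seed the bss copy from the exported variable once.
write_u64(EXPORTED, 0x1111)
write_u64(BSS_COPY, read_u64(EXPORTED))

# A later update through the exported symbol is not seen through the copy.
write_u64(EXPORTED, 0x2222)
print(hex(read_u64(EXPORTED)))  # 0x2222
print(hex(read_u64(BSS_COPY)))  # 0x1111
```

Resolving every reference to the single exported variable avoids this divergence, because there is then only one location to read and write.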

kumarak avatar Jul 31 '19 22:07 kumarak