rellic
Handle switch in CreateEdgeCond
In https://github.com/lifting-bits/rellic/blob/master/rellic/AST/GenerateAST.cpp#L135, SwitchInst
terminators are not supported.
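As a rough illustration of what an edge condition for a switch would have to express: the edge to a case successor is taken when the switched value equals one of that successor's case constants, and the edge to the default successor is taken when no case constant matches. The sketch below is a simplified stand-alone model in C, not rellic or LLVM code; the names switch_case, switch_term, and edge_cond are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified model of a switch terminator:
   each case maps a constant to a successor block id. */
struct switch_case {
    uint32_t value;
    int successor;
};

struct switch_term {
    const struct switch_case *cases;
    size_t num_cases;
    int default_successor;
};

/* Evaluate the condition of the edge from the switch to `successor`
   for a concrete switched value `cond`. */
bool edge_cond(const struct switch_term *sw, int successor, uint32_t cond) {
    bool any_case_matches = false;  /* cond equals some case constant */
    bool edge_case_matches = false; /* ...and that case targets `successor` */
    for (size_t i = 0; i < sw->num_cases; ++i) {
        if (cond == sw->cases[i].value) {
            any_case_matches = true;
            if (sw->cases[i].successor == successor) {
                edge_case_matches = true;
            }
        }
    }
    if (edge_case_matches) {
        return true;
    }
    /* The default edge is the negation of all case comparisons. */
    return successor == sw->default_successor && !any_case_matches;
}
```

In a decompiler the same structure would presumably be built symbolically (a disjunction of equality expressions per case edge, and a negated disjunction for the default edge) rather than evaluated on a concrete value.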
Compile the following with remill-clang-4.0 -emit-llvm -O3 -c -o example.bc
and decompile:
#include <stdint.h>

uint32_t target(uint32_t n) {
    uint32_t mod = n % 4;
    uint32_t result = 0;
    if (mod == 0) {
        result = (n | 0xbaaad0bf) * (2 ^ n);
    } else if (mod == 1) {
        result = (n & 0xbaaad0bf) * (3 + n);
    } else if (mod == 2) {
        result = (n ^ 0xbaaad0bf) * (4 | n);
    } else {
        result = (n + 0xbaaad0bf) * (5 & n);
    }
    return result;
}
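For readers wondering where the switch comes from: the source has no switch statement, but the optimizer presumably folds the if/else chain on the same value into a single switch terminator, which is what the failure names. A hand-written C equivalent of that shape might look like the following; the name target_switch is hypothetical, and the exact IR clang emits may differ.

```c
#include <stdint.h>

/* Hand-written equivalent of target(), expressed as one switch on n % 4.
   Illustrative only; not taken from the compiler's output. */
uint32_t target_switch(uint32_t n) {
    uint32_t result;
    switch (n % 4) {
    case 0:
        result = (n | 0xbaaad0bf) * (2 ^ n);
        break;
    case 1:
        result = (n & 0xbaaad0bf) * (3 + n);
        break;
    case 2:
        result = (n ^ 0xbaaad0bf) * (4 | n);
        break;
    default: /* n % 4 == 3 */
        result = (n + 0xbaaad0bf) * (5 & n);
        break;
    }
    return result;
}
```

Note that ^ is bitwise XOR in C, so 2 ^ n is not exponentiation, and all arithmetic wraps modulo 2^32 because the operands are unsigned.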
You will see something similar to (instruction print was added):
F1110 21:21:03.700402 59636 GenerateAST.cpp:159] Unknown terminator instruction: switch
*** Check failure stack trace: ***
@ 0x1b4733d google::LogMessage::Fail()
@ 0x1b49834 google::LogMessage::SendToLog()
@ 0x1b46dbb google::LogMessage::Flush()
@ 0x1b4a459 google::LogMessageFatal::~LogMessageFatal()
@ 0x7c039d rellic::GenerateAST::CreateEdgeCond()
SIGABRT (Abort)
Would be willing to work on this, with some guidance.
A question about the relevance of this: is this tool intended only to decompile lifted bytecode (as produced by anvill), or is it also considered important to handle switch statements as generated by the clang compiler? (This assumes that anvill and remill do not themselves lift optimized switch statements, but that the LLVM optimizer could still produce a switch instruction. Is this correct?)
I think it is important to be able to handle as much of LLVM IR as possible. That means lifted bytecode (anvill, remill, mcsema) as well as bytecode compiled by clang. The readability of the decompiled output will of course vary wildly.
As far as switch statements in anvill or remill lifted bytecode go, I honestly don't know if there are cases where they'll appear.
Of course, any and all help is appreciated :)
Switch statements in LLVM IR will definitely show up.
Once we add jump table support to anvill, we will need to support switch instructions. It's also possible that they will be synthesized without our knowledge by LLVM's optimizations. McSema already produces switch instructions.
This has been partially solved by #106. Keeping this open though.