anvill
anvill forges beautiful LLVM bitcode out of raw machine code
At least on x86, when we see code using `fsbase` or `gsbase`, we (as of #216) lift it using LLVM's address space feature. This ends up producing code that looks...
We need a simple tool to do basic sanity checking of emitted JSON specs. Things like:
* Register names are sane.
* Addresses match expected ranges.
* All known, named...
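A minimal sketch of what such a checker could look like, assuming the spec is plain JSON in which register references appear under a `register` key (as in the `return_stack_pointer` example elsewhere on this page). The `address` key, the `KNOWN_REGISTERS` set, and the `check_spec` function are hypothetical names chosen for illustration, not part of anvill.

```python
import json

# Hypothetical register list; a real checker would derive it per architecture.
KNOWN_REGISTERS = {"EAX", "EBX", "ECX", "EDX", "ESI", "EDI", "EBP", "ESP"}


def check_spec(spec, valid_ranges, known_registers=KNOWN_REGISTERS):
    """Return a list of human-readable problems found in a parsed spec.

    valid_ranges is a list of half-open (low, high) address intervals.
    """
    problems = []

    def walk(node, path):
        if isinstance(node, dict):
            for key, value in node.items():
                here = f"{path}.{key}" if path else key
                if key == "register" and value not in known_registers:
                    problems.append(f"{here}: unknown register {value!r}")
                elif key == "address":  # assumed field name
                    if isinstance(value, bool) or not isinstance(value, int):
                        problems.append(f"{here}: address is not an integer")
                    elif not any(lo <= value < hi for lo, hi in valid_ranges):
                        problems.append(f"{here}: address {value:#x} out of range")
                walk(value, here)
        elif isinstance(node, list):
            for index, item in enumerate(node):
                walk(item, f"{path}[{index}]")

    walk(spec, "")
    return problems


if __name__ == "__main__":
    text = """{"functions": [
        {"address": 4096,
         "return_stack_pointer": {"register": "ESP", "type": "I", "offset": 4}},
        {"address": 16,
         "return_stack_pointer": {"register": "XYZ", "type": "I", "offset": 4}}
    ]}"""
    for problem in check_spec(json.loads(text), [(0x1000, 0x2000)]):
        print(problem)
```

Walking the whole document generically, rather than hard-coding a schema, keeps the sketch usable even if the spec layout changes; a stricter tool would validate against the actual spec schema.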
There were some issues with flattening GEPs for struct, vector, and array types. I think it was a straightforward issue of checking the type in some other visitor functions, but...
Our binary package generation defaults to a specific Python version. Is there a way to make it more generic? It's fine if there is not; in that case let's officially...
We want to be able to distinguish between unsigned and signed integers when translating an LLVM type to a spec. This is nontrivial because LLVM doesn't support the notion of signedness in...
The ELF thunk recognition code of McSema should be copied and adapted for Anvill so that if a function references an ELF thunk, then we go and follow through and...
Also, basic block addresses can be observed. Also, this function uses a `retn`, which is not recognized correctly in IDA. ```json {"functions": [{"return_stack_pointer": {"register": "ESP", "type": "I", "offset": 4}, "return_values":...
These should be populatable from LLVM with debug metadata, from DWARF parsing, and from IDA's own information.
E.g. for the SysV ABI, this is often in `al`.
So if the same address has multiple names, then the second call will declare a new version of the thing. For global variables, I think you'd want to use a...