capa
IDA backend: include api features of renamed calls
Summary
After renaming dynamic calls in IDA, the backend should emit the respective API calls.
Most often the dynamic calls will be to global addresses, but calls via registers or even local variables are also possible.
Motivation
Malware often contains obfuscated API calls as shown below.
Before
After
Describe alternatives you've considered
Alternatively, an additional analysis engine could try to automatically recover/identify API calls, e.g. using emulation. This could then work in the standalone version as well.
+1 for this feature. I'm analyzing some shellcode that dynamically resolves all API calls and stores the addresses in a large structure. We should investigate solutions to include these sorts of API calls as well.
Maybe we can adapt the code here -> https://github.com/arizvisa/ida-minsc/blob/master/base/instruction.py#L819 to extract API calls from user-defined structures.
After doing additional research it appears that attempting to pull structure member names from a user-defined structure using IDAPython can get messy real quick - especially when working with large, nested structures.
A simpler solution may be to check if a `call`/`jmp` references a structure offset using something like `idaapi.is_stroff` and then parse the structure member using a regular expression, something like `(call|jmp)\s+\[.+\s*\+\s*(.+)\]`.
I don't like the idea of parsing the disassembly like this, but it's the simplest solution and may require less overhead than attempting to go through IDAPython.
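A rough sketch of the regex approach, using the pattern suggested above on plain disassembly text. The helper name `extract_member_name` and the sample operands are made up for illustration. In IDA the line would come from the disassembly listing, and the caller would presumably check `idaapi.is_stroff` first, since the pattern also matches plain numeric offsets like `[ebx+10h]`. It also won't match operands with a size prefix such as `dword ptr [...]`.

```python
import re

# Pattern from the comment above: call/jmp through [base + something].
STROFF_CALL = re.compile(r"(call|jmp)\s+\[.+\s*\+\s*(.+)\]")


def extract_member_name(disasm_line):
    """Return the trailing struct member name of an indirect call/jmp, or None.

    Hypothetical helper: assumes the operand was already confirmed to be a
    structure offset (e.g. via idaapi.is_stroff) before being passed in.
    """
    match = STROFF_CALL.search(disasm_line)
    if match is None:
        return None
    operand = match.group(2).strip()
    # Nested structures render as "outer.inner.Member"; keep the last part.
    return operand.rsplit(".", 1)[-1]


if __name__ == "__main__":
    for line in (
        "call    [ebx+api_table.GetMessageW]",
        "jmp     [esi+apis.kernel32.CreateFileA]",
        "call    eax",
    ):
        print(line, "->", extract_member_name(line))
```

Taking only the last dotted component keeps nested structures simple, at the cost of dropping any DLL hint encoded in the intermediate member names.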
Another problem to tackle - how do we map something like `GetMessageW` to `user32.GetMessageW`? Most capa rules specify the DLL name as part of the rule, which means we can't match with just `GetMessageW`.
One option is to only support specific annotation formats, e.g. `WIN.api.user32_GetMessageW`.
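To make the annotation idea concrete, here is a minimal sketch that turns a name in the `WIN.api.<dll>_<function>` form into the `dll.function` form used in rules. The function name `annotation_to_api` is hypothetical, and the sketch assumes the DLL part contains no underscores and should be lowercased. Neither assumption is settled in this thread.

```python
import re

# Assumed format: WIN.api.<dll>_<function>, e.g. WIN.api.user32_GetMessageW.
# The DLL part is assumed to be alphanumeric only (no underscores).
ANNOTATION = re.compile(r"^WIN\.api\.(?P<dll>[A-Za-z0-9]+)_(?P<name>\w+)$")


def annotation_to_api(symbol):
    """Map an annotated name to 'dll.Function', or None if it doesn't fit."""
    match = ANNOTATION.match(symbol)
    if match is None:
        return None
    # Lowercasing the DLL name is an assumption about how rules spell it.
    return "%s.%s" % (match.group("dll").lower(), match.group("name"))


if __name__ == "__main__":
    print(annotation_to_api("WIN.api.user32_GetMessageW"))
    print(annotation_to_api("GetMessageW"))
```

Names that don't follow the format return `None`, so a bare `GetMessageW` would still need some other way to resolve its DLL.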