python-idb
Coding style, transparency to idapython plugins and redundant code
Hi,
I've always used python-idb for quick stuff and it works great. However, I'd like to run some IDAPython scripts without the IDA open/close bottleneck. I'm working on implementing some missing idaapi and idc routines, but I need a few clarifications before I open any pull request.
Constants coming from op_t and exposed through the ida_ua module, like o_near, o_imm, o_reg and so on, can be accessed using both idaapi and idc... in other words, a hypothetical class ida_ua would define o_imm = 5, then inside class idaapi we get o_imm = self.api.ida_ua.o_imm. Same for class idc.
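For concreteness, here is a minimal sketch of that mirroring pattern, assuming class-based shims like the ones just described; apart from o_imm = 5 and self.api.ida_ua.o_imm, the names and wiring are illustrative, not python-idb's actual code:

class ida_ua:
    # operand type constants from IDA's op_t; o_imm = 5 as in the example above
    o_reg = 1
    o_imm = 5
    o_near = 7

class idaapi:
    def __init__(self, api):
        self.api = api
        # mirror the constants so idaapi.o_imm keeps working for old scripts
        self.o_reg = self.api.ida_ua.o_reg
        self.o_imm = self.api.ida_ua.o_imm
        self.o_near = self.api.ida_ua.o_near

class idc:
    def __init__(self, api):
        self.api = api
        # and the same duplication again for idc-based scripts
        self.o_imm = self.api.ida_ua.o_imm

# wiring it together through a hypothetical container object
class API:
    def __init__(self):
        self.ida_ua = ida_ua  # the class itself carries the constants

api = API()
assert idaapi(api).o_imm == idc(api).o_imm == 5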
This is just an example for a more general question. What is the coding rule for this kind of situation? If we mirror everything—almost 1:1—from IDAPython, then we achieve full transparency to IDAPython users and scripts at the price of redundant code. On the other hand, if we want the code to be a bit more polished, we might break compatibility with existing IDAPython scripts lying around.
Well, I'm slowly realising that all this redundant disorder has been pushed one level further by the introduction of the modern IDAPython modules. Maybe it's better to stick with the modern interface and expect that IDAPython scripts get "modernised" as well.
hey @invano
yeah, you've highlighted a dark corner of this library that i've avoided thinking about too much :-) i often find myself cringing while implementing the IDAPython API, because there are things i'd like to change, but can't for API compatibility (e.g. returning None vs raising exceptions, etc).
in any case, i think we should generally try to stick to "modern" IDAPython scripting, and primarily place constants within the "most specific module" (e.g. avoid declaring them in idc). i'm ok with re-importing into idc and friends for compatibility.
the existing code probably doesn't do this too well - sorry about that!
side note: crazy idea: we should be able to programmatically enumerate all constants across all IDAPython modules and use that to generate code. maybe this is better than manually importing constants as necessary.
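To make that side note concrete, a rough sketch of such a generator follows; it would run inside a real IDA/IDAPython session, and the module list and output format are placeholder choices, not an agreed-upon design:

import importlib

# IDAPython modules to scan; extend as needed
MODULES = ["ida_ua", "ida_bytes", "ida_funcs", "ida_name"]

def dump_constants():
    # print "module.NAME = value" lines that could be pasted into, or used to
    # generate, python-idb's shims
    for modname in MODULES:
        mod = importlib.import_module(modname)
        for attr in sorted(dir(mod)):
            if attr.startswith("_"):
                continue
            value = getattr(mod, attr)
            # keep only plain integer constants, skip functions and classes
            if isinstance(value, int) and not isinstance(value, bool):
                print("%s.%s = %d" % (modname, attr, value))

dump_constants()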
Same holds for methods and so on.
I'm digging a bit inside IDAPython and it seems each module gets its own constants/methods, then there is idaapi, mostly acting as a huge wrapper around the inner modules, and finally idc, randomly wrapping stuff coming from idaapi again...
Consider get_item_end(ea) as a quick example, defined in ida_bytes but also present in idaapi and idc:
Python>print idaapi.__dict__["get_item_end"]
<function get_item_end at 0x10c685c08>
Python>print ida_bytes.__dict__["get_item_end"]
<function get_item_end at 0x10c685c08>
Python>print idc.__dict__["get_item_end"]
<function get_item_end at 0x10e11cde8>
I agree that manually importing everything would be a mess. Your crazy idea actually makes sense to me!
About modern vs. legacy API and code, there is also the API transition to IDA >= 7.0 to consider (https://www.hex-rays.com/products/ida/7.0/docs/api70_porting_guide.shtml). How do you usually handle this?
For example, I implemented ida_ua.print_operand() and ida_ua.get_operand_type(), which were previously called idc.GetOpnd() and idc.GetOpType() and are still exposed by idc for compatibility. Or idc.find_func_end(), also exposed as idc.FindFuncEnd().
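A minimal sketch of how that compatibility layering could look, following the class-style shims from earlier in the thread; the ida_ua class and its method bodies here are placeholders, not the actual python-idb implementation:

class ida_ua:
    def __init__(self, api):
        self.api = api

    def print_operand(self, ea, n):
        # placeholder: a real version would decode the instruction at ea
        # and render operand n as text
        raise NotImplementedError()

    def get_operand_type(self, ea, n):
        # placeholder: a real version would return one of the o_* constants
        raise NotImplementedError()

class idc:
    def __init__(self, api):
        self.api = api

    # modern (IDA >= 7.0) names, forwarded to ida_ua
    def print_operand(self, ea, n):
        return self.api.ida_ua.print_operand(ea, n)

    def get_operand_type(self, ea, n):
        return self.api.ida_ua.get_operand_type(ea, n)

    # pre-7.0 names kept as thin aliases for old scripts
    GetOpnd = print_operand
    GetOpType = get_operand_type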
@invano I saw your code in https://github.com/williballenthin/python-idb/compare/master...bjchan9an:master Maybe it's ready for a pull request?