binaryninja-api
binaryninja-api copied to clipboard
Rust: refactor MLIL and HLIL instructions
This took like a week but the end result I think is a lot more idiomatic and concise.
Here's a summary of changes:
- Change
MediumLevelILInstruction
into a struct withfunction
,address
, andkind
fields. This factors the function and address fields out of the enum variants. The same thing applies toHighLevelILInstruction
. - Implement lifting inside the
{Medium,High}LevelILInstruction::lift
method rather than deferring tolift
methods on each variant. - Remove field getters on each variant and make their fields public. The motivation here is that these getters would perform lifting of their corresponding field, which means using them invokes a kind of "partial" lifting which feels weird. Better to just go through the
{Medium,High}LevelILLiftedInstruction
structs.- Some of the fields were tuples that I expanded out into named fields. For example, list metadata fields included the index of the first element as well as the length. I did my best to give some good names, sticking to the
num_*
andfirst_*
template, but probably some could be named something better.
- Some of the fields were tuples that I expanded out into named fields. For example, list metadata fields included the index of the first element as well as the length. I did my best to give some good names, sticking to the
- Move the
operands
method onto the lifted instructions, as it called the now-removed field getters. Also, it returns an ownedVec
of operands, each of which also own their data. This introduces some extra clones but seems a bit easier to handle for users, and avoids exposing the customOperandIter
type publicly. - Instantiate instructions directly from the list of raw operands rather than using
new
methods. Then, delete the constructors. In the end, this letsoperation.rs
just be a big collection of struct definitions. - The
OperandList
type was duplicated between MLIL and HLIL, so I made a generic version which can hold either type of function. Also, the implementation ofExactSizeIterator
for the type was wrong because the value ofself.remaining
was being decreased 4 at a time and would be out of sync with the actual number of values given out.
Some miscellaneous changes:
- Rename the
ConstantData
instruction variants to justConstData
to avoid confusion with thetypes::ConstantData
struct. - Add
LiftedJump
andLiftedLabel
types for HLIL. These were removed halfway through the original HLIL PR but I don't know why.
One solution is to remove the non-lifted versions and just have a single type of instruction. Having a single Kind
enum then won't be much of an issue. It seemed in the original PR there was a want for an easy and fast version of the IL instructions (hence the non-lifted versions, which also implement Copy
), so it's worth considering how much of a performance tradeoff working with just lifted instructions really is. Some of the variants are recursive and store whole instructions within them, and I don't know how large a single instruction can get, or how many sub-nodes it can contain. However, it's worth noting that the Python bindings only deal with fully-lifted instructions.
As for your first concern, the HighLevelILInstruction::new
constructor is pub(crate)
and so the change to take Ref
directly is purely internal. Is this what you're referring to, or are there places where calling .to_owned()
is now necessary?
EDIT: Also worth noting that HighLevelILOperand
no longer exists, it's only HighLevelILLiftedOperand
now (likewise for MLIL). I debated keeping the name the same, but felt there should be a clear separation between the "raw" unlifted instructions and the fully lifted ones.
I missed that HighLevelILInstruction::new
is crate-internal. You can disregard that comment. As for removing Lifted
or non-lifted, perhaps we should reconsider that later. This PR should be good as-is regardless if we decide to cull one of those later.
I'll leave it to @plafosse to decide if we should merge this before code freeze.
Let me know if I should squash the extra bugfix commits.
Thank you again for this awesome PR! Sorry for the delay on getting it merged.