miasm
Simplification passes for ExprAssign
Hi!
I'm preparing a PR. For this, I have to apply simplification rules for ExprAssign which have to perform different transformations for src and dst.
ira_cfg = ira.new_ircfg_from_asmcfg(asm_cfg)
ira_cfg.simplify(expr_simp_high_to_explicit)
expr_simp = ExpressionSimplifier()
expr_simp.enable_passes({ExprAssign: [my_simp]})
ira_cfg.simplify(expr_simp)
Since simplify from AssignBlock applies the same operation to src and dst:
for dst, src in viewitems(self):
    if dst == src:
        continue
    new_src = simplifier(src)
    new_dst = simplifier(dst)
I patched it as follows:
for dst, src in viewitems(self):
    if my_simp_flag:
        e = self.dst2ExprAssign(dst)
        rewritten = simplifier(e)
        new_src = rewritten.src
        new_dst = rewritten.dst
    else:
        if dst == src:
            continue
        new_src = simplifier(src)
        new_dst = simplifier(dst)
Obviously, this is not how it should be done. What do you think would be a good way to apply this?
Hello,
Hum, for now, I would rather explicitly call the simplifier in your script, instead of modifying AssignBlock.simplify in Miasm.
We have this kind of code in several places in Miasm, like here: https://github.com/cea-sec/miasm/blob/master/miasm/analysis/outofssa.py#L383
Hi!
I could do this, but in this case I think I would also have to modify/recreate all Assign/IR blocks in ira_cfg manually if I want to perform further analysis?
I am currently looking for a clean solution, since I'm planning to introduce a new PR that requires performing simplifications of ExprAssign on the graph level in a first step before applying SSA.
Ok. Just some remarks:
For us, ExprAssign is a weird word in Miasm, as it's more a statement than a right/left value.
But as it belongs to the Expr class for now, maybe we can consider that the expression simplifier can deal with it.
The patch for this may be a small one: the ExprAssign case has to be added to the expression simplifier cases.
But it may trigger some new behavior:
- For now, replace_expr is coded as a visitor. If we call replace_expr on an ExprAssign, both the left value and the right value will be modified. The problem here is that we may want to replace only the sources of the expressions. For example, if we do constant propagation, let's say in:
@32[EAX] = @32[EAX] + 1
and let's say we have concluded in a previous analysis that @32[EAX] can be replaced by 0x1337BEEF. Here we clearly want replace_expr on the ExprAssign to give:
@32[EAX] = 0x1337BEEF + 1
and not:
0x1337BEEF = 0x1337BEEF + 1
So the conclusion may be that we have to either
- modify replace_expr,
- or add a new API, something like replace_right_values and replace_left_values,
which takes this problem into account.
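The difference between the two behaviours can be sketched on a toy expression type. This is only an illustration: the classes below are stand-ins written for this example and are not Miasm's Expr classes, and replace_right_values is the hypothetical API name proposed above, not an existing Miasm function.

```python
from dataclasses import dataclass

# Toy stand-ins for expressions; these are NOT the Miasm classes.
@dataclass(frozen=True)
class Reg:
    name: str

@dataclass(frozen=True)
class Int:
    value: int

@dataclass(frozen=True)
class Mem:
    ptr: object
    size: int

@dataclass(frozen=True)
class Add:
    left: object
    right: object

@dataclass(frozen=True)
class Assign:
    dst: object
    src: object


def replace_all(expr, table):
    """Visitor-style replacement: rewrites every matching sub-expression."""
    if expr in table:
        return table[expr]
    if isinstance(expr, Mem):
        return Mem(replace_all(expr.ptr, table), expr.size)
    if isinstance(expr, Add):
        return Add(replace_all(expr.left, table), replace_all(expr.right, table))
    if isinstance(expr, Assign):
        return Assign(replace_all(expr.dst, table), replace_all(expr.src, table))
    return expr


def replace_right_values(assign, table):
    """Hypothetical API: rewrite only the source of an assignment."""
    return Assign(assign.dst, replace_all(assign.src, table))


mem = Mem(Reg("EAX"), 32)
stmt = Assign(mem, Add(mem, Int(1)))        # @32[EAX] = @32[EAX] + 1
table = {mem: Int(0x1337BEEF)}

naive = replace_all(stmt, table)             # also replaces the destination
careful = replace_right_values(stmt, table)  # destination left untouched
```

Here naive corresponds to the unwanted "0x1337BEEF = 0x1337BEEF + 1", while careful keeps @32[EAX] as the destination.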
For me, I think we have to take this problem into account, and maybe the second solution is the good one. Today, we are using replace_expr and trying to twist its behavior to match our goal, but the real solution should be to have explicit and clear APIs for this. Also, it will make clearer what in Miasm is a right/left value, which seems a good point to me :smile:
What do you think about this?
Hi!
I think both approaches have advantages and disadvantages. In the short term, introducing replace_right_values and replace_left_values is certainly far more feasible (perhaps simplify_lhs and simplify_rhs are better wordings?). However, in the long term this is not the ideal solution in terms of clean code and unnecessary computations.
Let's take for instance the following:
ira_cfg.simplify_lhs(expr_simp_lhs)
ira_cfg.simplify_rhs(expr_simp_rhs)
Let's assume simplify_lhs and simplify_rhs look as follows:
def simplify_lhs(self, simplifier):
    """
    Return a new AssignBlock with expression simplified
    @simplifier: ExpressionSimplifier instance
    """
    new_assignblk = {}
    for dst, src in viewitems(self):
        new_dst = simplifier(dst)
        new_assignblk[new_dst] = src
    return AssignBlock(irs=new_assignblk, instr=self.instr)
In these cases, we iterate over all IR instructions and regenerate all AssignBlocks twice. Assuming that the expression simplifier is able to handle an ExprAssign (where we can define custom passes for the left and the right side), this would not be the case. However, far more code would have to be changed.
Hi @mrphrazer ,
In fact I was not talking about simplification rules, but about replace_expr. I agree with you about the double creation of assignment blocks. But maybe we can have something like:
replace_expr(left_tokens_replacement, right_tokens_replacement)
In this function we could manage left and right simultaneously, which will involve only one creation of basic block.
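A minimal sketch of that idea, assuming toy tuple-based expressions and a plain dict in place of an assignment block. The function name replace_expr_split and both parameter names are hypothetical and only illustrate handling left and right sides in a single loop; nothing here is Miasm's actual API.

```python
def substitute(expr, table):
    """Replace every sub-expression found in `table` (toy tuple expressions)."""
    if expr in table:
        return table[expr]
    if isinstance(expr, tuple):
        # Keep the tag (first element) and recurse into the operands.
        return (expr[0],) + tuple(substitute(e, table) for e in expr[1:])
    return expr


def replace_expr_split(assignblk, left_table, right_table):
    """Hypothetical one-pass variant: separate tables for dst and src."""
    new_blk = {}
    for dst, src in assignblk.items():
        new_blk[substitute(dst, left_table)] = substitute(src, right_table)
    return new_blk


EAX = ("reg", "EAX")
EBX = ("reg", "EBX")
mem = ("mem", EAX, 32)

# @32[EAX] = @32[EAX] + 1 ; EBX = @32[EAX]
blk = {mem: ("add", mem, ("int", 1)), EBX: mem}

out = replace_expr_split(
    blk,
    left_table={},                          # leave destinations alone
    right_table={mem: ("int", 0x1337BEEF)}, # propagate the constant into sources
)
```

The block is walked once and rebuilt once, with each side consulting its own table.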
But I am curious about one thing: do you have an example of reduction rules which may be applied to the right side of an expression and which should not be applied to the left one? (OK, let's say replacement is a case apart.)
Hi! Perhaps it is better to submit the PR first and discuss the details afterwards. Otherwise it might (or will) not make any sense to you.
Give me a few days, then I will take up the discussion here again.
This was quicker than intended:
See PR #1021 for more details.
The simplification pass for ExprAssign is required to rewrite memory expressions as follows:
# ebx = @32[eax]
ebx = mem_read(M, eax, 32)
# @32[eax] = ebx
M = mem_write(M, eax, ebx, 32)
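To make the asymmetry concrete, here is a rough sketch of such a rewrite on toy tuple expressions. It is not the code from the PR and does not use Miasm's classes; the tuple encoding, the symbolic memory variable M, and the helper names are assumptions made for this illustration. It shows why the pass needs the whole assignment: a memory access in the source becomes a mem_read, while a memory access as destination turns the entire statement into an update of M.

```python
M = ("reg", "M")  # hypothetical symbolic variable standing for the whole memory


def read_mems(expr):
    """Rewrite every memory access inside a source expression into mem_read."""
    if not isinstance(expr, tuple):
        return expr
    if expr[0] == "mem":
        _, ptr, size = expr
        return ("mem_read", M, read_mems(ptr), size)
    return (expr[0],) + tuple(read_mems(e) for e in expr[1:])


def rewrite_assign(dst, src):
    """Rewrite one assignment; the destination decides the shape of the result."""
    new_src = read_mems(src)
    if dst[0] == "mem":
        _, ptr, size = dst
        # The address of a store is itself read, so it goes through read_mems.
        return M, ("mem_write", M, read_mems(ptr), new_src, size)
    return dst, new_src


EAX = ("reg", "eax")
EBX = ("reg", "ebx")

load = rewrite_assign(EBX, ("mem", EAX, 32))   # ebx = @32[eax]
store = rewrite_assign(("mem", EAX, 32), EBX)  # @32[eax] = ebx
```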
Do you have any suggestions how we could implement this in a clean manner?