Simplification passes for ExprAssign

Open mrphrazer opened this issue 5 years ago • 7 comments

Hi!

I'm preparing a PR. For this, I need to apply simplification rules to ExprAssign that perform different transformations on src and dst.

ira_cfg = ira.new_ircfg_from_asmcfg(asm_cfg)
ira_cfg.simplify(expr_simp_high_to_explicit)
expr_simp = ExpressionSimplifier()
expr_simp.enable_passes({ExprAssign: [my_simp]})
ira_cfg.simplify(expr_simp)

Since simplify from AssignBlock applies the same operation to src and dst:

for dst, src in viewitems(self):
    if dst == src:
        continue
    new_src = simplifier(src)
    new_dst = simplifier(dst)

I patched it as follows:

for dst, src in viewitems(self):
    if my_simp_flag:
        e = self.dst2ExprAssign(dst)
        rewritten = simplifier(e)
        new_src = rewritten.src
        new_dst = rewritten.dst
    else:
        if dst == src:
            continue
        new_src = simplifier(src)
        new_dst = simplifier(dst)

Obviously, this is not how it should be done. What do you think would be a good way to apply this?

mrphrazer, Mar 13 '19 18:03

Hello,

Hmm, for now, I would rather explicitly call the simplifier in your script, instead of modifying AssignBlock.simplify in Miasm. We have this kind of code in several places in Miasm, like here: https://github.com/cea-sec/miasm/blob/master/miasm/analysis/outofssa.py#L383

commial, Mar 15 '19 13:03

Hi!

I could do this, but in that case I think I would also have to manually modify or recreate all Assign/IR blocks in ira_cfg if I want to perform further analysis?

I am currently looking for a clean solution since I'm planning to introduce a new PR that requires performing simplifications of ExprAssign on the graph level in a first step before applying SSA.

mrphrazer, Mar 29 '19 23:03

Ok. Just some remarks: for us, ExprAssign is an odd member of Miasm, as it's more a statement than a right/left value. But as it belongs to the Expr class for now, maybe we can consider that the expression simplifier can deal with it. The patch for this may be a small one: the ExprAssign case has to be added to the cases handled by the expression simplifier.

But it may trigger some new behavior:

  • For now, replace_expr is coded as a visitor. If we run a replace_expr on an ExprAssign, both the left value and the right value will be modified. The problem here is that we may want to replace only the sources of the expression. Take constant propagation, for example, in:
@32[EAX] = @32[EAX] + 1

and let's say a previous analysis has concluded that @32[EAX] can be replaced by 0x1337BEEF. Here we clearly want the replace_expr on the ExprAssign to give:

@32[EAX] = 0x1337BEEF + 1

and not:

0x1337BEEF = 0x1337BEEF + 1

So the conclusion may be that we have to either

  • modify replace_expr,
  • or add a new API, something like replace_right_values and replace_left_values, which takes this problem into account.

I think we have to take this problem into account, and maybe the second solution is the right one. Today, we use replace_expr and try to twist its behavior to match our goal, but the real solution should be to have explicit and clear APIs for this. Also, it will make clearer what a right/left value is in Miasm, which seems a good point to me :smile:
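To make the difference concrete, here is a minimal sketch on a toy expression model. It does not use Miasm's classes: Id, Int, Mem and Op are hypothetical stand-ins, and this replace_right_values is only one possible reading of the proposed API. It also assumes that the address inside a memory destination counts as a right value, which the discussion above does not settle.

```python
from dataclasses import dataclass

# Toy expression model (hypothetical, not Miasm's Expr classes).
@dataclass(frozen=True)
class Id:
    name: str

@dataclass(frozen=True)
class Int:
    value: int

@dataclass(frozen=True)
class Mem:
    addr: object
    size: int

@dataclass(frozen=True)
class Op:
    op: str
    args: tuple

def replace_expr(expr, table):
    """Visitor-style replacement: rewrite every sub-expression found in table."""
    if expr in table:
        return table[expr]
    if isinstance(expr, Mem):
        return Mem(replace_expr(expr.addr, table), expr.size)
    if isinstance(expr, Op):
        return Op(expr.op, tuple(replace_expr(a, table) for a in expr.args))
    return expr

def replace_right_values(dst, src, table):
    """Replace in src only; a memory destination keeps its Mem wrapper and
    only its address (assumed here to be a right value) is rewritten."""
    new_src = replace_expr(src, table)
    if isinstance(dst, Mem):
        new_dst = Mem(replace_expr(dst.addr, table), dst.size)
    else:
        new_dst = dst
    return new_dst, new_src

# @32[EAX] = @32[EAX] + 1, with @32[EAX] known to be 0x1337BEEF
mem = Mem(Id("EAX"), 32)
src = Op("+", (mem, Int(1)))
table = {mem: Int(0x1337BEEF)}

naive = (replace_expr(mem, table), replace_expr(src, table))
careful = replace_right_values(mem, src, table)
```

Here naive rewrites the destination into the constant, which is the unwanted 0x1337BEEF = 0x1337BEEF + 1 form, while careful keeps @32[EAX] as the destination and only rewrites the source.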

What do you think about this?

serpilliere, Mar 30 '19 11:03

Hi!

I think both approaches have advantages and disadvantages. In the short term, introducing replace_right_values and replace_left_values is certainly much more feasible (perhaps simplify_lhs and simplify_rhs would be better names?). In the long term, however, it is not the ideal solution, both in terms of clean code and because of unnecessary computations.

Let's take, for instance, the following:

ira_cfg.simplify_lhs(expr_simp_lhs)
ira_cfg.simplify_rhs(expr_simp_rhs)

Let's assume simplify_lhs and simplify_rhs look as follows:

    def simplify_lhs(self, simplifier):
        """
        Return a new AssignBlock with expression simplified
        @simplifier: ExpressionSimplifier instance
        """
        new_assignblk = {}
        for dst, src in viewitems(self):
            new_dst = simplifier(dst)
            new_assignblk[new_dst] = src
        return AssignBlock(irs=new_assignblk, instr=self.instr)

In this case, we iterate over all IR instructions and regenerate all AssignBlocks twice. If the expression simplifier were able to handle an ExprAssign (for which we could define custom passes for the left and the right side), this would not be necessary. However, much more code would have to be changed.
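As a rough illustration of the cost difference, the sketch below models an assignment block as a plain dict of strings rather than Miasm's AssignBlock, and uses two trivial string rewrites in place of real simplifiers. The name simplify_sides and both toy simplifiers are hypothetical.

```python
def simplify_lhs(assignblk, simplifier):
    """Rebuild the block, simplifying destinations only."""
    return {simplifier(dst): src for dst, src in assignblk.items()}

def simplify_rhs(assignblk, simplifier):
    """Rebuild the block, simplifying sources only."""
    return {dst: simplifier(src) for dst, src in assignblk.items()}

def simplify_sides(assignblk, simp_lhs, simp_rhs):
    """Rebuild the block once, with a separate simplifier for each side."""
    new_assignblk = {}
    for dst, src in assignblk.items():
        new_assignblk[simp_lhs(dst)] = simp_rhs(src)
    return new_assignblk

# Toy "simplifiers" over strings, standing in for real passes.
def drop_add_zero(expr):
    return expr.replace(" + 0", "")

def drop_mul_one(expr):
    return expr.replace(" * 1", "")

block = {"@32[EAX + 0]": "EBX * 1", "ECX": "ECX + 0"}

# Two passes: the block is rebuilt twice.
two_pass = simplify_rhs(simplify_lhs(block, drop_add_zero), drop_mul_one)
# One pass: same result, the block is rebuilt once.
one_pass = simplify_sides(block, drop_add_zero, drop_mul_one)
```

Both variants produce the same block here. The single-pass version only saves the intermediate rebuild, at the price of a signature that takes two simplifiers.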

mrphrazer, Mar 31 '19 02:03

Hi @mrphrazer ,

In fact, I was not talking about simplification rules, but about replace_expr. I agree with you about the double creation of assignment blocks. But maybe we can have something like:

replace_expr(left_tokens_replacement, right_tokens_replacement)

In this function we could handle the left and right sides simultaneously, so each block would be created only once.

But I am curious about one thing: do you have an example of reduction rules that may be applied to the right side of an expression but should not be applied to the left one? (OK, let's say replacement is a case apart.)

serpilliere, Apr 01 '19 05:04

Hi! Perhaps it is better to submit the PR first and discuss the details afterwards. Otherwise it might (or will) not make any sense to you.

Give me a few days, then I will take up the discussion here again.

mrphrazer, Apr 01 '19 17:04

This was quicker than intended:

See PR #1021 for more details.

The simplification pass for ExprAssign is required to rewrite memory expressions as follows:

# ebx = @32[eax]
ebx = mem_read(M, eax, 32)

# @32[eax] = ebx
M = mem_write(M, eax, ebx, 32)
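The rewrite above has to see both sides of an assignment at once, because a memory destination turns into a new source for M. A minimal sketch of that idea follows, again on a toy model rather than Miasm's classes: Id, Mem, Op and rewrite_assign are hypothetical names, and PR #1021 may implement this quite differently.

```python
from dataclasses import dataclass

# Toy expression model (hypothetical, not Miasm's Expr classes).
@dataclass(frozen=True)
class Id:
    name: str

@dataclass(frozen=True)
class Mem:
    addr: object
    size: int

@dataclass(frozen=True)
class Op:
    op: str
    args: tuple

M = Id("M")  # symbolic variable standing for the whole memory

def explicit_reads(expr):
    """Rewrite every memory access inside expr to mem_read(M, addr, size)."""
    if isinstance(expr, Mem):
        return Op("mem_read", (M, explicit_reads(expr.addr), expr.size))
    if isinstance(expr, Op):
        return Op(expr.op, tuple(explicit_reads(a) for a in expr.args))
    return expr  # identifiers and plain ints (sizes) are left untouched

def rewrite_assign(dst, src):
    """Rewrite one assignment; dst and src need different treatment."""
    new_src = explicit_reads(src)
    if isinstance(dst, Mem):
        # @size[addr] = src  becomes  M = mem_write(M, addr, src, size)
        addr = explicit_reads(dst.addr)
        return M, Op("mem_write", (M, addr, new_src, dst.size))
    return dst, new_src

eax, ebx = Id("eax"), Id("ebx")
read_case = rewrite_assign(ebx, Mem(eax, 32))   # ebx = @32[eax]
write_case = rewrite_assign(Mem(eax, 32), ebx)  # @32[eax] = ebx
```

The sketch handles one assignment at a time. In a block holding several memory writes, every write would target M, so they would have to be chained; that case is not covered here.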

Do you have any suggestions on how we could implement this in a clean manner?

mrphrazer, Apr 01 '19 19:04