Make radeco universal

Open Escapingbug opened this issue 5 years ago • 7 comments

Following my previous issue, #259, it seems quite possible to make radeco-lib independent of radare2, which would make the whole project universal.

I have dug in a little and have some thoughts on how to implement this. But since I'm quite new to this project and have no knowledge of r2, I need some discussion and guidance before I proceed.

If I'm right, the most important points of communication with r2 are currently:

  • Within RadecoProject and RadecoModule: providing useful information such as calling conventions, register profiles, function symbols, etc.
  • SSA construction from ESIL to Radeco IL

So my idea about this:

  • For calling-convention and register-profile information: use a Provider to supply them. A Provider works as a communication layer between the disassembler (currently radare2) and radeco-lib. For each kind of information, a trait describes what a Provider needs to implement.
  • As for SSA construction, in order to reuse the construction algorithm, my thought is to invent a new Low IR. This Low IR mostly does what ESIL can do; it is just a communication layer between whatever IR the disassembler originally uses and the input of the SSA construction algorithm. A new disassembler then only needs to translate its original IR into this Low IR. Since most IRs used by disassemblers are in non-SSA form, it should be easier to port them to the Low IR than directly to the SSA-like Radeco IR.
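To make the Provider idea more concrete, here is a minimal sketch of what "one trait per kind of information" could look like. Every name in it (`RegisterInfo`, `CallingConvention`, `RegisterProfileProvider`, `CallingConventionProvider`, `Provider`, `StaticProvider`, `program_counter`) is hypothetical and does not exist in radeco-lib; the register and calling-convention data are hard-coded stand-ins for whatever a real disassembler would report.

```rust
// Hypothetical sketch: none of these names exist in radeco-lib today.

/// One register as a backend might describe it.
#[derive(Debug, Clone, PartialEq)]
pub struct RegisterInfo {
    pub name: String,
    pub width_bits: u32,
    /// Optional role such as "PC" or "SP" (an assumed encoding).
    pub role: Option<String>,
}

/// Where a calling convention places arguments and the return value.
#[derive(Debug, Clone, PartialEq)]
pub struct CallingConvention {
    pub name: String,
    pub arg_registers: Vec<String>,
    pub return_register: String,
}

/// One trait per kind of information the decompiler needs.
pub trait RegisterProfileProvider {
    fn registers(&self) -> Vec<RegisterInfo>;
}

pub trait CallingConventionProvider {
    fn calling_convention(&self, function_addr: u64) -> Option<CallingConvention>;
}

/// A backend counts as a Provider once it implements every information trait.
pub trait Provider: RegisterProfileProvider + CallingConventionProvider {}
impl<T: RegisterProfileProvider + CallingConventionProvider> Provider for T {}

/// Stand-in backend with hard-coded data instead of a real disassembler.
pub struct StaticProvider;

impl RegisterProfileProvider for StaticProvider {
    fn registers(&self) -> Vec<RegisterInfo> {
        vec![
            RegisterInfo { name: "rip".into(), width_bits: 64, role: Some("PC".into()) },
            RegisterInfo { name: "rsp".into(), width_bits: 64, role: Some("SP".into()) },
            RegisterInfo { name: "rax".into(), width_bits: 64, role: None },
        ]
    }
}

impl CallingConventionProvider for StaticProvider {
    fn calling_convention(&self, _function_addr: u64) -> Option<CallingConvention> {
        // System V AMD64 integer argument registers; the label is illustrative.
        Some(CallingConvention {
            name: "sysv-amd64".into(),
            arg_registers: ["rdi", "rsi", "rdx", "rcx", "r8", "r9"]
                .iter()
                .map(|s| s.to_string())
                .collect(),
            return_register: "rax".into(),
        })
    }
}

/// Consumer code depends only on the trait, never on a concrete backend.
pub fn program_counter<P: Provider>(p: &P) -> Option<String> {
    p.registers()
        .into_iter()
        .find(|r| r.role.as_deref() == Some("PC"))
        .map(|r| r.name)
}
```

Calling `program_counter(&StaticProvider)` looks up the register tagged as the program counter without the caller knowing which backend answered. Splitting the traits this way would let a backend be ported one kind of information at a time.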

Overall, after the modification, the workflow will be:

  • RadecoModule stores the Provider, which is specified when the module is constructed.
  • Any information needed can be retrieved from the Provider.
  • When constructing SSA, the Provider converts its original IR into Low IR.
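The workflow above could be sketched roughly as follows. This is only an illustration under assumptions: `LowOp`, `Provider`, `Module`, `FakeBackend`, and their methods are made-up names, and the three-instruction Low IR is far smaller than anything ESIL-equivalent would need to be.

```rust
// Hypothetical sketch of the workflow; names are illustrative, not radeco-lib API.

/// A deliberately tiny non-SSA "Low IR".
#[derive(Debug, Clone, PartialEq)]
pub enum LowOp {
    /// dst = constant
    Const { dst: String, value: u64 },
    /// dst = lhs + rhs
    Add { dst: String, lhs: String, rhs: String },
    /// Control flow expressed as a jump to a fixed address.
    Jump { target: u64 },
}

/// What a backend must offer for a module to be built on top of it.
pub trait Provider {
    /// Addresses of the functions the backend knows about.
    fn function_addrs(&self) -> Vec<u64>;
    /// Convert the backend's own IR for one function into Low IR.
    fn lower_function(&self, addr: u64) -> Option<Vec<LowOp>>;
}

/// The module stores the provider it was constructed with.
pub struct Module<P: Provider> {
    provider: P,
}

impl<P: Provider> Module<P> {
    pub fn new(provider: P) -> Self {
        Module { provider }
    }

    /// Gather Low IR for every function the backend could lift.
    /// SSA construction would consume this output.
    pub fn lowered_functions(&self) -> Vec<(u64, Vec<LowOp>)> {
        self.provider
            .function_addrs()
            .into_iter()
            .filter_map(|a| self.provider.lower_function(a).map(|ops| (a, ops)))
            .collect()
    }
}

/// Fake backend standing in for radare2 or any other disassembler.
pub struct FakeBackend;

impl Provider for FakeBackend {
    fn function_addrs(&self) -> Vec<u64> {
        vec![0x1000, 0x2000]
    }

    fn lower_function(&self, addr: u64) -> Option<Vec<LowOp>> {
        match addr {
            0x1000 => Some(vec![
                LowOp::Const { dst: "rax".into(), value: 1 },
                LowOp::Add { dst: "rax".into(), lhs: "rax".into(), rhs: "rbx".into() },
                LowOp::Jump { target: 0x2000 },
            ]),
            // This backend could not lift anything else.
            _ => None,
        }
    }
}
```

With `Module::new(FakeBackend)`, `lowered_functions()` returns Low IR only for the function the fake backend can lift. The SSA construction code never sees the backend's original IR, which is the point of the extra layer.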

And the guidance I currently need:

  • I have no idea what a "register profile" looks like in radare2. I have looked for documentation on this but found little. I know it should describe the registers, but I need more details to define a trait for Providers to implement.
  • Although I have read ESIL's documentation, there are things I do not quite understand. In the "The x86 REP prefix in ESIL" part I saw control-flow-related instructions that seem different from normal opcodes. But as far as I can tell from the example, control flow is actually an assignment to RIP (which comes back to the register profile problem again). How are they supposed to be used, then? And is this documentation complete? I see "TODO"s inside.

Also, if we all agree to make radeco-lib (and maybe radeco as well?) universal, some decisions may have to be made differently than before. Currently, I see a lot of issues concerning the integration with r2. However, I highly recommend doing this, as this project really has the potential to serve more than just r2.

Escapingbug avatar Jun 02 '19 02:06 Escapingbug

I would say let's go for making it universal. Suggestions for a better interface are welcome.

cc @chinmaydd @sushant94 @kriw @radare @wargio @condret @thestr4ng3r

XVilka avatar Jun 07 '19 03:06 XVilka

I agree on going universal, but I don't know which type of interface would fit best for this purpose.

wargio avatar Jun 07 '19 11:06 wargio

I think we need at least 3 groups:

  • information the decompilation needs (architecture info, language info, symbol info...)
  • operations the decompilation wants to perform on the backend provider to improve the backend's results (like tagging function names, global symbol names...)
  • specifications, such as an architecture spec (mainly to understand register names and how they work), a language spec (to know about calling conventions and how arguments are passed), a low-level IR...
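As a rough illustration of how these three groups might map onto separate traits, here is a sketch. All names (`InfoSource`, `Annotator`, `SpecSource`, `ArchSpec`, `InMemoryBackend`, `name_if_unknown`) are invented for this example, and the `fcn_` naming scheme is just an assumption.

```rust
use std::collections::HashMap;

/// Group 1: information the decompiler reads from the backend.
pub trait InfoSource {
    fn symbol_at(&self, addr: u64) -> Option<String>;
}

/// Group 2: results the decompiler pushes back to improve the backend.
pub trait Annotator {
    fn tag_function_name(&mut self, addr: u64, name: &str);
}

/// Group 3: static specifications (only a tiny architecture spec here).
#[derive(Debug, Clone, PartialEq)]
pub struct ArchSpec {
    pub name: &'static str,
    pub pointer_bits: u32,
}

pub trait SpecSource {
    fn arch_spec(&self) -> ArchSpec;
}

/// In-memory stand-in for a real disassembler backend.
pub struct InMemoryBackend {
    symbols: HashMap<u64, String>,
}

impl InMemoryBackend {
    pub fn new() -> Self {
        InMemoryBackend { symbols: HashMap::new() }
    }
}

impl InfoSource for InMemoryBackend {
    fn symbol_at(&self, addr: u64) -> Option<String> {
        self.symbols.get(&addr).cloned()
    }
}

impl Annotator for InMemoryBackend {
    fn tag_function_name(&mut self, addr: u64, name: &str) {
        // Later tags overwrite earlier ones.
        self.symbols.insert(addr, name.to_string());
    }
}

impl SpecSource for InMemoryBackend {
    fn arch_spec(&self) -> ArchSpec {
        ArchSpec { name: "x86_64", pointer_bits: 64 }
    }
}

/// Name a function only if the backend has no symbol for it yet.
pub fn name_if_unknown<B: InfoSource + Annotator>(backend: &mut B, addr: u64) -> String {
    if let Some(existing) = backend.symbol_at(addr) {
        return existing;
    }
    let name = format!("fcn_{:08x}", addr);
    backend.tag_function_name(addr, &name);
    name
}
```

Keeping the read side and the write side in separate traits would allow a read-only backend (for example, one that works from a static dump) to implement only the first and third groups.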

Escapingbug avatar Jun 07 '19 14:06 Escapingbug

Currently, I see we have ESIL as the low-level language. If we stick with that, we then need better support for lifting other IRs into ESIL. A better construction process is what we have.

Escapingbug avatar Jun 07 '19 14:06 Escapingbug

@Escapingbug for architecture and OS/cc info we already have https://github.com/radareorg/radeco-lib/tree/master/arch-rs. It is not yet properly integrated, but integrating it is easier than writing this from scratch. For the second point, we have this in mind. And for the third point: haha, this is the hardest one.

XVilka avatar Jun 07 '19 14:06 XVilka

As for SSA construction, I think we can use Radeco IR (in non-SSA form) as the Low IR. Radeco IR does not have to be in SSA form, so I guess we can easily reuse the IR.
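To illustrate why a non-SSA form is a convenient input, here is a minimal sketch of renaming a single straight-line block into SSA. It is not Radeco IR and not radeco-lib's construction algorithm: `Assign`, `SsaAssign`, and `rename_block` are hypothetical, and phi placement across blocks is left out entirely.

```rust
use std::collections::HashMap;

/// Non-SSA statement: a register may be assigned many times.
#[derive(Debug, Clone, PartialEq)]
pub struct Assign {
    pub dst: String,
    pub srcs: Vec<String>,
}

/// SSA statement: every destination is a fresh (name, version) pair.
#[derive(Debug, Clone, PartialEq)]
pub struct SsaAssign {
    pub dst: (String, u32),
    pub srcs: Vec<(String, u32)>,
}

/// Rename one basic block. Version 0 means "value on entry to the block".
/// Phi placement across blocks is deliberately out of scope.
pub fn rename_block(block: &[Assign]) -> Vec<SsaAssign> {
    let mut version: HashMap<String, u32> = HashMap::new();
    let mut out = Vec::with_capacity(block.len());
    for stmt in block {
        // Uses read the current version before the definition bumps it.
        let srcs: Vec<(String, u32)> = stmt
            .srcs
            .iter()
            .map(|s| (s.clone(), *version.get(s).unwrap_or(&0)))
            .collect();
        let v = version.entry(stmt.dst.clone()).or_insert(0);
        *v += 1;
        out.push(SsaAssign { dst: (stmt.dst.clone(), *v), srcs });
    }
    out
}
```

For the block `rax = f(rbx); rax = f(rax, rcx); rbx = f(rax)`, the two writes to `rax` become versions 1 and 2, and the final statement reads `rax` version 2. A backend only has to emit the left-hand, non-SSA form; the versioning is done once, on the radeco side.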

kriw avatar Jun 07 '19 22:06 kriw

I agree on using Radeco IL as the lower-level IR. We might make some changes if required, but in general it should be fine.

XVilka avatar Jun 08 '19 01:06 XVilka