New command to print all RzIL instructions in function

Open XVilka opened this issue 1 year ago • 7 comments

Currently, pdf and pif print the enriched disassembly and the plain instructions of the current function, respectively. A similar command that prints RzIL instead would be a good addition.

[0x100006948]> pdf
            ; CALL XREFS from sym.func.100005ddc @ 0x100005f34, 0x100006014
╭ sym.func.100006948(uint64_t arg1);
│           ; arg uint64_t arg1 @ x0
│           0x100006948      1f300071       cmp   w0, 0xc              ; arg1
│       ╭─< 0x10000694c      c2000054       b.hs  0x100006964
│       │   0x100006950      00840011       add   w0, w0, 0x21         ; arg1
│       │   0x100006954      d0071eca       eor   x16, x30, x30, lsl 1
│      ╭──< 0x100006958      5000f0b6       tbz   x16, 0x3e, 0x100006960
│      ││   0x10000695c      208e38d4       brk   0xc471
│     ╭╰──> 0x100006960      76030014       b     sym.imp.nl_langinfo  ; sym.imp.nl_langinfo
│     │ ╰─> 0x100006964      7f2303d5       pacibsp
│     │     0x100006968      fd7bbfa9       stp   x29, x30, [sp, -0x10]!
│     │     0x10000696c      fd030091       mov   x29, sp
╰     │     0x100006970      be020094       bl    sym.imp.abort        ; sym.imp.abort ; void abort(void)
[0x100006948]> pif
cmp w0, 0xc
b.hs 0x100006964
add w0, w0, 0x21
eor x16, x30, x30, lsl 1
tbz x16, 0x3e, 0x100006960
brk 0xc471
b sym.imp.nl_langinfo
pacibsp
stp x29, x30, [sp, -0x10]!
mov x29, sp
bl sym.imp.abort
[0x100006948]>

Currently this can be done with aoi @@i and aoi @@i @@b, but that is not very obvious. We should probably also support "enriched" RzIL output, where addresses are substituted with labels and so on, just like in the pdf output:

[0x100006948]> aoi @@i
0x100006948 (seq (set a (cast 32 false (var x0))) (set b (bv 32 0xc)) (set r (- (var a) (var b))) (set cf (ule (var b) (var a))) (set vf (&& (^^ (msb (var a)) (msb (var b))) (^^ (msb (var a)) (msb (var r))))) (set zf (is_zero (var r))) (set nf (msb (var r))))
0x10000694c (branch (var cf) (jmp (bv 64 0x100006964)) nop)
[0x100006948]> aoi @@i @@b
WARNING: rz_il_op_effect_stringify: assertion 'op && sb' failed (line 1019)
WARNING: rz_il_op_effect_stringify: assertion 'op && sb' failed (line 1019)
0x100006948 (seq (set a (cast 32 false (var x0))) (set b (bv 32 0xc)) (set r (- (var a) (var b))) (set cf (ule (var b) (var a))) (set vf (&& (^^ (msb (var a)) (msb (var b))) (^^ (msb (var a)) (msb (var r))))) (set zf (is_zero (var r))) (set nf (msb (var r))))
0x10000694c (branch (var cf) (jmp (bv 64 0x100006964)) nop)
0x100006950 (set x0 (cast 64 false (+ (cast 32 false (var x0)) (bv 32 0x21))))
0x100006954 (set x16 (^ (var x30) (<< (var x30) (bv 6 0x1) false)))
0x100006958 (branch (lsb (>> (var x16) (bv 6 0x3e) false)) nop (jmp (bv 64 0x100006960)))
0x10000695c
0x100006960 (jmp (bv 64 0x100007738))
0x100006964
0x100006968 (seq (storew 0 (- (var sp) (bv 64 0x10)) (var x29)) (storew 0 (+ (- (var sp) (bv 64 0x10)) (bv 64 0x8)) (var x30)) (set sp (- (var sp) (bv 64 0x10))))
0x10000696c (set x29 (var sp))
0x100006970 (seq (set x30 (bv 64 0x100006974)) (jmp (bv 64 0x100007468)))
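
Until a dedicated command exists, the aoi @@i @@b listing can be post-processed by a script. The following is a minimal sketch in Python that only assumes the "address, space, IL" line layout shown above; the function names are made up for illustration and nothing here calls into rizin. It also makes it easy to spot instructions that produced no IL, such as 0x10000695c (brk) and 0x100006964 (pacibsp) in the listing.

```python
def parse_il_listing(output):
    """Split captured 'aoi @@i @@b' output into (address, il) pairs."""
    rows = []
    for line in output.splitlines():
        if not line.startswith("0x"):
            continue  # skip prompts and WARNING lines
        addr, _, il = line.partition(" ")
        rows.append((int(addr, 16), il.strip()))
    return rows


def unlifted(rows):
    """Return the addresses whose IL column is empty."""
    return [addr for addr, il in rows if not il]


# Three lines copied from the listing in this issue.
sample = """WARNING: rz_il_op_effect_stringify: assertion 'op && sb' failed (line 1019)
0x10000695c
0x100006960 (jmp (bv 64 0x100007738))"""

rows = parse_il_listing(sample)
print([hex(a) for a in unlifted(rows)])  # ['0x10000695c']
```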

I propose two new commands, plf and pLf, to print the simple and "enriched" RzIL output of the function.

XVilka avatar Jul 19 '23 08:07 XVilka

Nice idea. Though for readability it would be nice if it had a different syntax. (bv 32 0xc) is just hard to read over and over again. Supporting LaTeX is probably off the table (but think about having something like $0\text{x}C_{32}$)?

Rot127 avatar Jul 19 '23 11:07 Rot127

Could you please describe the expected "enriched" RzIL output in more detail? Thank you! @XVilka

PeiweiHu avatar Aug 04 '23 03:08 PeiweiHu

@PeiweiHu similar to pd vs pi: the ability to show reflines, addresses, and bytes (I suggest reusing the same options that control pd output). Also flags, global variable names, and type offsets instead of numbers where appropriate (in the same situations where pd would substitute them). You get the idea. Note, though, that a line of RzIL is usually much longer than a line of pd output, so that should be accounted for.
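
As a rough illustration of the label substitution, here is a minimal sketch in Python rather than rizin code. The address-to-name table is copied from the pdf and aoi listings in this issue; in rizin the names would come from flags, which this sketch does not query. The function name is hypothetical, and only jmp targets are rewritten so that ordinary constants are left alone.

```python
import re

# Address-to-name pairs read off the pdf/aoi listings in this issue.
FLAGS = {
    0x100007738: "sym.imp.nl_langinfo",
    0x100007468: "sym.imp.abort",
}

# A jump to a constant 64-bit target, e.g. "(jmp (bv 64 0x100007738))".
JMP_CONST = re.compile(r"\(jmp \(bv 64 (0x[0-9a-fA-F]+)\)\)")


def substitute_labels(il, flags):
    """Replace known constant jump targets with their symbolic names."""
    def repl(match):
        name = flags.get(int(match.group(1), 16))
        # Unknown targets are kept exactly as they were printed.
        return f"(jmp {name})" if name is not None else match.group(0)
    return JMP_CONST.sub(repl, il)


line = "(seq (set x30 (bv 64 0x100006974)) (jmp (bv 64 0x100007468)))"
print(substitute_labels(line, FLAGS))
# (seq (set x30 (bv 64 0x100006974)) (jmp sym.imp.abort))
```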

XVilka avatar Aug 04 '23 03:08 XVilka

> Nice idea. Though for readability it would be nice if it had a different syntax. (bv 32 0xc) is just hard to read over and over again. Supporting LaTeX is probably off the table (but think about having something like $0\text{x}C_{32}$)?

@Rot127 using Unicode mathematical characters (optionally) might be a good idea, indeed. Let's make RzIL a new APL!

See also https://en.wikipedia.org/wiki/List_of_logic_symbols

XVilka avatar Aug 04 '23 10:08 XVilka

This would be lovely! Especially using subscripts should clean up a lot of syntax.
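
To give an idea of how the subscript notation could look, here is a minimal sketch in Python. It is a text rewrite over the existing output, not the RzIL printer itself, and the function name is made up. It turns every (bv WIDTH VALUE) literal into the value followed by the width in Unicode subscript digits.

```python
import re

# Map ASCII digits to the Unicode subscript digits U+2080..U+2089.
SUBSCRIPT_DIGITS = str.maketrans("0123456789", "₀₁₂₃₄₅₆₇₈₉")

# Matches bitvector literals such as "(bv 32 0xc)".
BV_LITERAL = re.compile(r"\(bv (\d+) (0x[0-9a-fA-F]+)\)")


def compact_bitvectors(il):
    """Rewrite "(bv WIDTH VALUE)" as VALUE with WIDTH as a subscript."""
    return BV_LITERAL.sub(
        lambda m: m.group(2) + m.group(1).translate(SUBSCRIPT_DIGITS), il
    )


print(compact_bitvectors("(branch (var cf) (jmp (bv 64 0x100006964)) nop)"))
# (branch (var cf) (jmp 0x100006964₆₄) nop)
```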

Rot127 avatar Aug 04 '23 11:08 Rot127

@XVilka Is this a good-first-issue (for more advanced people)? Here is a rough outline of how to implement IL printing with Unicode symbols:

  • Extend rz_analyze_n_ins_il_pretty_handler or implement a new handler for the new syntax.
  • Change signature of rz_analyze_n_ins_il_pretty_handler to not just take a bool for pretty or not pretty, but an enum for the printing type (simple, pretty_tree, pretty_unicode or something).
  • Implement alternatives for il_op_effect_string_resolve and il_op_pure_string_resolve respectively.
  • Add tests for output.
  • Implement plf/pLf, which print the simple and pretty_unicode output for an analysed function.
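
The "enum instead of bool" idea from the outline can be sketched as follows. This is a Python model, not the C code in rizin: ILPrintMode, render, and the helper names are all hypothetical, and the operator-to-symbol table is an arbitrary choice covering only the two operators that appear in the output above.

```python
from enum import Enum, auto


class ILPrintMode(Enum):  # hypothetical replacement for the "pretty" bool
    SIMPLE = auto()
    PRETTY_TREE = auto()
    PRETTY_UNICODE = auto()


# Illustrative mapping only; a real printer would need a full table.
UNICODE_OPS = {"&&": "∧", "^^": "⊕"}


def parse_sexpr(text):
    """Parse one well-formed S-expression into nested lists of strings."""
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()
    pos = 0

    def read():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        if tok != "(":
            return tok
        node = []
        while tokens[pos] != ")":
            node.append(read())
        pos += 1  # consume ")"
        return node

    return read()


def to_tree(node, depth=0):
    """One operand per line; forms with only atoms stay on one line."""
    pad = "  " * depth
    if isinstance(node, str):
        return [pad + node]
    if all(isinstance(child, str) for child in node):
        return [pad + "(" + " ".join(node) + ")"]
    lines = [pad + "(" + node[0]]
    for child in node[1:]:
        lines.extend(to_tree(child, depth + 1))
    lines[-1] += ")"
    return lines


def to_unicode(node):
    """Same shape as the input, with known operators swapped for symbols."""
    if isinstance(node, str):
        return UNICODE_OPS.get(node, node)
    return "(" + " ".join(to_unicode(child) for child in node) + ")"


def render(il, mode):
    """Dispatch on the printing mode instead of a pretty/not-pretty flag."""
    if mode is ILPrintMode.SIMPLE:
        return il
    tree = parse_sexpr(il)
    if mode is ILPrintMode.PRETTY_TREE:
        return "\n".join(to_tree(tree))
    if mode is ILPrintMode.PRETTY_UNICODE:
        return to_unicode(tree)
    raise ValueError(f"unknown mode: {mode}")


print(render("(branch (var cf) (jmp (bv 64 0x100006964)) nop)", ILPrintMode.PRETTY_TREE))
```

With an enum, adding a further output style later means adding one member and one branch, without touching the callers that already pass an existing mode.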

Rot127 avatar Oct 02 '23 18:10 Rot127

Once pLf is implemented, it should be added to both the Vv and V modes to allow observing IL visually, with reflines, comments, and addresses, just like pdf does (screenshot omitted).

There is one issue though - some of the simple instructions can produce immensely long RzIL output. We need to decide how to display those.
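
One possible option, given only as a sketch and not as a decision: keep short IL on a single line and break an over-long seq into one effect per line. The Python below models that on the printed text; the 80-column default, the indentation, and the function names are assumptions made for illustration.

```python
def split_top_level(il):
    """Return the operator and direct operands of one S-expression."""
    inner = il[1:-1]  # drop the outermost parentheses
    parts, depth, start = [], 0, 0
    for i, ch in enumerate(inner):
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        elif ch == " " and depth == 0:
            parts.append(inner[start:i])
            start = i + 1
    parts.append(inner[start:])
    return parts


def wrap_il(il, width=80, indent="  "):
    """Split an over-long top-level seq into one effect per line."""
    if len(il) <= width or not il.startswith("(seq "):
        return [il]
    parts = split_top_level(il)
    lines = ["(" + parts[0]] + [indent + part for part in parts[1:]]
    lines[-1] += ")"  # close the seq on the last effect
    return lines


# The stp instruction at 0x100006968 from the listing in this issue.
stp = ("(seq (storew 0 (- (var sp) (bv 64 0x10)) (var x29)) "
       "(storew 0 (+ (- (var sp) (bv 64 0x10)) (bv 64 0x8)) (var x30)) "
       "(set sp (- (var sp) (bv 64 0x10))))")
print("\n".join(wrap_il(stp)))
```

A single effect that is still too long after this split (the cmp at 0x100006948 has seven, some of them deeply nested) would need further rules, for example recursive wrapping or truncation with a marker.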

XVilka avatar Jan 11 '24 12:01 XVilka