WIP: x64_*,arm64: Add decoders

Open wader opened this issue 3 years ago • 5 comments
wader avatar Apr 01 '22 17:04 wader

I will probably not work on this for a while, so let me know if you're interested.

wader avatar Apr 01 '22 17:04 wader

❤️

peterwaller-arm avatar Apr 12 '23 16:04 peterwaller-arm

Hey, kind of forgot about this one. There are some open questions about how decoding of ISAs would be modelled in fq and when it would be done. I guess it could be an option that is disabled by default for ELF, Mach-O, etc.?

Do you have any ideas or use cases for how it would work?

wader avatar Apr 12 '23 16:04 wader

I don't especially have thoughts on how it should work, to be fair, but I like the principle that fq should be able to interpret bytes wherever and whatever they may be, if nothing else so that a human may make sense of those bytes at some point in the pipeline.

I take it what you're saying is that if you started parsing bytes out of an ELF for example, it would bloat the output quite dramatically?

The simplest case I can think of is that fq could be given an ELF and would by default act as a disassembler, parsing code in executable sections much like objdump -d would. It remains to be seen whether this is useful or a good idea, though.
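To make "executable sections" concrete, here is a rough sketch using Go's standard debug/elf package. It only lists the sections flagged SHF_EXECINSTR, which is roughly the set objdump -d disassembles. The helper names isExecutable and execSections are made up for illustration and this is not how fq's ELF decoder works.

```go
package main

import (
	"debug/elf"
	"fmt"
	"os"
)

// isExecutable reports whether a section's flags include SHF_EXECINSTR,
// i.e. the section is marked as containing machine instructions.
// (Hypothetical helper, not part of fq.)
func isExecutable(flags elf.SectionFlag) bool {
	return flags&elf.SHF_EXECINSTR != 0
}

// execSections returns the names of all sections that a disassembler
// would consider code, e.g. .init, .text and .fini in a typical binary.
func execSections(f *elf.File) []string {
	var names []string
	for _, s := range f.Sections {
		if isExecutable(s.Flags) {
			names = append(names, s.Name)
		}
	}
	return names
}

func main() {
	if len(os.Args) < 2 {
		fmt.Fprintln(os.Stderr, "usage: execsections <elf-file>")
		os.Exit(2)
	}
	f, err := elf.Open(os.Args[1])
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	defer f.Close()
	for _, name := range execSections(f) {
		fmt.Println(name)
	}
}
```

An opt-in disassembly option could use a check like this to decide which section bodies to decode as instructions and which to leave as raw bytes.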

peterwaller-arm avatar Apr 12 '23 17:04 peterwaller-arm

Rebased on master so you can try it. The disassembler is based on https://pkg.go.dev/golang.org/x/arch and there is some half-working symbol lookup support.

Looks like this:

➜  fq git:(isa2) ✗ go run . -o line_bytes=8 'grep_by(.name==".init").code | d' format/elf/testdata/linux_amd64/a_dynamic
      │00 01 02 03 04 05 06 07│01234567│.section_headers[8].code[0:5]:
      │                       │        │  [0]{}: instruction
0x1000│50                     │P       │    opcode: "50" (raw bits)
      │                       │        │    op: "push" (0x50000000)
      │                       │        │  [1]{}: instruction
0x1000│   e8 ca 01 00 00      │ .....  │    opcode: "e8ca010000" (raw bits)
      │                       │        │    op: "call" (0xe8000000)
      │                       │        │  [2]{}: instruction
0x1000│                  e8 35│      .5│    opcode: "e835020000" (raw bits)
0x1008│02 00 00               │...     │
      │                       │        │    op: "call" (0xe8000000)
      │                       │        │  [3]{}: instruction
0x1008│         58            │   X    │    opcode: "58" (raw bits)
      │                       │        │    op: "pop" (0x58000000)
      │                       │        │  [4]{}: instruction
0x1008│            c3         │    .   │    opcode: "c3" (raw bits)
      │                       │        │    op: "ret" (0xc3000000)

# same with objdump
➜  fq git:(isa2) ✗ objdump --section=.init -D format/elf/testdata/linux_amd64/a_dynamic

format/elf/testdata/linux_amd64/a_dynamic:      file format elf64-x86-64

Disassembly of section .init:

0000000000001000 <_init>:
    1000: 50                            pushq   %rax
    1001: e8 ca 01 00 00                callq   0x11d0 <frame_dummy>
    1006: e8 35 02 00 00                callq   0x1240 <__do_global_ctors_aux>
    100b: 58                            popq    %rax
    100c: c3                            retq

This is all very work-in-progress and mostly just an experiment, but it looks promising, I think. There might be quite a bit of work left to make it usable, though. For example: should the disassembly output be more standard? How should decoding be enabled for ELF and similar formats? Should instructions be split into more fields somehow? Which other ISAs should be supported?
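On splitting instructions further: one obvious extra field is the resolved call target. As a rough illustration, here is a toy decoder that handles only the four opcodes appearing in the .init listing above and computes each call target as the address of the next instruction plus the signed 32-bit displacement. It is a sketch using only the standard library, not the golang.org/x/arch decoder and not fq's code; the inst type and decodeInit function are made-up names.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// inst is one decoded instruction: address, raw bytes and a text form.
// (Hypothetical type for this sketch.)
type inst struct {
	Addr uint64
	Raw  []byte
	Text string
}

// decodeInit is a toy decoder that understands only push r64 (0x50+r),
// pop r64 (0x58+r), call rel32 (0xe8) and ret (0xc3), without prefixes.
// It is nowhere near a general x86-64 decoder.
func decodeInit(code []byte, base uint64) ([]inst, error) {
	regs := [8]string{"rax", "rcx", "rdx", "rbx", "rsp", "rbp", "rsi", "rdi"}
	var out []inst
	for off := 0; off < len(code); {
		addr := base + uint64(off)
		op := code[off]
		n := 1 // instruction length in bytes
		var text string
		switch {
		case op >= 0x50 && op <= 0x57:
			text = "push " + regs[op-0x50]
		case op >= 0x58 && op <= 0x5f:
			text = "pop " + regs[op-0x58]
		case op == 0xc3:
			text = "ret"
		case op == 0xe8:
			n = 5
			if off+n > len(code) {
				return out, fmt.Errorf("truncated call at %#x", addr)
			}
			// rel32 is little-endian, signed, and relative to the
			// address of the instruction that follows the call.
			rel := int32(binary.LittleEndian.Uint32(code[off+1 : off+5]))
			target := addr + uint64(n) + uint64(int64(rel))
			text = fmt.Sprintf("call %#x", target)
		default:
			return out, fmt.Errorf("unsupported opcode %#x at %#x", op, addr)
		}
		out = append(out, inst{Addr: addr, Raw: code[off : off+n], Text: text})
		off += n
	}
	return out, nil
}

func main() {
	// The 13 bytes of .init from the listing above, loaded at 0x1000.
	code := []byte{0x50, 0xe8, 0xca, 0x01, 0x00, 0x00,
		0xe8, 0x35, 0x02, 0x00, 0x00, 0x58, 0xc3}
	insts, err := decodeInit(code, 0x1000)
	if err != nil {
		panic(err)
	}
	for _, in := range insts {
		fmt.Printf("%4x: %-12x %s\n", in.Addr, in.Raw, in.Text)
	}
}
```

For the first call this gives 0x1001 + 5 + 0x1ca = 0x11d0 and for the second 0x1006 + 5 + 0x235 = 0x1240, matching the targets objdump prints. With a symbol table, those addresses could then be mapped to names like frame_dummy.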

wader avatar Apr 12 '23 19:04 wader