
.pyc Python disassembler

Open lab313ru opened this issue 4 years ago • 2 comments

It would be great to add Python disassembler/decompiler to Ghidra.

lab313ru avatar Oct 21 '19 15:10 lab313ru

Just to add my understanding of what this would entail:

The .pyc format itself is a short header (a version-specific magic number and a timestamp, plus extra fields in newer Python versions) followed by a serialized code object.
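To make the layout concrete, here is a minimal sketch that builds a .pyc image in memory and splits it apart again. It assumes the 16-byte header used by Python 3.7 and later (magic, flags, mtime, source size) and a timestamp-based file with flags set to 0; `build_pyc` and `parse_pyc` are hypothetical helper names, not part of any library.

```python
import importlib.util
import marshal
import struct


def build_pyc(source: str, mtime: int = 0) -> bytes:
    """Build a timestamp-based .pyc image in memory (assumes the 3.7+ layout)."""
    code = compile(source, "<example>", "exec")
    source_size = len(source.encode("utf-8")) & 0xFFFFFFFF
    # flags = 0 means "timestamp-based": the next two fields are mtime and size.
    header = importlib.util.MAGIC_NUMBER + struct.pack("<III", 0, mtime, source_size)
    return header + marshal.dumps(code)


def parse_pyc(data: bytes):
    """Split a 3.7+ .pyc image into its header fields and the code object."""
    magic = data[:4]
    flags, mtime, source_size = struct.unpack("<III", data[4:16])
    # If the low bit of flags is set, bytes 8..16 hold a source hash instead
    # of mtime and size; this sketch does not handle that case.
    code = marshal.loads(data[16:])
    return magic, flags, mtime, source_size, code


if __name__ == "__main__":
    fields = parse_pyc(build_pyc("x = 1 + 2\n", mtime=1234))
    print(fields[:4], fields[4].co_name)
```

Note that `marshal.loads` only understands code objects written by a compatible interpreter, which is exactly why a tool like Ghidra would need its own unmarshaling logic.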

The code object is serialized using Python's marshal module, whose format is "undocumented on purpose; [because] it may change between Python versions". It has in fact changed across versions.

Additionally, within the (unmarshaled) code object, the co_code attribute contains the raw Python bytecode. As for the bytecode itself, the official docs state that there is no guarantee "that bytecode will not be added, removed, or changed between versions of Python", which has historically meant a couple of changes in every minor version from 1.1 to 3.8.
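As a quick illustration, the standard dis module can decode co_code, but only with the opcode table of the interpreter it is running on. The sketch below compiles a small expression and lists its instructions; `list_instructions` is a hypothetical helper, and the exact opcodes printed will differ from one Python version to the next.

```python
import dis


def list_instructions(code):
    """Decode co_code using the running interpreter's own opcode table."""
    return [(ins.offset, ins.opname, ins.argrepr)
            for ins in dis.get_instructions(code)]


code = compile("a + b", "<example>", "eval")
raw = code.co_code  # raw bytecode bytes; their meaning depends on the version
rows = list_instructions(code)

if __name__ == "__main__":
    print(raw.hex())
    for offset, opname, argrepr in rows:
        print(f"{offset:4d} {opname:<20} {argrepr}")
```

Running this on two different Python releases is an easy way to see how much the instruction set moves around.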

Essentially, if you want to support more than one version of the Python bytecode, you are looking at custom unmarshaling logic and bytecode changes for each minor Python release.
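A rough sketch of what that per-version dispatch could look like: read the magic number, map it to a Python version, and pick the header size from there. The table below is a deliberately tiny, illustrative subset (final releases of 2.7, 3.6, 3.7 and 3.8 only), and `version_from_magic` and `header_size` are hypothetical names; a real loader would need the full list of magic numbers plus a matching unmarshaler and opcode table per version.

```python
import struct

# Illustrative subset only; a real loader needs every released magic number.
MAGIC_TO_VERSION = {
    62211: (2, 7),
    3379: (3, 6),
    3394: (3, 7),
    3413: (3, 8),
}


def version_from_magic(magic: bytes):
    """Map the 4-byte .pyc magic to a (major, minor) tuple, or None if unknown."""
    if len(magic) != 4 or magic[2:] != b"\r\n":
        raise ValueError("not a .pyc magic number")
    (number,) = struct.unpack("<H", magic[:2])
    return MAGIC_TO_VERSION.get(number)


def header_size(version):
    """Return how many bytes precede the marshaled code object."""
    if version >= (3, 7):
        return 16  # magic, flags, then mtime + size (or an 8-byte source hash)
    if version >= (3, 3):
        return 12  # magic, mtime, source size
    return 8       # magic, mtime


if __name__ == "__main__":
    magic = struct.pack("<H", 3413) + b"\r\n"
    version = version_from_magic(magic)
    print(version, header_size(version))
```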

Somebody has been maintaining a Python disassembly library, xdis, that implements cross-version unmarshaling and opcode tables for each version. It might be a good reference for anybody who decides to implement this.

Andoryuuta avatar Oct 27 '19 17:10 Andoryuuta

@lab313ru meanwhile you can check radare2, which has support for those (and thus Cutter too, a radare2 GUI):

  • https://github.com/radareorg/radare2/blob/master/libr/asm/p/asm_pyc.c
  • https://github.com/radareorg/radare2/tree/master/libr/asm/arch/pyc
  • https://github.com/radareorg/radare2/blob/master/libr/anal/p/anal_pyc.c

XVilka avatar Nov 20 '20 09:11 XVilka