.pyc Python disassembler
It would be great to add a Python disassembler/decompiler to Ghidra.
Just to add my understanding of what this would entail:
The `.pyc` format itself is a magic number that identifies the Python version, a timestamp, and a serialized code object.
The code object is serialized using Python's `marshal` module, whose format is "undocumented on purpose; [because] it may change between Python versions". It has in fact changed across versions.
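To make the layout concrete, here is a minimal sketch that builds a `.pyc` image in memory and splits it back into header and code object. It assumes the 16-byte header used by Python 3.7 and later (magic, flags, then two 32-bit fields) and only accepts files written by the interpreter running it; `build_pyc` and `parse_pyc` are hypothetical helper names, not part of any library.

```python
import importlib.util
import marshal
import struct
import types


def build_pyc(source: str, mtime: int = 0) -> bytes:
    """Assemble a timestamp-based .pyc image for the running interpreter."""
    code = compile(source, "<demo>", "exec")
    # Assumed layout (3.7+): magic, flags, source mtime, source size.
    header = importlib.util.MAGIC_NUMBER + struct.pack("<III", 0, mtime, len(source))
    return header + marshal.dumps(code)


def parse_pyc(data: bytes) -> dict:
    """Split a .pyc image into its header fields and unmarshaled code object."""
    magic = data[:4]
    if magic != importlib.util.MAGIC_NUMBER:
        # marshal.loads is only safe on data from the same Python version.
        raise ValueError("pyc was written by a different Python version")
    flags, mtime, source_size = struct.unpack("<III", data[4:16])
    # If bit 0 of flags is set, the two fields hold a source hash instead
    # of mtime/size; this sketch does not handle that case specially.
    code = marshal.loads(data[16:])
    return {"magic": magic, "flags": flags, "mtime": mtime,
            "source_size": source_size, "code": code}


info = parse_pyc(build_pyc("x = 1 + 2\n", mtime=123))
```

Note that the stdlib `marshal.loads` refuses or misreads data from other versions, which is exactly why a cross-version tool needs its own unmarshaling code.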
Additionally, within the (unmarshaled) code object, the `co_code` attribute contains the raw Python bytecode. As for the bytecode itself, the official docs state that there is no guarantee "that bytecode will not be added, removed, or changed between versions of Python", which has historically meant a couple of changes in every minor version from 1.1 to 3.8.
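As a rough illustration of what "raw bytecode" means here, the sketch below walks `co_code` by hand and names each opcode using the running interpreter's own table (`dis.opname`). It assumes the fixed two-byte instruction units used since Python 3.6; `decode_wordcode` is a hypothetical helper, and the names it prints depend on the interpreter version, which is the whole problem.

```python
import dis


def decode_wordcode(code):
    """Pair each opcode byte with its argument byte.

    Assumes 2-byte instruction units (Python 3.6+). Opcode names come
    from the table of the interpreter running this script, so the same
    bytes would decode differently under another version's table.
    """
    raw = code.co_code
    return [(dis.opname[raw[i]], raw[i + 1]) for i in range(0, len(raw), 2)]


demo = compile("a = b + 1", "<demo>", "exec")
decoded = decode_wordcode(demo)

for name, arg in decoded:
    print(f"{name:<20} {arg}")
```

For example, the addition shows up as `BINARY_ADD` up to Python 3.10 and as `BINARY_OP` from 3.11 on, so a disassembler needs one such table per supported version.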
Essentially, if you want to support more than one version of the Python bytecode, you are looking at custom unmarshaling logic and bytecode changes for each minor Python release.
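A sketch of what that per-version dispatch might look like at the very first step, identifying the version and header size from the magic number. The magic values below are a small, non-exhaustive sample quoted from memory of CPython's sources and should be verified before use; `KNOWN_MAGIC`, `header_size`, and `identify` are hypothetical names.

```python
import struct

# Magic word -> (major, minor). Sample values only, to be double-checked
# against CPython's importlib sources; a real loader needs the full list.
KNOWN_MAGIC = {
    62211: (2, 7),
    3379: (3, 6),
    3394: (3, 7),
    3413: (3, 8),
}


def header_size(version):
    """Return the .pyc header length in bytes for a (major, minor) version."""
    if version >= (3, 7):
        return 16  # magic, flags, then mtime + size (or a source hash)
    if version >= (3, 3):
        return 12  # magic, mtime, source size
    return 8       # magic, mtime


def identify(data: bytes):
    """Map the first four bytes of a .pyc to (version, header size)."""
    word, tail = struct.unpack("<H2s", data[:4])
    if tail != b"\r\n":
        raise ValueError("not a .pyc magic number")
    version = KNOWN_MAGIC.get(word)
    if version is None:
        raise ValueError(f"unknown magic word {word}")
    return version, header_size(version)
```

The same lookup would then select the matching unmarshaler and opcode table for everything after the header.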
Somebody has been maintaining a Python disassembling library, `xdis`, that implements a cross-version `marshal` and opcode tables for each version, which might be a good reference for anybody who decides to implement this.
@lab313ru in the meantime, you can check radare2, which supports `.pyc` files (and so does Cutter, a radare2 GUI):
- https://github.com/radareorg/radare2/blob/master/libr/asm/p/asm_pyc.c
- https://github.com/radareorg/radare2/tree/master/libr/asm/arch/pyc
- https://github.com/radareorg/radare2/blob/master/libr/anal/p/anal_pyc.c