Chaos Compiler Collection

A library and set of command line tools for parsing debugging symbols from PS2 games. The 1.x series of releases is focused on STABS symbols in .mdebug sections; the version on the main branch can also parse standard ELF symbols and SNDLL linker symbols. DWARF support is planned.

Tools

demangle

Demangler for the old GNU ABI.

objdump

Half-working EE core MIPS disassembler. Probably not too interesting.

stdump

Symbol table parser and dumper. It can extract the following information:

  • Data types (structs, unions, enums, etc.)
  • Functions (name, return type, parameters and local variables)
  • Global variables

The following output formats are supported:

  • C++
  • JSON

The output of stdump is intended to be used with ghidra-emotionengine-reloaded (>= 2.1.0 or one of the unstable builds) to import all of this information into Ghidra. Note that, despite the name, the STABS analyzer should work for the R3000 (IOP) and possibly other MIPS processors as well.
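If you want a quick look at a JSON dump before importing it, a schema-agnostic summary can help. The sketch below assumes only that the top level of the file is a JSON object; it does not rely on any particular key names (see the JSON Format documentation for the real schema), and the sample document in it is made up purely for illustration.

```python
import json

def summarize(document):
    """Return {top-level key: short description} for a parsed JSON object."""
    summary = {}
    for key, value in document.items():
        if isinstance(value, list):
            summary[key] = f"list of {len(value)} entries"
        elif isinstance(value, dict):
            summary[key] = f"object with {len(value)} keys"
        else:
            summary[key] = repr(value)
    return summary

# Stand-in document: these keys are hypothetical, not the real stdump schema.
sample = json.loads('{"version": 1, "items": [{"name": "a"}, {"name": "b"}]}')
for key, description in summarize(sample).items():
    print(f"{key}: {description}")
```

To use it on a real dump, replace the sample with the result of json.load() on the file you wrote the stdump output to.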

uncc

uncc is similar to stdump, except that it organizes its output into separate source files and has a number of extra features designed to make that output closer to valid source code. A SOURCES.txt file must be provided in the output directory. It can be generated using the stdump files command, after which you should fix up the paths manually so that they're relative to the output directory and remove the address comments. Additionally, non-empty files that do not start with // STATUS: NOT STARTED will not be overwritten.
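The overwrite rule can be checked ahead of time to see which files in an output directory are protected. This is a minimal sketch of the rule as stated above, not uncc's actual implementation; the function name may_overwrite is hypothetical, and treating a missing file as overwritable is an assumption.

```python
import os
import tempfile

STATUS_MARKER = "// STATUS: NOT STARTED"

def may_overwrite(path):
    """Return True if the rule described above would allow replacing the file."""
    if not os.path.exists(path):
        return True  # assumption: a missing file can always be created
    with open(path, "r", encoding="utf-8", errors="replace") as f:
        contents = f.read()
    if contents == "":
        return True  # empty files are not protected
    return contents.startswith(STATUS_MARKER)

# Small demonstration using throwaway files.
with tempfile.TemporaryDirectory() as out_dir:
    untouched = os.path.join(out_dir, "untouched.cpp")
    edited = os.path.join(out_dir, "edited.cpp")
    with open(untouched, "w", encoding="utf-8") as f:
        f.write("// STATUS: NOT STARTED\n\nvoid f() {}\n")
    with open(edited, "w", encoding="utf-8") as f:
        f.write("// STATUS: IN PROGRESS\n\nvoid g() {}\n")
    print(may_overwrite(untouched), may_overwrite(edited))
```

In other words, changing the first line of a file once you start editing it by hand is what keeps a later run from clobbering your work.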

If a FUNCTIONS.txt file is provided in the output directory, which can be generated using the included CCCDecompileAllFunctions.java script for Ghidra, the code from that file will be used to populate the function bodies in the output. In this case, the first group of local variable declarations emitted will be those recovered from the symbols, and the second group will be from the code provided in the functions file. Function names are demangled.

Global variable data will be printed in a structured way based on its data type.
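To illustrate what type-driven printing of global data means, the sketch below decodes raw bytes using a field layout and formats them as an initializer. It is only an illustration of the idea: the Vec3 type, the layout table, the format_global function, and the output syntax are all hypothetical, and uncc's real output may look different. Little-endian byte order is used because the PS2's processors are little-endian.

```python
import struct

# Hypothetical type layout: (field name, struct format code) pairs.
VEC3_LAYOUT = [("x", "f"), ("y", "f"), ("z", "f")]

def format_global(name, type_name, layout, data):
    """Decode little-endian bytes according to layout and format an initializer."""
    fmt = "<" + "".join(code for _, code in layout)
    values = struct.unpack(fmt, data[:struct.calcsize(fmt)])
    fields = ", ".join(
        f".{field} = {value!r}" for (field, _), value in zip(layout, values)
    )
    return f"{type_name} {name} = {{ {fields} }};"

# Stand-in for bytes read out of the data section of an ELF file.
raw = struct.pack("<3f", 1.0, 2.0, 0.5)
print(format_global("origin", "Vec3", VEC3_LAYOUT, raw))
# prints: Vec3 origin = { .x = 1.0, .y = 2.0, .z = 0.5 };
```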

Data types will be sorted into their corresponding files. Since this information is not stored in the symbol table, uncc uses heuristics to map types to files. A type is put in a .c or .cpp file when it appears in only a single translation unit, and in a .h file when it appears in multiple translation units (in which case heuristics must be used to determine where to put it).
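The first step of that decision can be sketched as follows. This is a simplified illustration, not uncc's actual algorithm: the function name place_types and the example file names are hypothetical, and the heuristic that picks a specific header for shared types is not described here, so the sketch only flags those types.

```python
def place_types(type_to_units):
    """Split types into those with a single home and those needing a header."""
    source_files = {}  # type name -> the only translation unit it appears in
    needs_header = {}  # type name -> sorted candidate translation units
    for type_name, units in type_to_units.items():
        unique = sorted(set(units))
        if len(unique) == 1:
            source_files[type_name] = unique[0]
        else:
            needs_header[type_name] = unique
    return source_files, needs_header

# Hypothetical input: which translation units each type was seen in.
example = {
    "PlayerState": ["player.cpp"],
    "Vec3": ["player.cpp", "camera.cpp"],
}
in_source, in_header = place_types(example)
print(in_source)
print(in_header)
```

Here PlayerState would stay in player.cpp, while Vec3 appears in two translation units and so would need a header chosen for it.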

Use of a code formatter such as clang-format or astyle on the output is recommended.

Building

cmake -B bin/
cmake --build bin/

Documentation

Chaos Compiler Collection

  • Compiler Bugs
  • JSON Format
  • Project Structure
  • Symbol Database

DWARF (.debug) Section

MIPS Debug (.mdebug) Section

MIPS EABI

STABS

License

The source code for the CCC library and associated command line tools is released under the MIT license.

The GNU demangler is used, which contains source files licensed under the GPL and the LGPL. RapidJSON is used under the MIT license. The GoogleTest library is used by the test suite under the 3-Clause BSD license.