data: Add `GUESS` to file list based on `llvm-nm`

Open MonsterDruide1 opened this issue 7 months ago • 2 comments

A recent poll has decided that this should be done.

In this PR, `GUESS` is added to every symbol in the file list that is either unknown (not given in the original) or still marked as `''`. For the latter, `GUESS ''` is used as the new placeholder.

To achieve this, I exported a list of all symbols from `data/main.elf` using `llvm-nm -DU data/main.elf > symbols.txt`, then ran the following Python snippet:

import yaml
import tqdm

# Represent every integer as hexadecimal when dumping the YAML.
def hexint_presenter(dumper, data):
    return dumper.represent_int(hex(data))
yaml.add_representer(int, hexint_presenter)

# Collect the symbol names: the last space-separated field of each llvm-nm line.
function_symbols = set()
with open("symbols.txt", "r") as file:
    for line in file:
        line = line.strip()
        if line:  # skip blank lines
            function_symbols.add(line.split(" ")[-1])

with open('data/file_list.yml', 'r') as file:
    file_list = yaml.safe_load(file)

for file_name in tqdm.tqdm(file_list):
    functions = file_list[file_name]['.text']
    for (i, function_list) in enumerate(functions):
        # Each entry is a single-item mapping from offset to function name.
        offset = next(iter(function_list.keys()))
        function_name = str(function_list[offset])
        # Compare only the last space-separated part against the symbol list.
        if " " in function_name:
            function_name = function_name.split(" ")[-1]
        if function_name == "":
            # No symbol yet: use the new placeholder.
            file_list[file_name]['.text'][i][offset] = "GUESS ''"
        elif function_name not in function_symbols:
            # Symbol is not in the exported list: mark it as a guess.
            file_list[file_name]['.text'][i][offset] = "GUESS " + function_list[offset]

with open('data/file_list.yml', 'w') as file:
    yaml.dump(file_list, file, sort_keys=False)
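
The tagging rule used in the snippet can also be checked in isolation on a small in-memory example. The sketch below is not part of the PR. `tag_guesses`, the sample file name, the offsets and the symbol names are all hypothetical, and it assumes the same layout the snippet relies on: a `.text` list of single-entry mappings from offset to name.

```python
def tag_guesses(file_list, known_symbols):
    """Prefix GUESS to every entry whose symbol is empty or not in known_symbols.

    Hypothetical helper mirroring the loop in the snippet above; it mutates
    file_list in place and also returns it.
    """
    for entries in file_list.values():
        for entry in entries[".text"]:
            offset = next(iter(entry))      # each entry maps one offset to one name
            raw = str(entry[offset])
            symbol = raw.split(" ")[-1]     # compare only the last space-separated part
            if symbol == "":
                entry[offset] = "GUESS ''"
            elif symbol not in known_symbols:
                entry[offset] = "GUESS " + raw
    return file_list


# Made-up sample data, not taken from data/file_list.yml.
sample = {
    "Example/Foo.o": {
        ".text": [
            {0x1000: "_ZN3Foo4initEv"},    # present in the symbol list: left unchanged
            {0x1040: "_ZN3Foo6updateEv"},  # absent from the symbol list: tagged
            {0x1080: ""},                  # no symbol at all: gets the placeholder
        ]
    }
}
known = {"_ZN3Foo4initEv"}

result = tag_guesses(sample, known)
for entry in result["Example/Foo.o"][".text"]:
    print(entry)
```

Note that this sketch is not idempotent: running it a second time on already tagged data would prefix `GUESS` again, so it is meant to be applied once to an untagged file list.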


MonsterDruide1, Apr 13 '25 22:04

I guess that wouldn't help, yeah. `GUESS` just seems wrong when we know the symbol, since it's not a guess. I didn't realize this when we had the poll for the name of the label; if I had, I would've voted `NOSYM`. Also, I think having a `GUESS` label on functions that we haven't added a symbol for is confusing, since it makes you think "What are we guessing here? The existence of the function?". I think an empty symbol is already enough to tell you that a function doesn't have a symbol, and the `GUESS` label should only be added to the ones where we have added a proper guessed symbol.

LynxDev2, Apr 15 '25 16:04

If you want to revisit the discussion about the meaning of the tags, please do so in the appropriate Discord thread. Otherwise, I guess we're stuck in discussion here.

MonsterDruide1, Apr 16 '25 21:04