ghidra-allegrex

Binary with debug symbols is oddly imported

Open Nemoumbra opened this issue 1 year ago • 10 comments

The game "Yu-Gi-Oh! Duel Monsters GX: Tag Force" (goes by ULJM-05151) was NOT stripped of the debug info before release (quite unusual, if you ask me).

I tried loading this in Ghidra, but it just didn't go well. The symbols were parsed correctly and I can see them in the functions view, but the code is just.... gone. It's the .text section, why is there data defined? Something's extremely odd with that.

image

The section that contains NIDs is totally busted... image

And one more, for good measure... image

Question... Who is here to blame, the Ghidra ELF loader or the plugin? Maybe it's worth reporting to the main app devs?

Nemoumbra avatar Dec 09 '23 16:12 Nemoumbra

I've found another game with debug symbols for testing: Puzzle Bobble Pocket JP. Ghidra's misbehaving there too. And I've also noticed there are these error bookmarks:

image

Nemoumbra avatar Dec 09 '23 18:12 Nemoumbra

I don't think this is the plugin's fault; the same happens when importing as normal MIPS. This can be fixed pretty easily by clearing the auto-defined arrays before doing the analysis, the biggest one being at 0883a898. See #9.

kotcrab avatar Dec 12 '23 22:12 kotcrab

As you can see in the third screenshot, there are multiple purple labels at the address 08804000. The fact that there are mentions of .lib.stub and .lib.ent makes me think something's wrong with the way Ghidra applied the debug info. This can't be fixed by a simple "Ctrl+A, C (clear)" in the .text, can it? As a result, Ghidra decided to name the function there sceKernelRegisterSubIntrHandler. That's complete nonsense, and the only way to fix that would be to rely on PPSSPP's sym files. I don't trust them too much though, because they tend to compute the function bodies incorrectly.

So the question is... what should be fixed for the labels to be created where they're needed? I'm ready to report that to the Ghidra devs if we can come up with an explanation of what's going on.

Nemoumbra avatar Dec 12 '23 22:12 Nemoumbra

It might also just be that those debug symbols are weird, given that the same happens when this file is imported as MIPS. It's either that or some Ghidra issue. What exactly is broken needs deeper investigation. I won't be looking into that, as nothing suggests this is a plugin issue.

kotcrab avatar Dec 13 '23 16:12 kotcrab

The debug symbols being wrong is unlikely: IDA managed to load the files and apply the function names and labels correctly. This is likely a Ghidra issue, so I'm reporting this to the main repo.

Let's keep the issue open here for now.

Nemoumbra avatar Dec 14 '23 18:12 Nemoumbra

I've got an update on the situation... Small recap: there were 2 issues: the data in the code sections and the labels. The related commits by ghidra1 have not been released to the public yet (milestone == 11.1).

I pulled the latest upstream for Ghidra (which was this at the time) to test if it works and I finally managed to build your plugin! This is what I uncovered:

  1. Ghidra stopped covering code with undefined arrays. Not just that, it's got an option now that can fully disable this feature.
  2. The Allegrex plugin still suffers from misplacing the labels.
  3. The MIPS loader properly places the labels for the debug info. Too bad we can't use it due to mismatching instruction sets + it only works if we don't rebase the image.

I think it's no longer a problem in base Ghidra.

Nemoumbra avatar Apr 06 '24 15:04 Nemoumbra

New info: Ghidra finally updated to 11.1. The relevant commits are included there so we can revisit this problem. Just a small note: ghidra1 asked you:

Can you inspect an etype==0xffa0 binary which you believe is not relocatable and check the ElfSymbol values if they are relative or absolute.

and I didn't see you answer. I think they hardcoded the 0xffa0 value in the code (see this). If there are non-relocated binaries with 0xffa0, this may end up being slightly troublesome.
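To make the question concrete, here is a minimal Python sketch of the check being discussed. It reads e_type from a little-endian ELF header and then applies a hypothetical placement rule: symbol values are treated as base-relative only when e_type is 0xffa0. That rule is an assumption based on what this thread suggests, not Ghidra's or the plugin's actual code, and the names ET_PSP_PRX, read_e_type and symbol_address are made up for illustration.

```python
import struct

ET_EXEC = 0x0002     # standard ELF value for a fixed-address executable
ET_PSP_PRX = 0xFFA0  # value discussed in this thread; the constant name is made up here


def read_e_type(header: bytes) -> int:
    """Return e_type from the first bytes of a little-endian ELF file."""
    if len(header) < 18 or header[:4] != b"\x7fELF":
        raise ValueError("not an ELF header")
    # e_type is a 16-bit field right after the 16-byte e_ident array.
    (e_type,) = struct.unpack_from("<H", header, 16)
    return e_type


def symbol_address(e_type: int, st_value: int, image_base: int) -> int:
    """Hypothetical rule: st_value is base-relative only for 0xffa0 binaries."""
    if e_type == ET_PSP_PRX:
        return image_base + st_value
    # Assumption: any other e_type (e.g. 0x2) already carries absolute values.
    return st_value
```

With this rule, a symbol with value 0x910c in a 0xffa0 binary loaded at 0x08804000 would land at 0x0880d10c. A non-relocated 0xffa0 binary would be shifted by the base a second time, which is the case that would be troublesome.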

Nemoumbra avatar Jun 08 '24 16:06 Nemoumbra

Do you know of any game which is not relocatable and has debug symbols? I'm not sure but I guess their question is only relevant in that context so I don't know the answer to it.

kotcrab avatar Jun 08 '24 22:06 kotcrab

Yes, in fact, I do! But you know what the trick is? These games don't have the 0xffa0 for e_type. To be honest, all games built with the Metrowerks CodeWarrior (MW MIPS C Compiler) seemingly are not relocated in this manner and so Ghidra always identifies the base address as 0x08804000 without me having to enter it manually (which I'm forced to do for 0xffa0 games).

Right, the debug symbols AND not relocated... Aces of War (EU). But I think they asked for a game with 0xffa0 that is not relocated.

Nemoumbra avatar Jun 08 '24 22:06 Nemoumbra

I checked my games, the non-relocatable ones have e_type=0x2.

But I think they asked for a game with 0xffa0 that is not relocated.

Yep. Seems like their assumption is okay.

Ghidra always identifies the base address as 0x08804000

Yes, this is taken from the ELF's program header.
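As a rough illustration of where that value lives in the file, the following Python sketch walks the program headers of a 32-bit little-endian ELF and returns p_vaddr of the first PT_LOAD segment. It is not Ghidra's loader code. Using the first loadable segment as the base is an assumption, and the function name first_load_vaddr is made up for this example.

```python
import struct

PT_LOAD = 1  # standard ELF program header type for loadable segments


def first_load_vaddr(elf: bytes) -> int:
    """Return p_vaddr of the first PT_LOAD segment of a 32-bit little-endian ELF."""
    if elf[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    # ELF32 header fields: e_phoff at offset 28, e_phentsize at 42, e_phnum at 44.
    (e_phoff,) = struct.unpack_from("<I", elf, 28)
    e_phentsize, e_phnum = struct.unpack_from("<HH", elf, 42)
    for i in range(e_phnum):
        off = e_phoff + i * e_phentsize
        # Each ELF32 program header starts with p_type, p_offset, p_vaddr.
        p_type, _p_offset, p_vaddr = struct.unpack_from("<III", elf, off)
        if p_type == PT_LOAD:
            return p_vaddr
    raise ValueError("no PT_LOAD segment found")
```

For a non-relocated game this would be expected to give 0x08804000, matching what Ghidra shows without manual input.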

kotcrab avatar Jun 08 '24 22:06 kotcrab

I've found some free time to return to our problem. Our current upstream version for Ghidra is 11.1.2, but they have not added anything particularly relevant to the case since my last comment. I've conducted a few tests with my Ghidra 11.1.1 & Allegrex 11.1 and I'm ready to reveal the results. I used two games with debug symbols: Yu-Gi-Oh! Duel Monsters GX: Tag Force represented the relocated games (e_type == 0xffa0) and Aces of War represented the non-relocated games (e_type == 0x2).


Yu-Gi-Oh! Duel Monsters GX: Tag Force:

  1. Allegrex, apply undefined symbol data: bad labels, undefined data over code. :x:
  2. Allegrex, don't apply undefined symbol data: bad labels, no data over code, but also no data over the data symbols. :x:

The image base doesn't matter here.

Conclusion

The Allegrex plugin incorrectly handles the labels which undermines the whole idea of using the debug symbols as the further analysis is impossible unless we clear all the labels.

  1. MIPS, base = 0x08804000: the labels are correct, no undefined data over code, but Ghidra is unable to process the relocations => it's all SUB_0000910c and func_0x0000910c in the code. :x:
  2. MIPS, base = 0x0: the labels are correct, no undefined data over code, the calls work fine, but Ghidra doesn't understand the Allegrex instruction set + Ghidra mixes up the normal integer values and the pointers. :x:

Switching apply undefined symbol data on/off seemingly didn't affect the results so I'm starting to think it's broken for MIPS here.

Conclusion

The labels are placed like they should be, but there's a reason why we need a separate plugin for the Allegrex architecture. We can't use the default MIPS.


Aces of War (EU):

  1. Allegrex, base = 0x08804000, apply undefined symbol data: good labels, the data is not placed over the code, but it's placed over the actual data. ✔️
  2. Allegrex, base = 0x08804000, don't apply undefined symbol data: same as before, but the data is not placed over the data (just as we asked!). ✔️

Trying the base 0x0 or the MIPS loader is not required here.

Conclusion

The binaries that are not relocated are fully supported by the Allegrex plugin.

Nemoumbra avatar Jul 29 '24 23:07 Nemoumbra

@Nemoumbra I synced Ghidra changes, can you retest with the build from https://github.com/kotcrab/ghidra-allegrex/actions/runs/10644246173?

kotcrab avatar Aug 31 '24 09:08 kotcrab

Actually better if you can test with https://github.com/kotcrab/ghidra-allegrex/actions/runs/10644441954 as I cleaned some old hacky stuff

kotcrab avatar Aug 31 '24 10:08 kotcrab

Actually better if you can test with https://github.com/kotcrab/ghidra-allegrex/actions/runs/10644441954 as I cleaned some old hacky stuff

Yu-Gi-Oh! Duel Monsters GX: Tag Force:

  1. Allegrex, apply undefined symbol data: good labels, the data is not placed over the code, but it's placed over the actual data. ✔️
  2. Allegrex, don't apply undefined symbol data: same as before, but the data is not placed over the data (just as we asked!) ✔️

Trying the base 0x0 or the MIPS loader is not required here.


Aces of War (EU):

No regression detected

Conclusion

The latest commit fully fixes the issue.

Nemoumbra avatar Aug 31 '24 12:08 Nemoumbra

Great, thanks for testing.

kotcrab avatar Aug 31 '24 14:08 kotcrab