Core: Store object name separately for symbols
This PR adds code that checks for extra information following symbol names that is often present in CodeWarrior symbol maps, namely the names of the .a/.o files that symbol belongs to. If present, the symbol name string is truncated to remove the extra text from the symbol name.
@dolphin-emu-bot rebuild
I'm changing this to draft as I think it would be better to also include code to handle the file name instead of just discarding it. I would like to also update the symbol view accordingly to still show file names but that might be better as a separate PR so I'm open to discussions on that.
Ok, ready for review again.
I suspect that this change can break other symbol map formats. Moreover, the object might not necessarily be a .o object file.
IIRC, the stripped symbol name is already available through the Symbol::function_name member. What I could suggest you instead is to provide a GUI toggle that can switch between displaying the stripped function name or the whole name including everything that's following it. It might be useful when working with relocated symbols where some symbols have the same name but different address and object.
Those are fair concerns; I do recall some map files having .c, .s, etc at the end. However, it would be very easy to remedy that issue - all that would need to be done is just to pick the last part of the split string, as it will always be the object file. I would be willing to test it against various map formats if so desired; I think that it would be a shame to just settle for a GUI toggle instead of being able to develop on top of this, perhaps adding the option to filter by object file (something I also want to try in the future if this is accepted). I feel that the debugging features in Dolphin have a lot of room for improvement and while I understand that this definitely needs to be made more robust, I would be willing to put in the necessary work.
Given how large the PR has grown, I suppose it is reasonable that only the object name split changes are included in this PR, and the rest can be considered after.
That being said, I did timing tests, and this is what I got:
Original:
Debug: 0.885 s
Release: 0.146 s
Branch:
Debug: 33.097 s
Release: 1.0899 s
So, needless to say, there is definitely lots of room for improvement.
I decided to essentially revert the regex changes for this PR, and add a few additional checks to handle getting the actual symbol name and object name. The performance is only slightly worse than the current mainline branch:
Debug: 1.507 s
Release: 0.187 s
@dolphin-emu-bot rebuild
@dolphin-emu-bot rebuild
@dolphin-emu-bot rebuild
The leading spaces are intentional.
This leading spaces change will likely break tools designed to load Dolphin generated symbol maps, especially if they don't take into account leading spaces.
There exist symbol maps generated without these leading spaces. However, I can't confirm that these were generated with CW.
I understand there likely are some tools that might break from this, however there also are many tools that account for, or even assume the CodeWarrior symbol map format, which always includes leading spaces before symbols. For example, https://github.com/Cuyler36/Ghidra-GameCube-Loader/ does not support Dolphin symbol maps as is.
This leading spaces change will likely break tools designed to load Dolphin generated symbol maps, especially if they don't take into account leading spaces. There exist symbol maps generated without these leading spaces. However, I can't confirm that these were generated with CW.
I understand there likely are some tools that might break from this, however there also are many tools that account for, or even assume the CodeWarrior symbol map format, which always includes leading spaces before symbols. For example, https://github.com/Cuyler36/Ghidra-GameCube-Loader/ does not support Dolphin symbol maps as is.
IIRC, Dolphin already provides scripts to save/load symbol maps from/to IDA/Ghidra in Tools.
Your PR is changing how symbol maps are saved, which wasn't the initial goal of your PR, nor was it in your initial code.
I'm fine with adding the object info to the Symbol struct. However, if your plan is to load/save CW symbol maps in a strict manner, then you should provide the appropriate functions instead of changing the existing ones. For instance, you can add in the GUI (the load/save dialog or a dedicated menu) another symbol map format support.
Fine, I'll compromise. I would like to discuss things further though later.
@sepalani Any updates?
I'll try with a few maps and see how it looks soon.
It would be nice to warn somehow that existing maps will be formatted, for those who do group collab/diffing - it could get messy if they don't know.
-800ab58c 00000030 800ab58c 0 HAS_Player::StartJumpCommand_AND_Player::TouchHangCommand(?Regular2PWorkingPulley?)_800ab58c_
+800ab58c 000030 800ab58c 0 HAS_Player::StartJumpCommand_AND_Player::TouchHangCommand(?Regular2PWorkingPulley?)_800ab58c_^M
[x] Old map format loads fine, saves in new format [x] .data section preserved as expected for non-functions [x] Total lines matching 1:1 - tested with GUP*.map and RSOs from RSVE8P [x] Test a game with full built-in symbols (Spyro A Hero's Tail)
Edit: If you saw my message about an issue with Spyro, disregard. I made a mistake in my diffing.
Approved with the caveat that I cannot explicitly test the .a/.o detection/storage part of the PR as none of my games (that I know of) have this condition.
My other user made maps and built in maps work as expected.
I don't have anything to test. All my personal symbols were switched over to an extra note-layer in my personal debugger builds, because I like to mark certain instructions on top of symbol marking the functions.