XbSymbolDatabase
Feature: Add Parameter Info into Symbol Group
ergo720 mentioned the ability to store argument info within a function's data. While that is under consideration, note that no two symbol groups can share the same name. One possibility is to work out the order in which the arguments appear in the non-LTCG symbols and follow that order for the LTCG symbols. For example:
non-LTCG:
int example_func(void* arg1, unsigned int arg2, short arg3, int arg4)
LTCG:
int example_func(unsigned int arg2, int arg4)
+ arg1 is stored in ecx and arg3 is stored in dx.
which then can roughly translate into...
non-LTCG:
example_func_16
LTCG:
example_func_8__LTCG_ecx1_dx3
(in the shortest form possible)
In addition, document the purpose of the ecx and dx registers in the comment section for validation. Eventually, we will need to document their arguments or leave them as unknown.
If such a function doesn't exist in non-LTCG, then we may replace the number with Unk.
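To make the naming scheme above concrete, here is a minimal sketch in C. It is not code from XbSymbolDatabase: the `ParamLoc` struct and the `build_symbol_name` function are hypothetical names made up for illustration, and it assumes every stack argument occupies a 4-byte slot, which is what the `_16` in the example suggests.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical description of where one parameter lives. */
typedef struct {
    const char *reg;        /* register name such as "ecx", or NULL if on the stack */
    unsigned    stack_size; /* bytes used on the stack (ignored for register params) */
} ParamLoc;

/*
 * Builds a name such as "example_func_8__LTCG_ecx1_dx3" into `out`.
 * The number after the base name is the total stack size in bytes; each
 * register parameter adds "_<reg><position>", using its 1-based position
 * in the original (non-LTCG) argument order.
 * Returns the length written, or -1 if the buffer is too small.
 */
int build_symbol_name(char *out, size_t out_size, const char *base,
                      const ParamLoc *params, size_t count, int is_ltcg)
{
    assert(out != NULL && base != NULL && out_size > 0);

    unsigned stack_bytes = 0;
    for (size_t i = 0; i < count; i++) {
        if (params[i].reg == NULL) {
            stack_bytes += params[i].stack_size;
        }
    }

    int n = snprintf(out, out_size, "%s_%u%s", base, stack_bytes,
                     is_ltcg ? "__LTCG" : "");
    if (n < 0 || (size_t)n >= out_size) {
        return -1;
    }
    size_t used = (size_t)n;

    /* Only LTCG symbols carry the register suffixes. */
    for (size_t i = 0; is_ltcg && i < count; i++) {
        if (params[i].reg == NULL) {
            continue;
        }
        n = snprintf(out + used, out_size - used, "_%s%u",
                     params[i].reg, (unsigned)(i + 1));
        if (n < 0 || (size_t)n >= out_size - used) {
            return -1;
        }
        used += (size_t)n;
    }
    return (int)used;
}
```

Describing `example_func` as `{ecx, stack, dx, stack}` with `is_ltcg` set should give `example_func_8__LTCG_ecx1_dx3`, while four stack parameters without the LTCG flag should give `example_func_16`.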
Finally, we could extend this and pass the information up to third-party users, which may give a better understanding of both non-LTCG and LTCG symbols. Note that this method will consume some extra space.
However, this is still under consideration and may need further discussion.
NOTE: 1) Currently, we only iterate every OOVPA signature into one list per library rather than per symbol group. 2) We will likely use extended function APIs to obtain this information and pass it up to third-party users.
Once pull request #149 is merged, we will need some concept of how to perform this task.
I'm working on ideas for how to implement this both as a symbol name string and maybe as some sort of extended API that can return where each argument is stored without using strings, along with a name string for what each argument is.
The draft concept code I can think of right now is:
SYMBOL(symbol_name, ...)
SYMBOL_LTCG(symbol_name, ...)
// Used to generate the symbol's name suffix and for extended API usage.
ARGS(...)
// type = enum if we plan to extend the API
// name = string if we plan to extend the API; otherwise it is cosmetic only and unused.
ARG(type, name)
And the example we could ideally expect is:
non-LTCG:
SYMBOL(D3DDevice_LoadVertexShader,
ARGS(
ARG(1, Handle),
ARG(2, Address)
)
)
which will generate the symbol name D3DDevice_LoadVertexShader.
LTCG:
SYMBOL_LTCG(D3DDevice_LoadVertexShader,
ARGS(
ARG(arg, Handle),
ARG(arg, Address)
)
)
which will generate D3DDevice_LoadVertexShader__LTCG.
LTCG altered:
SYMBOL_LTCG(D3DDevice_LoadVertexShader_0,
ARGS(
ARG(eax, Handle),
ARG(ecx, Address)
)
)
which will generate D3DDevice_LoadVertexShader_0__LTCG_eax1_ecx2.
The _0 suffix must be entered manually.
Now for the next issue: the macro preprocessor cannot tell at an earlier stage which arg inputs are not stored in registers. For that, we need some sort of compromise.
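One possible compromise, offered only as a sketch and not as the project's actual design: let the macros record each argument's location in a plain data table, and build the name suffix at run time, where skipping stack arguments is trivial. Everything below is an assumption made for illustration: the `ArgLoc` enum, the `ArgInfo` and `SymbolInfo` structs, the `SYMBOL_IMPL` helper, and `symbol_full_name`. For simplicity, stack arguments use `arg` in both the LTCG and non-LTCG forms.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical storage locations; LOC_arg means "passed on the stack". */
typedef enum { LOC_arg, LOC_eax, LOC_ecx, LOC_edx } ArgLoc;

typedef struct {
    ArgLoc      loc;  /* where the argument is stored */
    const char *name; /* descriptive name, e.g. "Handle" */
} ArgInfo;

typedef struct {
    const char    *name;
    int            is_ltcg;
    const ArgInfo *args;
    size_t         arg_count;
} SymbolInfo;

/* Printable names, in the same order as the ArgLoc enum. */
static const char *const loc_names[] = { "arg", "eax", "ecx", "edx" };

#define ARG(loc, name) { LOC_##loc, #name }
#define ARGS(...)      { __VA_ARGS__ }

/* Variadic so that the commas inside the expanded ARGS(...) list survive. */
#define SYMBOL_IMPL(sym, ltcg, ...)                                  \
    static const ArgInfo sym##_args[] = __VA_ARGS__;                 \
    static const SymbolInfo sym##_info = {                           \
        #sym, ltcg, sym##_args,                                      \
        sizeof(sym##_args) / sizeof(sym##_args[0]) }
#define SYMBOL(sym, ...)      SYMBOL_IMPL(sym, 0, __VA_ARGS__)
#define SYMBOL_LTCG(sym, ...) SYMBOL_IMPL(sym, 1, __VA_ARGS__)

SYMBOL(D3DDevice_LoadVertexShader,
       ARGS(ARG(arg, Handle), ARG(arg, Address)));

SYMBOL_LTCG(D3DDevice_LoadVertexShader_0,
            ARGS(ARG(eax, Handle), ARG(ecx, Address)));

/*
 * Writes the full symbol name into `out`: the base name, "__LTCG" for
 * LTCG symbols, then "_<reg><position>" for each register argument.
 * Returns the length written, or -1 if the buffer is too small.
 */
int symbol_full_name(const SymbolInfo *sym, char *out, size_t out_size)
{
    int n = snprintf(out, out_size, "%s%s", sym->name,
                     sym->is_ltcg ? "__LTCG" : "");
    if (n < 0 || (size_t)n >= out_size) {
        return -1;
    }
    size_t used = (size_t)n;

    for (size_t i = 0; i < sym->arg_count; i++) {
        if (sym->args[i].loc == LOC_arg) {
            continue; /* stack arguments add nothing to the suffix */
        }
        n = snprintf(out + used, out_size - used, "_%s%u",
                     loc_names[sym->args[i].loc], (unsigned)(i + 1));
        if (n < 0 || (size_t)n >= out_size - used) {
            return -1;
        }
        used += (size_t)n;
    }
    return (int)used;
}
```

With this layout, a third-party tool could read `D3DDevice_LoadVertexShader_0_info.args[i].loc` directly as an enum instead of parsing the name string, and the `_0` part still has to be typed by hand as part of the symbol name.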
Any traction on this? I maintain a tool for reversing Xbox binaries (https://github.com/xclusivor/binaryninja-xbe) and parse the output of your tool for symbol recovery. Return types, argument types and variable names would be very useful.
Apologies for the delay on this feature. I haven't started coding it into the project yet. However, I plan to start working on it in June, in baby steps, until it can generate printed output on the console.
The reason I delayed this feature is that there were other symbol signatures I wanted to upstream first. However, those have some complications because of how complex the assembly functions of these symbols are.
No worries at all. I'd like to contribute to this if possible. I have a few questions. Do you have Discord? I go by my GitHub username in the Cxbx-Reloaded channel.
Here's an update:
- I have successfully upgraded the symbol group files to include stack/parameters support.
- Because some symbols have a large number of parameters, the OOVPA signature revisions had to go on the next line. The good news is that clang-format isn't complaining about this.
- Some symbols later introduced or removed parameters, which causes complications because the current OOVPA signatures are missing revisions for the added/removed parameters. (For now, these symbols will have their initial parameters with a TODO comment marking them as needing a fix.)
- All LTCG symbols have been updated to use the current standardized stack and register references.
- Since there have been several rewrites in the codebase and some fixes went into the wrong commits, it will take longer to clean them up.
After some consideration and cleaning up the commits, I'm determined to have a new branch upstream which will remain a work in progress until every symbol function has its parameters filled in.