Created script for automatically generating function boundaries
Using the decompiled code exported from IDA as HTML, plus the switch addresses that cause errors in XenonRecomp, I wrote a script that reproduces 39/42 of the function boundaries in UnleashedRecomp. To get those addresses, I deleted every line in SWA.toml after 'setjmp_address' (I had to keep the invalid address block, otherwise XenonRecomp wouldn't run), then ran XenonRecomp and saved the CLI output to a file. Below is the output of the script:
functions = [
{ address = 0x830B7DD0, size = 0x74 },
{ address = 0x82F098C0, size = 0x19C },
{ address = 0x826ABB70, size = 0x70 },
{ address = 0x8319ED58, size = 0x98 },
{ address = 0x82456DC8, size = 0xD4 },
{ address = 0x82DE36A8, size = 0x5C },
{ address = 0x82F852A0, size = 0xCC },
{ address = 0x82C980E8, size = 0x110 },
{ address = 0x82DE38A0, size = 0x16C },
{ address = 0x82EF5C38, size = 0x64 },
{ address = 0x82F1D668, size = 0x1E8 },
{ address = 0x82EE2D08, size = 0x154 },
{ address = 0x82F08730, size = 0x2B0 },
{ address = 0x82455E70, size = 0x84 },
{ address = 0x82E97E50, size = 0x84 },
{ address = 0x831530C8, size = 0x258 },
{ address = 0x82F13980, size = 0xF4 },
{ address = 0x82DE3708, size = 0x198 },
{ address = 0x82893088, size = 0x45C },
{ address = 0x831539E0, size = 0xD0 },
{ address = 0x82C49540, size = 0x114 },
{ address = 0x82E86770, size = 0x98 },
{ address = 0x83180700, size = 0x74 },
{ address = 0x83168F18, size = 0x254 },
{ address = 0x830DADA0, size = 0x150 },
{ address = 0x82DE3640, size = 0x64 },
{ address = 0x82F25FD8, size = 0x240 },
{ address = 0x82D9AC08, size = 0x78 },
{ address = 0x831487D0, size = 0xD4 },
{ address = 0x83168940, size = 0x100 },
{ address = 0x82CF7080, size = 0x80 },
{ address = 0x8317CD30, size = 0x50 },
{ address = 0x83168B70, size = 0x128 },
{ address = 0x82EF5D78, size = 0x3F8 },
{ address = 0x82DE35D8, size = 0x68 },
{ address = 0x83168A48, size = 0x11C },
{ address = 0x824E7EF0, size = 0x98 },
{ address = 0x8316C678, size = 0x78 },
{ address = 0x82F22908, size = 0x20C }
]
I verified these functions were correct using sort and diff in the Linux terminal. The only difference is the order, and that it is missing the following three functions:
{ address = 0x824E7F28, size = 0x60 }
{ address = 0x8305D168, size = 0x278 }
{ address = 0x831B0BA0, size = 0xA0 }
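The sort-and-diff check described above can also be sketched in Python. This is only an illustration, not part of the script from this PR: the helper names `parse_functions` and `compare` are made up here, and the input is assumed to be text containing entries in the `{ address = ..., size = ... }` form shown above.

```python
import re

# Matches entries such as "{ address = 0x830B7DD0, size = 0x74 }".
ENTRY_RE = re.compile(
    r"address\s*=\s*(0x[0-9A-Fa-f]+)\s*,\s*size\s*=\s*(0x[0-9A-Fa-f]+)"
)


def parse_functions(text):
    """Return the set of (address, size) pairs found in TOML-style text."""
    return {(int(addr, 16), int(size, 16)) for addr, size in ENTRY_RE.findall(text)}


def compare(generated_text, reference_text):
    """Return (missing, extra) relative to the reference, sorted by address."""
    generated = parse_functions(generated_text)
    reference = parse_functions(reference_text)
    return sorted(reference - generated), sorted(generated - reference)


if __name__ == "__main__":
    generated = "{ address = 0x830B7DD0, size = 0x74 },\n{ address = 0x82F098C0, size = 0x19C }"
    reference = generated + ",\n{ address = 0x824E7F28, size = 0x60 }"
    missing, extra = compare(generated, reference)
    for address, size in missing:
        print(f"missing: {{ address = 0x{address:08X}, size = 0x{size:X} }}")
```

Because the entries are compared as sets, the order in which the script emits them does not matter, which matches the "only difference is the order" observation.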
Did you try to run it as an IDAPython script? If so, I think you're supposed to run it as a regular Python script outside of IDA Pro. You take the log from running XenonRecomp and put it in a text file, use IDA Pro to make an HTML of default.xex, and then run: python Auto_Function_Parser.py [path to the HTML file] [path to the XenonRecomp log file] [name of output file].toml
Edit: I was correct, it worked for the most part. It only missed 2 function boundaries for Destroy All Humans! Path of the Furon, one of which was easy for me to find. The other has been a bit wonky, so I wasn't surprised it couldn't find it.
I got hit by these errors when trying to run it:
F:\Xenon\Recompilation\game\Auto_Function_Parser.py:134: SyntaxWarning: invalid escape sequence '\.'
elif re.search('^\.text:'+curr_addr+' </span><span class="c[0-9]*">loc_'+curr_addr, line):
F:\Xenon\Recompilation\game\Auto_Function_Parser.py:166: SyntaxWarning: invalid escape sequence '\.'
elif num_functs > 0 and re.search('<span class="c[0-9]*">\.long </span><span class="c[0-9]*">0$', line):
F:\Xenon\Recompilation\game\Auto_Function_Parser.py:201: SyntaxWarning: invalid escape sequence '\.'
if re.search('<span class="c[0-9]*">\.section "\.text"', line) != None:
I didn't test the script in Windows, can you try replacing every instance of \. with . and see if that works?
EDIT: I think your issue might be related to this, https://stackoverflow.com/questions/52335970/how-to-fix-syntaxwarning-invalid-escape-sequence-in-python I tested my script with Python 3.11, try using an older version of Python without changing the script.
EDIT2: I pushed an update of the script tested to work with Python 3.12, let me know if this worked
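For context, Python 3.12 reports unrecognized escapes such as `\.` in ordinary string literals as a SyntaxWarning, while earlier versions only raised a DeprecationWarning that is hidden by default. One common way to silence it without changing what the regex matches is to use raw strings. The sketch below applies that to two of the patterns quoted in the warnings; the function names are hypothetical, and this is not necessarily what the updated script does.

```python
import re


def is_loc_label(line, curr_addr):
    """Raw-string form of the pattern quoted in the warning for line 134."""
    pattern = (
        r'^\.text:' + curr_addr
        + r' </span><span class="c[0-9]*">loc_' + curr_addr
    )
    return re.search(pattern, line) is not None


def is_text_section(line):
    """Raw-string form of the pattern quoted in the warning for line 201."""
    return re.search(r'<span class="c[0-9]*">\.section "\.text"', line) is not None


if __name__ == "__main__":
    # Hypothetical sample line, shaped only to match the pattern above.
    sample = '.text:82000000 </span><span class="c7">loc_82000000'
    print(is_loc_label(sample, "82000000"))
```

Keeping the backslash (inside a raw string) means the dot is still matched literally, whereas replacing `\.` with a bare `.` would let it match any character.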
That did the trick, it runs without errors now.
Edit 1: I was using an older version of the parser script without noticing. The current version, from 11 hours ago as of typing this, seems to be working, though I'm still waiting for it to finish with the HTML file, since that is 1 GB in size for whatever reason.
Edit 2: For whatever reason it came out as this: "Parsing XenonRecomp log... Parsing IDA HTML... Searching for needed functions... 0 functions found! Outputting to formatted file..."
I honestly don't know why it didn't find the needed functions, given that they were pretty much in the IDA HTML and are also listed in the XenonRecomp log.
I tried doing this with the Open Season video game as an example, and I'm stuck on this error, which points to line 122:
"H:\Documents\TimberlineRecompiled>parser.py default.xex.html default.xex.txt OSGAME_NEW.toml
Parsing XenonRecomp log...
Parsing IDA HTML...
Traceback (most recent call last):
File "H:\Documents\TimberlineRecompiled\parser.py", line 122, in <module>
if not compare_xref_addr(line, functs[num_functs-1][0]):
~~~~~~^^^^^^^^^^^^^^
IndexError: list index out of range"
compare_xref_addr isn't a function in the newest version of the script, nor is it referred to on line 122. Try deleting your version and downloading it from https://raw.githubusercontent.com/hedge-dev/XenonRecomp/8fc280bed99903d7bfaf1003e18cfec0c627141d/Auto_Function_Parser.py
Would it be possible to integrate your detection here to XenonAnalyse in some way?
I deleted my copy and re-downloaded the script from that link, but had the same result. I also updated to a different fork of XenonRecomp to try to deal with other issues, though not much changed, to be honest.
This is the result as before.
"Parsing XenonRecomp log... Parsing IDA HTML... Searching for needed functions... 0 functions found! Outputting to formatted file..."
The fork is from this https://github.com/hedge-dev/XenonRecomp/pull/22
Yeah, integrating it into XenonAnalyse should be possible, although parsing IDA output is a bit easier than plain decompilation, because most subroutine headers tell you what references them, so you don't have to look through every line for references and save them. I took the path of least resistance so I could churn this out quickly (hence using Python).
I used this script with Ninja Gaiden 2, using a combination of the simde, Bakugan, and NG2 forks, and it found 155 functions on the first pass and 12 on the second, so I don't think the fork is the issue. Are you using IDA Pro 9.0 SP1 to create the HTML? If so, can you try adding "print(switch_addrs)" to line 49 and "print(functs)" to line 203 to see if either of those lists is empty?
I feel a lot of people's issues come from the expectation of using IDA Pro 9.0 SP1, which most people aren't aware of how to get without paying an absurd amount for it.
Yes, I do realize the issue with that, but on the other hand, I don't think anyone can make substantial progress on a recomp port if they don't have it.
Or at least not in a short time. I wonder if it can work with Ghidra.
The only IDAPro 9.0 I have is a leaked build from August 2024, though it should be working relatively the same, so I don't know what's going on here tbh. I can try to work on this a little further if I can.
Ghidra has the big problem that it lacks a lot of the special instructions used on the Xenon processor, making reverse engineering very difficult.
This is fair. I wish it had a better plugin for XEX files, like IDA Pro does.
Sorry, could you clarify which CLI output file you are referring to?
In a Linux terminal when you run XenonRecomp, append " > out.txt" to the command so it outputs all the stuff it would print to the terminal to a file instead
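For anyone who would rather not depend on shell redirection, the same capture can be sketched in Python. This is an assumption-laden illustration: `run_and_save` and the script name in the usage comment are hypothetical, and the XenonRecomp command line itself is not shown here, so substitute whatever invocation you already use.

```python
import subprocess
import sys


def run_and_save(command, log_path):
    """Run a command and write its standard output to log_path as UTF-8."""
    result = subprocess.run(
        command,
        stdout=subprocess.PIPE,   # capture what would be printed to the terminal
        encoding="utf-8",
        errors="replace",         # do not fail on bytes that are not valid UTF-8
    )
    with open(log_path, "w", encoding="utf-8") as log_file:
        log_file.write(result.stdout)
    return result.returncode


if __name__ == "__main__":
    # Usage (hypothetical script name):
    #   python save_log.py out.txt <command> [args...]
    if len(sys.argv) >= 3:
        sys.exit(run_and_save(sys.argv[2:], sys.argv[1]))
```

Like `> out.txt`, this captures standard output only, and it always writes the log as UTF-8 regardless of the shell in use.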
Ok, so this script won't work in windows?
The tool managed to find 50% of the functions that XenonRecomp considers to be wrong, but for some reason it couldn't find the others. When I run the tool again, this time using the new log with the rest of the missing functions, it gives me half of the ones that had already been found before.
You can try copy pasting the output of XenonRecomp from the Windows prompt into a text file but I haven't verified that
That's because the leftover 50% are functions nested in the ones found by the script, and the script doesn't handle that
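A quick way to check whether a leftover function is nested inside one the script already found is to test whether its start address falls inside another entry's address range. The sketch below is only an illustration with a hypothetical helper name, `find_nested`; it is not part of the script. The sample data comes from the first post in this thread, where the missed entry at 0x824E7F28 starts inside the found entry at 0x824E7EF0 (size 0x98) and both end at 0x824E7F88.

```python
def find_nested(functions):
    """Return (inner, outer) pairs where inner starts inside outer's range.

    Each function is an (address, size) tuple.
    """
    nested = []
    for inner in functions:
        for outer in functions:
            if inner == outer:
                continue
            outer_start, outer_size = outer
            if outer_start <= inner[0] < outer_start + outer_size:
                nested.append((inner, outer))
    return nested


if __name__ == "__main__":
    functions = [
        (0x824E7EF0, 0x98),   # found by the script
        (0x824E7F28, 0x60),   # listed as missing in the first post
        (0x8305D168, 0x278),  # listed as missing in the first post
    ]
    for inner, outer in find_nested(functions):
        print(f"0x{inner[0]:08X} is nested in 0x{outer[0]:08X}")
```

This only shows that the one pair overlaps; it says nothing about why the other missed entries were not found.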
What should Xenon recomp be outputting?
When you look at the command window, it should just say what percentage has been recompiled. If it says that an address is going out of bounds from another address or something similar, it means you need to add function boundaries to your config (which is what this script should do). If it says it found a jump table at an address but the switch table file you made with XenonAnalyse doesn't exist, it means you need to go to that address and find its jump table. And if it says something about an unrecognized instruction, you have to edit recompiler.cpp, add those instructions with the right C++ code, and rebuild XenonRecomp.
Ok, I'm not getting any of that when I run the recomp. I also only get part of the jump table; it won't give me any computed or offset values. I have changed the instructions in main.cpp, but it doesn't detect anything else. There are some instructions present that aren't in Unleashed Recompiled. Does something else need to be done to make them work? The instructions are subi and ori.
I think I'm nearly there, but I'm getting this error.
C:\Users\Nevan\Desktop\function test>python Auto_Function_Parser.py IDA.html XenonRecomp_Log.txt Output.toml
Parsing XenonRecomp log...
Parsing IDA HTML...
Searching for needed functions...
Traceback (most recent call last):
File "C:\Users\Nevan\Desktop\function test\Auto_Function_Parser.py", line 225, in
How can I fix it?
Tested TGM ACE TU1 with IDA Pro 9.1; it ran without issue and found 41/42 functions. This greatly helped with the process, nice work!
Script output:
functions = [
{ address = 0x8229BBB8, size = 0x78 },
{ address = 0x82333720, size = 0xB4 },
{ address = 0x821511C0, size = 0xE4 },
{ address = 0x82326AA0, size = 0x494 },
{ address = 0x820F0D50, size = 0x2BC },
{ address = 0x821068C0, size = 0x1E4 },
{ address = 0x821AD620, size = 0xA8 },
{ address = 0x82077440, size = 0x10C },
{ address = 0x822EF668, size = 0x198 },
{ address = 0x8208ABE8, size = 0x2C4 },
{ address = 0x822EF9D8, size = 0x108 },
{ address = 0x8209F108, size = 0x78 },
{ address = 0x822F1C88, size = 0x104 },
{ address = 0x8230D700, size = 0x1F8 },
{ address = 0x82073388, size = 0x774 },
{ address = 0x82090438, size = 0x148 },
{ address = 0x82075FC0, size = 0x78 },
{ address = 0x821696D8, size = 0xB0 },
{ address = 0x820886D8, size = 0x280 },
{ address = 0x822EF2F0, size = 0x64 },
{ address = 0x8210CFC0, size = 0x9C },
{ address = 0x820F1CD0, size = 0x154 },
{ address = 0x822EF0A8, size = 0xB8 },
{ address = 0x8223FA58, size = 0x7C },
{ address = 0x8215FDB0, size = 0xA4 },
{ address = 0x822EF160, size = 0x80 },
{ address = 0x82117218, size = 0x140 },
{ address = 0x82077D08, size = 0xD4 },
{ address = 0x8209E368, size = 0x60 },
{ address = 0x822EF580, size = 0x58 },
{ address = 0x820907F8, size = 0xC8 },
{ address = 0x8214B588, size = 0x10C },
{ address = 0x82184140, size = 0x618 },
{ address = 0x822EF1E0, size = 0x7C },
{ address = 0x822EF288, size = 0x68 },
{ address = 0x821F75D0, size = 0x148 },
{ address = 0x821A7C10, size = 0x114 },
{ address = 0x822EF800, size = 0x16C },
{ address = 0x822EEFE8, size = 0xBC },
{ address = 0x82077550, size = 0x154 },
{ address = 0x822EF5D8, size = 0x5C }
]
Missed function: { address = 0x82184658, size = 0x100 }
Hi, after parsing the .html I got 0 functions in my output, only: "functions = ]". Could someone give me some help?
I was having issues with the "0 functions found!" thing. If it helps anyone: make sure your XenonRecomp text dump uses UTF-8 and not any other encoding. For some reason (at least for me), whenever I use ">" to redirect cmd output to a text file, it uses UTF-16. It seems that this script doesn't read the dump correctly if it uses any encoding other than UTF-8. I used Notepad++ to change it.
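The same conversion can be done without an editor. The sketch below is a hedged illustration with hypothetical helper names (`read_log_text`, `convert_to_utf8`); it assumes a UTF-16 log starts with a byte-order mark, which is how UTF-16 text files are commonly written on Windows, and treats everything else as UTF-8.

```python
import codecs


def read_log_text(path):
    """Read a log file, decoding UTF-16 or UTF-8 based on its byte-order mark."""
    with open(path, "rb") as handle:
        raw = handle.read()
    if raw.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        return raw.decode("utf-16")   # the utf-16 codec consumes the BOM
    return raw.decode("utf-8-sig")    # also strips a UTF-8 BOM if present


def convert_to_utf8(src_path, dst_path):
    """Rewrite src_path as BOM-less UTF-8 at dst_path."""
    text = read_log_text(src_path)
    # newline="" keeps the original line endings instead of translating them.
    with open(dst_path, "w", encoding="utf-8", newline="") as handle:
        handle.write(text)


if __name__ == "__main__":
    import sys

    # Usage (hypothetical script name): python to_utf8.py log.txt log_utf8.txt
    if len(sys.argv) == 3:
        convert_to_utf8(sys.argv[1], sys.argv[2])
```

The converted file can then be passed to the parser in place of the original log.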