OpenTESArena
[Future Enhancement] Fan translations support (and improved .EXE unpacking)
Hello.
Ok, this is a low priority thing, but I want to share that fan translations exist, and that at least for now, at least the Spanish translation doesn't work with OpenTESArena. When you launch it with a translated Arena game, you get this, and it immediately closes (it's hard to capture that window).
It works perfectly fine with the vanilla game in DOSBox (used with the GOG/CD version).
The Spanish fan translation can be downloaded here: http://traducciones.clandlan.net/index.php?page=download&file=AS/TESArenav2.2.1.7z (the site is in Spanish; to download, type the letters into the box in the center and press the "¡DESCARGA!" button). You can then look at the differences that make it work with vanilla but not with OpenTESArena (although I suppose you will have an idea or two without seeing anything, knowing how it works).
I'm writing this only to point it out, to make these translations known, and to help the devs make the right decisions beforehand to support them, in the hope that they can be used with OpenTESArena too once the port takes more shape.
Thanks in advance.
I haven't tested it but I imagine if you copy everything except A.EXE/ACD.EXE then it would work. That translation seems to have been made by unpacking the original executable, modifying strings in it, and repacking it. This would cause A.EXE/ACD.EXE to have differences in the binary data, and OpenTESArena's executable unpacker doesn't seem to be compatible.
It's good to know this is an issue. I think it mostly means that ExeUnpacker.cpp isn't flexible enough for what the PKLITE specification allows.
Additionally, there are files that come with OpenTESArena for reading data from A.EXE/ACD.EXE. These only work with specific versions of the game and are a bit labor-intensive to make, which means that fan modifications of those executables will need their own acdExeStrings.txt and/or aExeStrings.txt. It would be better if OpenTESArena had its own localization format with key-value pairs for every string in the game, but that is still too far in the future to talk about, I think.
Actually, the A.EXE and ACD.EXE don't seem to be compressed with PKLITE at all, they're just regular DOS executables. Not sure yet if this is something the engine could conditionally handle by looking at bytes in the executable, but it still would need custom aExeStrings.txt and acdExeStrings.txt.
I didn't even know that the original Arena executables are actually compressed files. That's interesting, because maybe the translators uncompressed the data to change it and then rebuilt the EXE so that it always works with the uncompressed data, making things easier to change.
The only thing I can say is that the translation works fine, and it was in the works at least until 2013.
As you explain, maybe OpenTESArena could accept both compressed and uncompressed EXEs instead of expecting only the original compressed versions. Maybe those lines could be translated in the port itself instead of being extracted from the EXE, to get around the issue. Or, looking at aExeStrings.txt and acdExeStrings.txt, they could be generated to match the translation and that's it, although I find it strange that something like that can't be auto-generated, because in this kind of translation (translating the EXE directly) the text data normally sits at fixed positions in the executable and uses the same spacing, size and delimiters as the original, or it would not work at all. Maybe this one is different, idk.
Anyway, this is something a bit far in the future, but it's good to know. There are surely more translations out there, and in that case, knowing how they are made can be helpful for the topic.
Thanks!
Not sure yet if this is something the engine could conditionally handle by looking at bytes in the executable
The packed A.EXE from 1.06 and ACD.EXE from 1.07 both have "PKLITE Copr. 1990-91 PKWARE Inc. All Rights Reserved" near the top of the file, while the unpacked A.EXE does not have it (I haven't checked the unpacked ACD.EXE but I assume it would be the same). So maybe using that would work.
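A minimal sketch of that idea, in case it helps: scan the start of the file for the PKLITE banner and only run the unpacker when it is found. The function name looksPklitePacked and the 256-byte search window are assumptions made for illustration, not anything in OpenTESArena.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <string_view>
#include <vector>

// Hypothetical helper (not part of OpenTESArena): returns true if the PKLITE
// copyright banner appears within the first `searchLimit` bytes of the file.
// The 256-byte default is an assumed value for "near the top of the file".
inline bool looksPklitePacked(const std::vector<uint8_t> &exeBytes, std::size_t searchLimit = 256)
{
    constexpr std::string_view banner = "PKLITE Copr.";
    const std::size_t limit = std::min(searchLimit, exeBytes.size());
    const auto begin = exeBytes.begin();
    const auto end = begin + static_cast<std::ptrdiff_t>(limit);

    // std::search returns `end` when the banner is not found in [begin, end).
    return std::search(begin, end, banner.begin(), banner.end()) != end;
}
```

A caller could then pass the raw bytes straight through when this returns false, and decompress them otherwise.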
I didn't even know that the original Arena executables are actually compressed files. That's interesting, because maybe the translators uncompressed the data to change it and then rebuilt the EXE so that it always works with the uncompressed data, making things easier to change.
Looking at the A.EXE file in the disk directory from that download, it looks like the translated text is at the same addresses in the file as in the unpacked English A.EXE (version 1.06), so I would guess they edited the text within the unpacked file while keeping the start of each string at the same location as in the original file.
There were also several bytes in the file outside of the translated text that differed, though.
it still would need custom aExeStrings.txt and acdExeStrings.txt.
Since the translated strings seem to all be at the same locations, wouldn't these files be fine as is? Is the problem maybe just that OpenTESArena expects A.EXE/ACD.EXE in its packed form and won't load an unpacked form?
I tried packing the translated ACD.EXE from the download with PKLite v1.12 and running it with OpenTESArena, but it fails with Invalid last compressed word "0x74". It starts successfully from DOSBox-X.
Same thing happens with the translated A.EXE from the download. I packed it with PKLite v1.12 and it starts successfully in DOSBox-X. In OpenTESArena it fails with Invalid last compressed word "0x70".
Edit: Of course, to solve the issue of translation support, skipping the unpacking process for A.EXE/ACD.EXE when they don't need it may be enough. But I was curious if packing the translated files would cause them to work with OpenTESArena.
Hmm. At https://github.com/afritz1/OpenTESArena/blob/main/docs/pklite_specification.md it says
If l is the length of the executable in bytes, then the compressed data is stored from byte at position 0x2F0 up until l - 8 within the executable. The compressed data should end with 0xFFFF.
and OpenTESArena checks for 0xFFFF, showing an "Invalid last compressed word" error if it isn't there. The original A.EXE has 0xFFFF in the right place. Here are the final 0x14 (20) bytes of the file.
B2 6E B6 6E BA 6E C2 70 C6 70 FF FF 57 4A 80 00 00 00 00 00
But the translated A.EXE that I packed with PKLite 1.12 has these final 0x14 (20) bytes.
6E B2 6E B6 6E BA 6E C2 70 C6 70 00 57 4A 7C 00 00 00 00 00
Where the original A.EXE has 0xFFFF, the one that was produced from running PKLite 1.12 on the translated A.EXE has 0x00. As I wrote above, this file does successfully start (testing with DOSBox-X).
So is this check for 0xFFFF incorrect or unnecessary?
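To make the comparison concrete, here is a small sketch that applies that rule to the two tails quoted above. It assumes the check reads the little-endian 16-bit word immediately before the final 8 bytes, which is how I read the specification. readLE16 and lastCompressedWord are illustrative names, not OpenTESArena functions.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Reads a little-endian 16-bit value at `index` (caller guarantees bounds).
inline uint16_t readLE16(const std::vector<uint8_t> &bytes, std::size_t index)
{
    return static_cast<uint16_t>(bytes[index] | (bytes[index + 1] << 8));
}

// Returns the 16-bit word occupying [size - 10, size - 8), i.e. the last word
// of the compressed data if that data ends at size - 8 (assumption from the
// specification text). Empty if the buffer is too short.
inline std::optional<uint16_t> lastCompressedWord(const std::vector<uint8_t> &bytes)
{
    if (bytes.size() < 10)
    {
        return std::nullopt;
    }

    return readLE16(bytes, bytes.size() - 10);
}

// The two 20-byte tails quoted in this comment.
const std::vector<uint8_t> englishTail = {
    0xB2, 0x6E, 0xB6, 0x6E, 0xBA, 0x6E, 0xC2, 0x70, 0xC6, 0x70,
    0xFF, 0xFF, 0x57, 0x4A, 0x80, 0x00, 0x00, 0x00, 0x00, 0x00 };
const std::vector<uint8_t> spanishTail = {
    0x6E, 0xB2, 0x6E, 0xB6, 0x6E, 0xBA, 0x6E, 0xC2, 0x70, 0xC6,
    0x70, 0x00, 0x57, 0x4A, 0x7C, 0x00, 0x00, 0x00, 0x00, 0x00 };
```

With these inputs, lastCompressedWord(englishTail) yields 0xFFFF and lastCompressedWord(spanishTail) yields 0x0070, which lines up with the "0x70" in the error reported for the repacked A.EXE.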
Another difference between what OpenTESArena assumes for the PKLite 1.12 specification and the files I got from packing the translated .exe files is the start of compressed data.
OpenTESArena assumes byte 0x2F0 (752). In the packed, translated files, the equivalent data (the values are a little different) appear to start at byte 0x300 (768).
English .EXE data starting at 0x2F0: 00 00 B5 8E 36 B8 3B C7
Equivalent Spanish .EXE data starting at 0x300: 00 00 BA 80 3B B4 30 CD
OpenTESArena still won't load the packed Spanish EXEs even if the check for 0xFFFF is removed and the start offset is set to 768, though.
More information: Since I had only tested the packed Spanish EXEs as far as the title screen, I tried taking them in-game just to be sure they work, and they do appear to work properly.
Based on this site http://fileformats.archiveteam.org/wiki/PKLITE, in addition to 1.12, PKLite versions 1.05 and 1.13 should also fit the "1990-91" copyright seen in A.EXE and ACD.EXE. In case the discrepancies were from a different version of PKLite than 1.12 being used, I tried packing the Spanish EXEs with both of these versions as well, but I got the same differences from the original English .EXE files as I did with 1.12.
More information: Another difference: OpenTESArena gets the total decompressed file size with
const uint16_t segment = Bytes::getLE16(compressedEnd);
const uint16_t offset = Bytes::getLE16(compressedEnd + 2);
return (segment * 16) + offset;
In the original packed English A.EXE, the segment is 0x4A75 and the offset is 0x0080. In the Spanish packed A.EXE, the segment is 0x4A75 but the offset is 0x007C.
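Plugging those values into the quoted expression gives the following. This is only a worked arithmetic sketch. decompressedSize is a hypothetical stand-in for the quoted lines, widened to 32 bits so the result is explicit.

```cpp
#include <cstdint>

// Hypothetical stand-in for the quoted expression: (segment * 16) + offset.
constexpr uint32_t decompressedSize(uint16_t segment, uint16_t offset)
{
    return (static_cast<uint32_t>(segment) * 16u) + offset;
}

// English packed A.EXE: 0x4A75 * 16 + 0x0080 = 0x4A7D0 (305104 bytes).
static_assert(decompressedSize(0x4A75, 0x0080) == 0x4A7D0, "English A.EXE size");

// Spanish packed A.EXE: 0x4A75 * 16 + 0x007C = 0x4A7CC (305100 bytes).
static_assert(decompressedSize(0x4A75, 0x007C) == 0x4A7CC, "Spanish A.EXE size");
```

So with the values above, the two computed sizes differ by 4 bytes.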
According to https://www.fileformat.info/format/exe/corion-mz.htm the PKLite version is in the file header. Checking the original A.EXE file it shows itself as version 1.12. It also shows the "extra compression" flag set, which is "only available in PKLite Professional version".
From the PKLite 1.12 documentation:
-e Use Extra Compression Method
(* Option available only in PKLITE Professional version *)
This option is used to produce the smallest executable files. It
uses a slightly different algorithm, which also scrambles the
excutable file. This scrambling makes the executable data more
resistant to disassembly or "reverse engineering" procedures.
After a file is compressed using this method, it cannot be
expanded to match the original executable file. If you attempt
to expand it using the -x option, PKLITE will return a message
stating the file cannot be expanded. This option is ideal for
software developers who wish to distribute their programs in
compressed form.
So maybe that's the reason for the discrepancies. Maybe the PKLite decompression used by OpenTESArena works specifically for extra-compressed EXEs.
Anyway sorry if this was not useful information or if you were already aware of all this.
That appears to have indeed been the reason. When I packed the Spanish A.EXE with PKLite 1.12 Professional using the -e option, the discrepancies in the 0xFFFF check and the starting offset that I mentioned above went away. You might want to amend https://github.com/afritz1/OpenTESArena/blob/main/docs/pklite_specification.md, which currently says "This specification should work with any executable compressed with PKLITE V1.12", to say that it is only for executables that were compressed using the extra compression option.
While OpenTESArena will get past the executable decompression step, it still won't start with the -e packed Spanish A.EXE or the -e packed Spanish ACD.EXE. In both cases it closes with
[Assets/BinaryAssetLibrary.cpp(346)] Initializing binary assets.
[Rendering/Renderer.cpp(75)] Closing.
[src/Main.cpp(25)] Error: Exception: invalid vector subscript
It seems that in ExeUnpacker::init, the while (true) loop never reaches the encryptedByte == 0xFF break condition when running with a -e packed Spanish file (testing with ACD.EXE).
With the -e packed Spanish files, encryptedByte == 0xE6 is reached for ACD.EXE and encryptedByte == 0xF4 is reached for A.EXE.
Changing line
else if (encryptedByte == 0xFF)
to
else if (encryptedByte == 0xFF || encryptedByte == 0xE6 || encryptedByte == 0xF4)
in ExeUnpacker.cpp allows the Spanish -e packed A.EXE and ACD.EXE, as well as the original English ones, to all run in OpenTESArena. I've only done light testing but it seems to run without issue.
Perhaps "If the byte is 0xFF, then Duplication should be aborted, and the decompression process is finished." in https://github.com/afritz1/OpenTESArena/blob/main/docs/pklite_specification.md will also need to be amended, if these other values are also valid.
Thanks for looking into this so much @Allofich! Sounds like ExeUnpacker.cpp has some decent room for improvement.
I still want to get through my rendering branch first but this is really useful information. I think the tl;dr is that ExeUnpacker.cpp just needs to be more data-driven, use more logic, and be less hardcoded to Arena's executables.
This part in the https://www.fileformat.info/format/exe/corion-mz.htm link seems useful:
---PKLITE compressed executable
OFFSET   Count  TYPE  Description
001Ch    1      byte  Minor version number
001Dh    1      byte  Bit mapped :
                      0-3 - major version
                        4 - Extra compression
                        5 - Multi-segment file
001Eh    6      char  ID='PKLITE'
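As a rough illustration of how that layout could drive the unpacker, here is a sketch that reads those three fields from the start of an executable. PkliteHeaderInfo and readPkliteHeader are hypothetical names, and the field meanings are taken only from the table above.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <optional>
#include <vector>

// Hypothetical container for the fields listed in the table above.
struct PkliteHeaderInfo
{
    int majorVersion;      // Bits 0-3 of the byte at 0x1D.
    int minorVersion;      // Byte at 0x1C.
    bool extraCompression; // Bit 4 of the byte at 0x1D.
    bool multiSegment;     // Bit 5 of the byte at 0x1D.
};

// Returns the header fields if the 'PKLITE' ID is present at 0x1E, otherwise
// an empty optional (e.g. for an executable that is not PKLITE-packed).
inline std::optional<PkliteHeaderInfo> readPkliteHeader(const std::vector<uint8_t> &exeBytes)
{
    constexpr std::size_t minorOffset = 0x1C;
    constexpr std::size_t flagsOffset = 0x1D;
    constexpr std::size_t idOffset = 0x1E;
    constexpr char id[] = "PKLITE";
    constexpr std::size_t idLength = sizeof(id) - 1; // Exclude the null terminator.

    if (exeBytes.size() < idOffset + idLength)
    {
        return std::nullopt;
    }

    if (std::memcmp(exeBytes.data() + idOffset, id, idLength) != 0)
    {
        return std::nullopt;
    }

    const uint8_t flags = exeBytes[flagsOffset];
    return PkliteHeaderInfo{
        flags & 0x0F,
        exeBytes[minorOffset],
        (flags & 0x10) != 0,
        (flags & 0x20) != 0 };
}
```

An empty result would suggest skipping decompression entirely, while extraCompression could be used to decide whether the current decompression path applies.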