ghidra
DOS MZ "LE" loader support
Describe the bug
When loading certain MZ executables (in this specific instance, the Quarantine DOS game), Ghidra detects it as a Raw Binary.
Expected behavior
It should be detected as a DOS MZ executable.
Environment (please complete the following information):
- OS: Linux
- Java Version: 11.0
- Ghidra Version: 9.0.2
Sorry you're having issues. Could you provide an ascii hex dump of the first 0x40 bytes of the binary? Please don't attach the actual binary. Ghidra decides if a binary is a DOS/MZ by checking the values of e_magic, e_lfanew, and e_lfarlc from the DOS header. The checks might be incorrect for some cases.
Thanks!
Thanks for looking into this issue. Sure:
Attached it as .log because GitHub doesn't support attaching a .exe. The executable runs fine under DOSBox, by the way.
Hey! I see what's happening here. All WIN/PE's contain an embedded DOS/MZ. GHIDRA determines if a binary is a plain DOS/MZ (vs WIN/PE) by checking several fields in the DOS_HEADER. GHIDRA expects "e_lfanew==0x0", but in your binary the value is 0x2998.
Could you tell me what int value exists at offset 0x2998 in your binary?
Thanks!
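For readers following along, here is a minimal sketch of the kind of check described above. It is not Ghidra's actual code: the function names are hypothetical, the rule is reduced to "e_magic is MZ and e_lfanew is zero", and the sample bytes are transcribed from the header values quoted in this thread.

```python
import struct

# First 0x40 bytes, transcribed from the header values quoted in this thread.
HEADER = (
    bytes.fromhex("4d5a9201150006000600")    # e_magic .. e_cparhdr
    + bytes.fromhex("8600ffff990200080000")  # e_minalloc .. e_csum
    + bytes.fromhex("16020000")              # e_ip, e_cs
    + bytes.fromhex("40000000")              # e_lfarlc, e_ovno
    + bytes(32)                              # e_res, e_oemid, e_oeminfo, e_res2
    + bytes.fromhex("98290000")              # e_lfanew
)


def read_mz_fields(header: bytes) -> dict:
    """Pull out the three fields mentioned in the thread (hypothetical helper)."""
    if len(header) < 0x40:
        raise ValueError("need at least 0x40 bytes")
    (e_lfarlc,) = struct.unpack_from("<H", header, 0x18)
    (e_lfanew,) = struct.unpack_from("<I", header, 0x3C)
    return {"e_magic": header[0:2], "e_lfarlc": e_lfarlc, "e_lfanew": e_lfanew}


def looks_like_plain_mz(header: bytes) -> bool:
    """Simplified assumption: plain MZ means 'MZ' magic and e_lfanew == 0."""
    fields = read_mz_fields(header)
    return fields["e_magic"] == b"MZ" and fields["e_lfanew"] == 0


fields = read_mz_fields(HEADER)
print(hex(fields["e_lfarlc"]), hex(fields["e_lfanew"]), looks_like_plain_mz(HEADER))
```

Under this simplified rule the sample header is rejected as plain MZ because e_lfanew is 0x2998 rather than 0.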
0000:0000 4d 5a 92 IMAGE_DOS_HEADER
0000:0000 4d 5a char[2] "MZ" e_magic
0000:0002 92 01 dw 192h e_cblp
0000:0004 15 00 dw 15h e_cp
0000:0006 06 00 dw 6h e_crlc
0000:0008 06 00 dw 6h e_cparhdr
0000:000a 86 00 dw 86h e_minalloc
0000:000c ff ff dw FFFFh e_maxalloc
0000:000e 99 02 dw 299h e_ss
0000:0010 00 08 dw 800h e_sp
0000:0012 00 00 dw 0h e_csum
0000:0014 16 02 dw 216h e_ip
0000:0016 00 00 dw 0h e_cs
0000:0018 40 00 dw 40h e_lfarlc
0000:001a 00 00 dw 0h e_ovno
0000:001c 00 00 00 00 00 dw[4] e_res[4]
0000:0024 00 00 dw 0h e_oemid
0000:0026 00 00 dw 0h e_oeminfo
0000:0028 00 00 00 00 00 dw[10] e_res2[10]
0000:003c 98 29 00 00 ddw 2998h e_lfanew
The value is 0x4C450000. Maybe it's IMAGE_OS2_SIGNATURE_LE?
Yes, that's it. Sorry, GHIDRA doesn't support the LE format.
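The signature lookup discussed above can be sketched as follows. This is only an illustration, not how Ghidra or any existing loader does it: `classify_extended_header` is a hypothetical name, and treating any unrecognized signature as plain MZ is an assumption. It also assumes the reported value 0x4C450000 shows the bytes in file order, i.e. `4C 45 00 00` ("LE" followed by two zero bytes).

```python
import struct


def classify_extended_header(data: bytes) -> str:
    """Follow e_lfanew and name the signature found there (hypothetical helper)."""
    if len(data) < 0x40 or data[:2] != b"MZ":
        return "not MZ"
    (e_lfanew,) = struct.unpack_from("<I", data, 0x3C)
    # Assumption: a zero or out-of-range offset means there is no extended header.
    if e_lfanew == 0 or e_lfanew + 4 > len(data):
        return "plain MZ"
    sig = data[e_lfanew:e_lfanew + 4]
    if sig == b"PE\0\0":
        return "PE"
    if sig[:2] in (b"NE", b"LE", b"LX"):
        return sig[:2].decode("ascii")
    return "plain MZ"


# Synthetic image: a 0x40-byte MZ header whose e_lfanew points at an "LE" signature.
image = b"MZ" + bytes(0x3A) + struct.pack("<I", 0x80) + bytes(0x40) + b"LE\0\0"
print(classify_extended_header(image))
```

A real loader would likely need extra sanity checks, since old MZ files are not guaranteed to keep a meaningful value at offset 0x3C.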
Thank you for following up on this. Should I open a new issue for tracking adding support for the LE format?
I altered this issue to reflect your requested enhancement.
Same issue with original Tomb Raider game for DOS.
You can try my: https://github.com/oshogbo/ghidra-lx-loader
@oshogbo Awesome work, however I believe LX and LE are different executable formats, so I don't think it would help with this issue specifically unfortunately.
@tzizi I'm also not sure. Could you point me to the LE documentation, though? These suggest that the LE format is similar to LX: https://github.com/BoomerangDecompiler/boomerang/blob/next/loader/exe/dos4gw/DOS4GWBinaryFile.cpp and http://fileformats.archiveteam.org/wiki/Linear_Executable
Can you clarify what you mean by "the LE documentation"?
Your fileformats.archiveteam.org link's "Specifications" section does already link to two different documents, one which describes a format with "LX" as its signature bytes and another which describes a format with "LE" as its signature bytes, so my first impulse would be to think that you overlooked what you're asking for in what you linked to.
@ssokolow If you look into the "Specifications" they look quite the same :) And the "http://faydoc.tripod.com/formats/exe-LE.htm" doesn't look like anything close to the official one. The loader I built looks for both magic values, "LE" and "LX".
Ahh. Given the context, I suspect the only more official source will be either a print programmers' reference on a used book seller like AbeBooks.com or, if someone took the time to scan it in, maybe Archive.org.
@ssokolow Sorry if I was unclear. :) Nonetheless, I would try using the loader for LE. Please note that I have not implemented all the object types yet, but if you point me to some binary, I may find time to look into it.
...but this might prove useful.
https://www.program-transformation.org/Transform/PcExeFormat
Since it combines all things that could have a .exe header into a single unified document, it's more likely to draw distinctions between LE and LX clearly.
I started writing an LE (not LX) parser a while back, in C#. I found the format to be incredibly alien compared to modern COFF/PE, so I didn't get very far, but it does print out the complete header information block.
Here it is in a gist: https://gist.github.com/gsuberland/1ec41a34f1c904fec69f91d875dd184d
One of the references I found most useful was https://github.com/open-watcom/open-watcom-v2/blob/master/bld/watcom/h/exeflat.h
Hopefully this is at least useful as a reference point if someone else wants to work on this in future.
Currently, LE executables (VxD and blobs extracted from a DOS/4GW executable) are detected as MZ.
In both cases:
- e_lfanew points to the LE header information block
- size as calculated from e_cp and e_cblp is too short for the payload (if that is taken into consideration)
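To illustrate the second point, here is a minimal sketch of the usual MZ size calculation from e_cp (number of 512-byte pages) and e_cblp (bytes used in the last page). The function names are hypothetical, and the "payload lies beyond the MZ image" test is an assumption about how a loader could use the result, not an existing Ghidra check.

```python
def mz_image_size(e_cp: int, e_cblp: int) -> int:
    """Size in bytes of the MZ part of the file, as declared by its header."""
    if e_cp == 0:
        return 0
    if e_cblp == 0:
        # A zero e_cblp means the last 512-byte page is fully used.
        return e_cp * 512
    return (e_cp - 1) * 512 + e_cblp


def header_lies_beyond_mz_image(e_cp: int, e_cblp: int, e_lfanew: int) -> bool:
    """Hypothetical hint: e_lfanew points past the declared end of the MZ image."""
    return e_lfanew != 0 and e_lfanew >= mz_image_size(e_cp, e_cblp)


# Values from the header quoted earlier in this thread.
print(hex(mz_image_size(0x15, 0x192)))
print(header_lies_beyond_mz_image(0x15, 0x192, 0x2998))
```

With the header values quoted earlier in this thread (e_cp = 15h, e_cblp = 192h), the declared MZ size is 0x2992 bytes, just below the e_lfanew value of 0x2998, so the LE header sits outside the part of the file the MZ header accounts for.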