DOS MZ "LE" loader support

Open tzizi opened this issue 5 years ago • 19 comments

Describe the bug

When loading certain MZ executables (in this specific instance, the Quarantine DOS game), Ghidra detects it as a Raw Binary.

Expected behavior

It should be detected as a DOS MZ executable.

Environment (please complete the following information):

  • OS: Linux
  • Java Version: 11.0
  • Ghidra Version: 9.0.2

tzizi avatar Apr 29 '19 01:04 tzizi

Sorry you're having issues. Could you provide an ascii hex dump of the first 0x40 bytes of the binary? Please don't attach the actual binary. Ghidra decides if a binary is a DOS/MZ by checking the values of e_magic, e_lfanew, and e_lfarlc from the DOS header. The checks might be incorrect for some cases.

Thanks!

GhidraKnight avatar May 01 '19 15:05 GhidraKnight

Thanks for looking into this issue. Sure:

Q_40.log

Attached it as .log because github didn't support attaching a .exe. The executable runs fine under DOSBOX by the way.

tzizi avatar May 02 '19 04:05 tzizi

Sorry, I read 40 bytes, not 0x40. Here's the first 64 bytes as requested:

Q_40.log

tzizi avatar May 02 '19 04:05 tzizi

Hey! I see what's happening here. All WIN/PE binaries contain an embedded DOS/MZ header. GHIDRA determines if a binary is a plain DOS/MZ (vs WIN/PE) by checking several fields in the DOS_HEADER. GHIDRA expects "e_lfanew==0x0", but in your binary the value is 0x2998.

Could you tell me what int value exists at offset 0x2998 in your binary?

Thanks!

       0000:0000 4d 5a 92        IMAGE_DOS_HEADER
          0000:0000 4d 5a           char[2]   "MZ"                    e_magic
          0000:0002 92 01           dw        192h                    e_cblp
          0000:0004 15 00           dw        15h                     e_cp
          0000:0006 06 00           dw        6h                      e_crlc
          0000:0008 06 00           dw        6h                      e_cparhdr
          0000:000a 86 00           dw        86h                     e_minalloc
          0000:000c ff ff           dw        FFFFh                   e_maxalloc
          0000:000e 99 02           dw        299h                    e_ss
          0000:0010 00 08           dw        800h                    e_sp
          0000:0012 00 00           dw        0h                      e_csum
          0000:0014 16 02           dw        216h                    e_ip
          0000:0016 00 00           dw        0h                      e_cs
          0000:0018 40 00           dw        40h                     e_lfarlc
          0000:001a 00 00           dw        0h                      e_ovno
          0000:001c 00 00 00 00 00  dw[4]                             e_res[4]
          0000:0024 00 00           dw        0h                      e_oemid
          0000:0026 00 00           dw        0h                      e_oeminfo
          0000:0028 00 00 00 00 00  dw[10]                            e_res2[10]
          0000:003c 98 29 00 00     ddw       2998h                   e_lfanew
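
For anyone following along, here is a minimal Python sketch of the kind of check being discussed: read e_lfanew from offset 0x3C of the DOS header and look at the signature bytes it points to. This is not Ghidra's actual loader logic; `classify_mz` and its return strings are hypothetical names used only for illustration, and the treatment of unknown signatures as "plain MZ" is an assumption.

```python
import struct

def classify_mz(data: bytes) -> str:
    """Hypothetical helper: classify an MZ file by the signature at e_lfanew."""
    # The fixed part of IMAGE_DOS_HEADER is 0x40 bytes and starts with "MZ".
    if len(data) < 0x40 or data[:2] != b"MZ":
        return "not MZ"
    # e_lfanew is a little-endian 32-bit value at offset 0x3C.
    (e_lfanew,) = struct.unpack_from("<I", data, 0x3C)
    if e_lfanew == 0 or e_lfanew + 4 > len(data):
        return "plain MZ"
    sig = data[e_lfanew:e_lfanew + 4]
    if sig == b"PE\0\0":
        return "PE"
    # NE, LE and LX headers all start with a two-byte ASCII signature.
    if sig[:2] in (b"NE", b"LE", b"LX"):
        return sig[:2].decode("ascii")
    # Assumption: anything else is treated as a plain DOS executable.
    return "plain MZ"

# Synthetic example mirroring the header above: e_lfanew = 0x2998.
sample = bytearray(0x2998 + 4)
sample[:2] = b"MZ"
struct.pack_into("<I", sample, 0x3C, 0x2998)
sample[0x2998:0x299A] = b"LE"
print(classify_mz(bytes(sample)))
```

The int value requested below is simply the four bytes at that offset; the bytes 4C 45 are ASCII "LE".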

GhidraKnight avatar May 07 '19 15:05 GhidraKnight

The value is 0x4C450000. Maybe it's IMAGE_OS2_SIGNATURE_LE?

tzizi avatar May 07 '19 16:05 tzizi

Yes, that's it. Sorry, GHIDRA doesn't support the LE format.

GhidraKnight avatar May 07 '19 16:05 GhidraKnight

Thank you for following up on this. Should I open a new issue for tracking adding support for the LE format?

tzizi avatar May 07 '19 17:05 tzizi

I altered this issue to reflect your requested enhancement.

ryanmkurtz avatar May 07 '19 17:05 ryanmkurtz

Same issue with original Tomb Raider game for DOS.

Gh0stBlade avatar Jul 31 '19 23:07 Gh0stBlade

You can try my: https://github.com/oshogbo/ghidra-lx-loader

oshogbo avatar Oct 16 '19 19:10 oshogbo

@oshogbo Awesome work, however I believe LX and LE are different executable formats, so I don't think it would help with this issue specifically unfortunately.

tzizi avatar Oct 22 '19 19:10 tzizi

@tzizi I'm also not sure. Could you point me to the LE documentation? These sources suggest that the LE format is similar to LX: https://github.com/BoomerangDecompiler/boomerang/blob/next/loader/exe/dos4gw/DOS4GWBinaryFile.cpp and http://fileformats.archiveteam.org/wiki/Linear_Executable

oshogbo avatar Oct 22 '19 20:10 oshogbo

Can you clarify what you mean by "the LE documentation"?

Your fileformats.archiveteam.org link's "Specifications" section does already link to two different documents, one which describes a format with "LX" as its signature bytes and another which describes a format with "LE" as its signature bytes, so my first impulse would be to think that you overlooked what you're asking for in what you linked to.

ssokolow avatar Oct 22 '19 20:10 ssokolow

@ssokolow If you look into the "Specifications" they look quite the same :) And "http://faydoc.tripod.com/formats/exe-LE.htm" doesn't look like anything close to an official one. The loader I built looks for both magic values, "LE" and "LX".

oshogbo avatar Oct 22 '19 20:10 oshogbo

Ahh. Given the context, I suspect the only more official source will be either a print programmers' reference on a used book seller like AbeBooks.com or, if someone took the time to scan it in, maybe Archive.org.

ssokolow avatar Oct 22 '19 20:10 ssokolow

@ssokolow Sorry if I was unclear. :) Nonetheless, I would try using the loader for LE. Please note that I have not implemented all the object types yet, but if you could point me to some binary, I may find time to look into it.

oshogbo avatar Oct 22 '19 20:10 oshogbo

...but this might prove useful.

https://www.program-transformation.org/Transform/PcExeFormat

Since it combines all things that could have a .exe header into a single unified document, it's more likely to draw distinctions between LE and LX clearly.

ssokolow avatar Oct 22 '19 20:10 ssokolow

I started writing an LE (not LX) parser a while back, in C#. I found the format to be incredibly alien compared to modern COFF/PE, so I didn't get very far, but it does print out the complete header information block.

Here it is in a gist: https://gist.github.com/gsuberland/1ec41a34f1c904fec69f91d875dd184d

One of the references I found most useful was https://github.com/open-watcom/open-watcom-v2/blob/master/bld/watcom/h/exeflat.h

Hopefully this is at least useful as a reference point if someone else wants to work on this in future.

gsuberland avatar Oct 24 '21 17:10 gsuberland

Currently, LE executables (VxD and blobs extracted from a DOS/4GW executable) are detected as MZ.

In both cases:

  • e_lfanew points to the LE header information block
  • size as calculated from e_cp and e_cblp is too short for the payload (if that is taken into consideration)
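
As a rough illustration of the second point, the MZ image size is conventionally derived from e_cp (number of 512-byte pages) and e_cblp (bytes used in the last page, 0 meaning the last page is full). The sketch below is an assumption about how such a check could look, not code from Ghidra; `mz_image_size` is a hypothetical helper name.

```python
import struct

def mz_image_size(header: bytes) -> int:
    """Hypothetical helper: MZ image size in bytes from e_cblp and e_cp."""
    # e_cblp is at offset 2 and e_cp at offset 4, both little-endian words.
    e_cblp, e_cp = struct.unpack_from("<HH", header, 2)
    if e_cblp == 0:
        # A zero e_cblp means the last page is completely used.
        return e_cp * 512
    return (e_cp - 1) * 512 + e_cblp

# Values taken from the header listing earlier in this thread.
header = b"MZ" + struct.pack("<HH", 0x192, 0x15)
print(hex(mz_image_size(header)))  # 0x2992
```

With the header posted earlier in this thread (e_cblp = 0x192, e_cp = 0x15) this gives 0x2992, just below that file's e_lfanew of 0x2998, so there the LE header begins right after the region the MZ header accounts for.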

msbit avatar Sep 10 '22 07:09 msbit