gcc-ia16

Is there any way to avoid the linker offsetting the VMA by the size of the header in .EXE files?

Open andrewbird opened this issue 4 years ago • 58 comments

I was looking at the FreeDOS kernel compilation with gcc-ia16 last week and noticed that it uses the same style of linker script that gcc-ia16 itself uses to produce a .EXE file. After some discussion near the end of this PR https://github.com/dosemu2/dosemu2/pull/1540 it seems that the VMA addresses in the .MAP file are offset by the size of the .EXE header. This means that when loading the .MAP file into Dosemu's dosdebug the origin is not the expected 0x0060 but 0x0055, since the header is 0xb paragraphs long. This is not such a problem with the FreeDOS kernel: we just have to know that the Watcom .MAP file is loaded at 0x0060 and the GCC ia16 .MAP file at 0x0055, which is not intuitive but workable. However, when debugging applications it's nice to be able to load the map file using the CS register as the origin, and in the GCC ia16 case we have to subtract 0xb from CS. Is there any way for GCC-ia16 to generate the .EXE header outside of the linker script, or otherwise avoid offsetting the VMA addresses in the linker script?
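The adjustment described above is plain segment arithmetic. A minimal sketch, assuming the header size is known in paragraphs (the function name `map_origin` is made up for illustration and is not part of dosdebug or gcc-ia16):

```python
def map_origin(load_segment: int, header_paragraphs: int) -> int:
    """Segment to use as the .MAP origin when VMAs include the MZ header.

    load_segment:      segment where the image actually starts (e.g. CS)
    header_paragraphs: size of the MZ header in 16-byte paragraphs
    """
    # Segment registers are 16 bits wide, so wrap the result accordingly.
    return (load_segment - header_paragraphs) & 0xFFFF


# FreeDOS kernel case from the discussion: loaded at 0x0060, header 0xb paragraphs.
print(hex(map_origin(0x0060, 0x0B)))  # 0x55
```

For an application, the same subtraction would be applied to the CS value seen in the debugger, using that executable's own header size.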

Thank you!

andrewbird avatar Sep 27 '21 17:09 andrewbird

Hello @andrewbird,

Well, yeah — the whole handling of segments in the gcc-ia16 toolchain is rather hacky, even if it happens to work.

I think the whole thing can really use a huge overhaul, but the problem is, the path from here to there will be messy. Recently, for the ELKS target platform, I had modified the toolchain to

  • use H. Peter Anvin's more powerful and flexible segelf method for representing segment:offset addresses (see my discussion); and
  • employ a separate elf2elks program to generate ELKS a.out binaries from ELF files, rather than doing everything in a linker script.

It should be possible to do something similar for the MS-DOS target — transition to the segelf scheme, and write a separate elf2mz (?) program to generate MZ binaries. But we will most probably need buy-in from at least the FreeDOS kernel and FreeCOM projects, since their linker scripts currently rely on the "old" linker internals. Another wrinkle is that — as far as I know — Anvin has for some reason not yet implemented the segelf scheme in nasm. (Support has been added into ia16-elf-as, though.)

Let me know your thoughts.

Thank you!

tkchia avatar Sep 27 '21 17:09 tkchia

A most interesting read! I'm happy to help, where able, with build and testing. Regarding the FreeDOS kernel and FreeCOM, there's not really much happening apart from that being done by @perditionc and of course @bartoldeman. I don't know of any GCC ia16 kernel build being used outside of the GitHub Actions test for buildability and my own Dosemu2 testing suite. Is it possible though to prove this out on standalone .EXE binaries first without breaking what's being used for the kernel and FreeCOM? Thank you!

andrewbird avatar Sep 27 '21 19:09 andrewbird

I have no issue with updating the kernel and command.com build as needed, provided there is guidance and/or help with what needs to be done.

PerditionC avatar Sep 27 '21 23:09 PerditionC

Looking at https://github.com/jbruchon/elks/blob/4cf198434b9/elks/tools/elf2elks/elf2elks.c I think I could probably write something similar to produce an MZ executable, though I'd be starting from a zero understanding of the segelf scheme.

> Anvin has for some reason not yet implemented the segelf scheme in nasm. (Support has been added into ia16-elf-as, though.)

NASM seems to be the really big one, since you've already done what's required in the ia16 tools. Without NASM support, though, I doubt this scheme can be very useful for building the FreeDOS kernel, FreeCOM, or many of the FreeDOS tools. I'm not sure what the next step for integration into NASM should be, as their mailing list and bug trackers seem to be very low traffic. I did see your issue https://bugzilla.nasm.us/show_bug.cgi?id=3392533 but unfortunately the conversation seems to have stalled. There is some sporadic unrelated commit activity at https://github.com/netwide-assembler/nasm/commits/master so people are still working on the project. Perhaps developer time is very limited at the moment.

Thank you!

andrewbird avatar Sep 28 '21 10:09 andrewbird

@hpax I wonder if you've made any progress towards the SEGELF support in nasm, as we could really use it now?

Thanks, Andrew.

andrewbird avatar Oct 06 '21 22:10 andrewbird

Just so we have the link to hand I found that Stas was enquiring on the NASM forum recently as well. https://forum.nasm.us/index.php?topic=2747.0

andrewbird avatar Oct 06 '21 22:10 andrewbird

@tkchia whilst we are waiting for a response from Peter on the SEGELF support in NASM, I was wondering if it was possible to switch to .exe production for the kernel using the elf_i386_msdos_mz scheme? What was missing from that, were some fields in the header just incorrect (and maybe we could correct them with a post-process with info from the .map file), or something more fundamental?

andrewbird avatar Oct 07 '21 22:10 andrewbird

I was looking at the NASM code. To cooperate with your segelf support do you think that the only required changes to NASM will be in the ELF output stage? I found this test program within https://bugzilla.nasm.us/show_bug.cgi?id=3392694 written by @ecm-pushbx. I have zero experience with NASM development, but do you think a new segelf target is the best option or perhaps you favour modifying the existing elf backend?

andrewbird avatar Oct 22 '21 23:10 andrewbird

Hi, my username on here is actually @ecm-pushbx but yes I made that test case to lay out what NASM needs. I believe the segelf extensions were supposed to be added to the elf format.

ecm-pushbx avatar Oct 23 '21 01:10 ecm-pushbx

Hi @ecm-pushbx, yes I now realise your proper username is not what I typed. I guess my lapse is a consequence of GitHub automatically filling it in for me most of the time. Thanks for creating the test case, it'll certainly be most useful for me. I'll have a stab at modifying the ELF output. I'm not sure how far I'll get with my limited knowledge, but let's see.

Thank you!

andrewbird avatar Oct 23 '21 09:10 andrewbird

@tkchia I thought I'd do a few experiments around writing the MZ EXE header differently. Before I start I want to create a simple test with ia16-elf-gcc and understand the current output. Here's my little attempt https://github.com/andrewbird/test-exe. When I compile/link it and examine the header I'm seeing something unexpected (to me anyway), that is the stack segment @ 0x10548. Here's the header in the .MAP file

.msdos_mz_hdr   0x0000000000000000       0x20                                   
                0x0000000000000000                __msdos_mz_hdr_start = .      
                0x0000000000000000        0x2 SHORT 0x5a4d                      
                0x0000000000000002        0x2 SHORT 0x20 ((LOADADDR (.data) + SIZEOF (.data)) % 0x200)
                0x0000000000000004        0x2 SHORT 0x30 (((LOADADDR (.data) + SIZEOF (.data)) + 0x1ff) / 0x200)
                0x0000000000000006        0x2 SHORT 0x1 __msdos_mz_rels         
                0x0000000000000008        0x2 SHORT 0x2 __msdos_mz_hdr_paras    
                0x000000000000000a        0x2 SHORT 0xf68 (((0x10000 - SIZEOF (.data)) - ADDR (.data)) / 0x10)
                0x000000000000000c        0x2 SHORT 0xf68 DEFINED (__msdos_handle_v1)?0xffff:(((0x10000 - SIZEOF (.data)) - ADDR (.data)) / 0x10)
                0x000000000000000e        0x2 SHORT 0x10548 ((((LOADADDR (.data) / 0x10) - __msdos_mz_hdr_paras) - (ADDR (.data) / 0x10)) + 0x10000)
                0x0000000000000010        0x2 SHORT 0x0                         
                0x0000000000000012        0x2 SHORT 0x0                         
                0x0000000000000014        0x2 SHORT 0x20 _start                 
                0x0000000000000016        0x2 SHORT 0xfffe ((((LOADADDR (.text) / 0x10) - __msdos_mz_hdr_paras) - (ADDR (.text) / 0x10)) + 0x10000)
                0x0000000000000018        0x2 SHORT 0x1c (__msdos_mz_rel_start - __msdos_mz_hdr_start)
                0x000000000000001a        0x2 SHORT 0x0                         
 *(.msdos_mz_hdr .msdos_mz_hdr.*)                                               
                0x000000000000001c                __msdos_mz_rel_start = .      
 *(.msdos_mz_reloc .msdos_mz_reloc.*)                                           
 .msdos_mz_reloc.0                                                              
                0x000000000000001c        0x4 /usr/lib/x86_64-linux-gnu/gcc/ia16-elf/6.3.0/../../../../../ia16-elf/lib/libdos-s.a(dos-msmmabort.o)
                0x0000000000000020                __msdos_mz_rel_end = .        
                0x0000000000000001                __msdos_mz_rels = ((. - __msdos_mz_rel_start) / 0x4)
                0x0000000000000020                . = DEFINED (__msdos_handle_v1)?ALIGN (0x200):.
                0x0000000000000002                __msdos_mz_hdr_paras = (((. - __msdos_mz_hdr_start) + 0xf) / 0x10)
                0x0000000000000020                . = ALIGN (0x10)              
                0x0000000000000001                ASSERT ((((__msdos_mz_rel_end - __msdos_mz_rel_start) % 0x4) == 0x0), Error: MZ relocations are not 4-byte aligned)
                0x0000000000000001                ASSERT ((__msdos_mz_rels <= 0xffff), Error: too many MZ relocations)
                                                                                
       

How can this be, as the field width is only 16 bits?

andrewbird avatar Jan 18 '22 15:01 andrewbird

Hello @andrewbird,

The value of 0x10548 will be truncated to 0x0548 in the final output. This should correspond to the relative paragraph offset of the .data segment in the output (which should be the same as the initial stack segment; the startup code later sets %ds from %ss).
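The truncation can be checked by hand. The sketch below mirrors the expression shown in the .MAP listing, `(LOADADDR/0x10 - __msdos_mz_hdr_paras - ADDR/0x10) + 0x10000`, and then keeps the low 16 bits as a `SHORT` would. The helper names are invented for illustration, and the `.text` addresses used in the second call (load address and VMA both 0x20) are an assumption chosen to be consistent with the 0xfffe value in the listing, not figures taken from the linker.

```python
def header_word(value: int) -> int:
    """Keep only the low 16 bits, as a 2-byte header field does."""
    return value & 0xFFFF


def relative_segment(lma: int, vma: int, hdr_paras: int) -> int:
    """Paragraph value as computed in the MZ header expressions.

    The + 0x10000 keeps the intermediate result non-negative; it
    disappears again once the value is truncated to 16 bits.
    """
    return header_word(lma // 0x10 - hdr_paras - vma // 0x10 + 0x10000)


print(hex(header_word(0x10548)))             # 0x548
print(hex(relative_segment(0x20, 0x20, 2)))  # 0xfffe
```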

Thank you!

tkchia avatar Jan 18 '22 16:01 tkchia

Hello @andrewbird,

(Incidentally, in case you are curious, __msdos_handle_v1 is a symbol that is defined if ia16-elf-gcc was asked (-mmsdos-handle-v1) to output an executable that fails gracefully on MS-DOS 1.x, rather than crash. This option will change the layout of the MZ header, among other things. My linker script source file comments have some further discussion on this.)

Thank you!

tkchia avatar Jan 18 '22 16:01 tkchia

Hi @tkchia, Ahh, I'd seen that the value was truncated in the actual header, but I hadn't realised it was intentional. Here's the output from my little header printer script.

$ ./prnhdr.py 
test-std.exe: MZ header OK!
  Bytes in last page:                 0x0020
  Number of pages (inc last):         0x0030
  Number of relocation entries:       0x0001
  Header size (paragraphs):           0x0002
  Min. Memory allocated (paragraphs): 0x0f68
  Max. Memory allocated (paragraphs): 0x0f68
  Initial Stack Segment:              0x0548
  Initial Stack Pointer:              0x0000
  Checksum (0 for none):              0x0000
  Initial Instruction Pointer:        0x0020
  Initial Code Segment:               0xfffe
  Offset of relocation table:         0x001c
  Overlay number:                     0x0000
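
The prnhdr.py script itself is not shown in this thread. As a rough idea of what such a header printer involves, here is a minimal sketch that decodes the same thirteen 16-bit fields; all names in it (`FIELDS`, `parse_mz_header`, `read_relocations`) are hypothetical and not taken from the actual script.

```python
import struct

# The 13 little-endian 16-bit fields that follow the "MZ" signature,
# in file order (matching the listing above).
FIELDS = (
    "last_page_bytes", "pages", "relocs", "header_paras",
    "min_alloc", "max_alloc", "ss", "sp", "checksum",
    "ip", "cs", "reloc_offset", "overlay",
)


def parse_mz_header(data: bytes) -> dict:
    """Decode the fixed 28-byte part of an MZ header."""
    if len(data) < 28 or data[:2] != b"MZ":
        raise ValueError("not an MZ executable")
    return dict(zip(FIELDS, struct.unpack_from("<13H", data, 2)))


def read_relocations(data: bytes, header: dict) -> list:
    """Return relocation entries as (segment, offset) pairs."""
    entries = []
    for i in range(header["relocs"]):
        # Each entry is stored as offset first, then segment.
        off, seg = struct.unpack_from("<HH", data, header["reloc_offset"] + 4 * i)
        entries.append((seg, off))
    return entries
```

A caller would read the file with `open(path, "rb").read()` and print each field with something like `f"{value:#06x}"`.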

So moving on to my creating an elf2mz program, what gcc/ld options should I use to create a suitable input ELF file?

Thank you!

andrewbird avatar Jan 18 '22 16:01 andrewbird

Hello @andrewbird,

For now I guess you can try to force the output format to elf32-i386 instead of binary, by using a -Wl,--oformat=elf32-i386 option. This should yield an ELF file which will include the MZ header (as program data), the various ELF section headers and program headers, etc. You can try to poke around the ELF output to decide what to do next.
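
As a starting point for poking around such a file without extra tools, the ELF32 file header can be decoded with the standard library alone. This is only a sketch based on the generic ELF32 layout; the function name and the returned keys are made up, and nothing here is specific to binutils-ia16.

```python
import struct

# Elf32_Ehdr: e_ident[16], then type, machine, version, entry, phoff,
# shoff, flags, ehsize, phentsize, phnum, shentsize, shnum, shstrndx.
ELF32_EHDR = struct.Struct("<16sHHIIIIIHHHHHH")


def read_elf32_header(data: bytes) -> dict:
    """Decode a little-endian ELF32 file header."""
    if len(data) < ELF32_EHDR.size:
        raise ValueError("file too short for an ELF32 header")
    (ident, e_type, machine, _version, entry, phoff, shoff, _flags,
     _ehsize, _phentsize, phnum, _shentsize, shnum, shstrndx) = \
        ELF32_EHDR.unpack_from(data)
    if ident[:4] != b"\x7fELF":
        raise ValueError("missing ELF magic")
    if ident[4] != 1 or ident[5] != 1:
        raise ValueError("expected a 32-bit little-endian ELF file")
    return {
        "type": e_type, "machine": machine, "entry": entry,
        "phoff": phoff, "phnum": phnum,
        "shoff": shoff, "shnum": shnum, "shstrndx": shstrndx,
    }
```

For an elf32-i386 file the `machine` value should be 3 (EM_386); the section and program header tables can then be walked from `shoff` and `phoff`.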

Thank you!

tkchia avatar Jan 18 '22 17:01 tkchia

Note that — as I mentioned previously — as of now, the linking of .exe files still uses the LMA ≠ VMA representation scheme, not Anvin's segelf.

Thank you!

tkchia avatar Jan 18 '22 17:01 tkchia

Hello @tkchia,

> For now I guess you can try to force the output format to elf32-i386 instead of binary, by using a -Wl,--oformat=elf32-i386 option. This should yield a ELF file which will include the MZ header (as program data), the various ELF section headers and program headers, etc. You can try to poke around the ELF output to decide what to do next.

Thanks, that's exactly what I need!

> Note that (as I mentioned previously) as of now, the linking of .exe files still uses the LMA ≠ VMA representation scheme, not Anvin's segelf.

I had been thinking that I might have to turn that on with -msegelf, but then there's the problem with no nasm support. If something is achievable as is, then I'll probably stick with that.

Thank you!

andrewbird avatar Jan 18 '22 17:01 andrewbird

> Hello @andrewbird,
>
> (Incidentally, in case you are curious, __msdos_handle_v1 is a symbol that is defined if ia16-elf-gcc was asked (-mmsdos-handle-v1) to output an executable that fails gracefully on MS-DOS 1.x, rather than crash. This option will change the layout of the MZ header, among other things. My linker script source file comments have some further discussion on this.)
>
> Thank you!

Huh, didn't know that MS-DOS 1.xx had any support for MZ executables.

ecm-pushbx avatar Jan 18 '22 17:01 ecm-pushbx

Hello @ecm-pushbx,

> Huh, didn't know that MS-DOS 1.xx had any support for MZ executables.

The code for handling MZ files was in command.com, rather than the kernel. And yes, the support was pretty crappy. :neutral_face:

Thank you!

tkchia avatar Jan 18 '22 17:01 tkchia

At least now MS DOS 1.25 could be fixed like how you fixed GW-BASIC? :D

https://github.com/microsoft/MS-DOS

lpsantil avatar Jan 19 '22 03:01 lpsantil

Hello @andrewbird,

> For now I guess you can try to force the output format to elf32-i386 instead of binary, by using a -Wl,--oformat=elf32-i386 option.

Another tip: if in addition you say -Wl,-r, the linker (ia16-elf-ld) will not try to resolve relocations, but will instead leave them around as ELF relocations in the output file. You can then study them with ia16-elf-objdump -D -r ... (e.g.) or ia16-elf-readelf -r ...

Hello @lpsantil,

> At least now MS DOS 1.25 could be fixed like how you fixed GW-BASIC? :D

Well, not sure there is much point in doing that though. :no_mouth:

Thank you!

tkchia avatar Jan 19 '22 17:01 tkchia

Hello @tkchia, Thanks for the info, I'm sure it will be most useful. I'm currently thinking of doing the calculations in the linker script just as you do now. But instead of writing the header from there, I hope to set some new private variables with the values, which I can then read in elf2mz and use to write out the header. That way I hope to keep the logic in the linker script instead of splitting it between the linker script and elf2mz. I may have to start again with elf2mz at some point, as for now I switched to -msegelf purely because the linker script was easier for me to understand! I know we are waiting for NASM support to make elf2mz really useful, but I figured it would be interesting (for me at least) to see how far I could go with this.
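
The header-writing half of that plan can be sketched as follows. This is not elf2mz (which is written in C); it only illustrates packing the fixed 28-byte MZ header from values computed elsewhere, with the two size fields derived the same way as in the .MAP listing earlier (`% 0x200` and `+ 0x1ff) / 0x200`). The function names and parameters are hypothetical.

```python
import struct


def mz_size_fields(image_end: int) -> tuple:
    """Return (bytes in last page, page count) for a given end offset.

    image_end is the file offset just past the last loaded byte.
    """
    return image_end % 0x200, (image_end + 0x1FF) // 0x200


def pack_mz_header(image_end, relocs, header_paras, min_alloc, max_alloc,
                   ss, sp, ip, cs, reloc_offset=0x1C) -> bytes:
    """Pack the fixed part of an MZ header; values are cut to 16 bits."""
    last, pages = mz_size_fields(image_end)
    words = (last, pages, relocs, header_paras, min_alloc, max_alloc,
             ss, sp, 0,              # checksum left as 0 (none)
             ip, cs, reloc_offset,
             0)                      # overlay number
    return struct.pack("<2s13H", b"MZ", *(w & 0xFFFF for w in words))
```

With this split, the linker script would still decide every value, and the converter would only serialise them and append the relocation table and the section contents.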

Thank you!

andrewbird avatar Jan 19 '22 18:01 andrewbird

Hello @tkchia, I switched back to the lma != vma method rather than -msegelf. Can you tell me which linker script is in operation when I issue this command, please?

ia16-elf-gcc -Wall -mcmodel=small -Os -o $@ $< -li86 -Wl,-Map=test-std.map

I tried /usr/ia16-elf/lib/dos-exe-small.ld but I'm not seeing the same sizes / offsets when using the two commands

ia16-elf-gcc -Wall -mcmodel=small -Os -o  test-new.o -c $<
ia16-elf-gcc -o test-new.elf test-new.o -T test-new.ld -li86 -Wl,-Map=test-new.map -Wl,--oformat=elf32-i386

Thank you!

andrewbird avatar Jan 21 '22 13:01 andrewbird

Hello, @tkchia, Sorry to keep spamming you! I have now figured out that I am using the right linker script for a base.

Thank you!

andrewbird avatar Jan 21 '22 13:01 andrewbird

Hello @tkchia,

I don't understand why the functions being linked in are different according to which output is used e.g.

std compile/link in one operation
.text          0x0000000000000244     0x114e /usr/lib/x86_64-linux-gnu/gcc/ia16-elf/6.3.0/../../../../../ia16-elf/lib/libc.a(lib_a-vfiprintf.o)
               0x0000000000000244                _vfiprintf_r                  
               0x000000000000137a                vfiprintf                     

and

compile, then link
.text          0x0000000000000221     0x2278 /usr/lib/x86_64-linux-gnu/gcc/ia16-elf/6.3.0/../../../../../ia16-elf/lib/libc.a(lib_a-vfprintf.o)
               0x0000000000000221                _vfprintf_r                   
               0x0000000000002481                vfprintf                      

My latest attempt is here, in case anyone can spot my mistake https://github.com/andrewbird/test-exe/tree/vma-ne-lma

At present it seems to me that my separate compile then link operations are not equivalent to the single operation. Until they are my efforts to compare the header values are worthless.

Thank you!

andrewbird avatar Jan 21 '22 14:01 andrewbird

Hello @andrewbird,

OK — the problem (?) lies in the -T option you are using in the second link.

gcc-ia16 (together with newlib-ia16 and binutils-ia16) has some special hacks for automatically detecting whether the program needs a floating-point-capable stdio or can do with a non-floating-point stdio (-mnewlib-autofloat-stdio).

This feature partly relies on gcc-ia16 inserting an additional linker script (actually, via an -lastdio option — where the .a file is a script). This works, but is obviously kind of messy. :neutral_face: At the moment, if you explicitly specify a linker script via -T, then gcc-ia16 will assume that you want to use just that linker script, and will forgo the special hacks.

I guess one way to get around this is to also explicitly specify a -T option in the first link (!). Maybe try something like

... -T "`ia16-elf-gcc --print-file-name=dos-mssl.ld`" ...

where the backquoted part will output the path to the default small model linker script.

Thank you!

tkchia avatar Jan 21 '22 17:01 tkchia

Hello @tkchia,

That really helped, so now the output is looking similar.

Thank you!

andrewbird avatar Jan 21 '22 19:01 andrewbird

Hello @tkchia,

I got a little further. Can you tell me how the segment value of the relocations is generated? I seem to be missing it.

ia16-elf-gcc -Wall -mcmodel=small -Os -o test.o -c test.c
ia16-elf-gcc -Wall -mcmodel=small -o test-std.exe test.o -T "`ia16-elf-gcc --print-file-name=dos-mssl.ld`" -li86 -Wl,-Map=test-std.map
gcc -o elf2mz elf2mz.c -lelf
ia16-elf-gcc -Wall -mcmodel=small -o test-new.elf test.o -T test-new.ld -li86 -Wl,-Map=test-new.map -Wl,--oformat=elf32-i386
./elf2mz -i test-new.elf -o test-new.exe  # options not parsed yet
./elf2mz: ELF section 0x1 -> text section
./elf2mz: 	virt. addr. 0, size 0x99e0, file offset 0x1000
./elf2mz: ELF section 0xc5 -> data section
./elf2mz: 	virt. addr. 0, size 0xa90, file offset 0xb000
./elf2mz: ELF section 0xc6 -> msdos_mz_tail section
./elf2mz: 	virt. addr. 0xa90, size 0x10, file offset 0xba90
./elf2mz: ELF section 0xc7 -> BSS section
./elf2mz: 	virt. addr. 0xaa0, size 0xc3be, file offset 0xbaa0
./elf2mz: ELF section 0xc8 -> symtab section
./elf2mz: 	virt. addr. 0, size 0x1ec0, file offset 0xbaa0
./elf2mz: 0 text reloc(s)., 0 far text reloc(s)., 0 data reloc(s).
./elf2mz: created temporary file `./JrDl13'
./prnhdr.py
test-std.exe: MZ header OK!
  Bytes in last page:                 0x00b0
  Number of pages (inc last):         0x0053
  Number of relocation entries:       0x0001
  Header size (paragraphs):           0x0002
  Min. Memory allocated (paragraphs): 0x0f53
  Max. Memory allocated (paragraphs): 0x0f53
  Initial Stack Segment:              0x099c
  Initial Stack Pointer:              0x0000
  Checksum (0 for none):              0x0000
  Initial Instruction Pointer:        0x0020
  Initial Code Segment:               0xfffe
  Offset of relocation table:         0x001c
  Overlay number:                     0x0000
Relocations:
  fffe:98e0
test-new.exe: MZ header OK!
  Bytes in last page:                 0x0070
  Number of pages (inc last):         0x0053
  Number of relocation entries:       0x0001
  Header size (paragraphs):           0x0002
  Min. Memory allocated (paragraphs): 0x0f57
  Max. Memory allocated (paragraphs): 0x0f57
  Initial Stack Segment:              0x099c
  Initial Stack Pointer:              0x0000
  Checksum (0 for none):              0x0000
  Initial Instruction Pointer:        0x0000
  Initial Code Segment:               0xfffe
  Offset of relocation table:         0x001c
  Overlay number:                     0x0000
Relocations:
  0000:98c0
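
To compare the two listings, it can help to turn each segment:offset pair into an offset relative to the start of the loaded image. The sketch below assumes the loader adds its start segment to these segment words using 16-bit arithmetic, so that 0xfffe behaves like -2 paragraphs; the helper name `image_offset` is made up for illustration.

```python
def image_offset(segment: int, offset: int) -> int:
    """Byte offset from the start of the load image for a seg:off pair.

    The segment word is read as a signed 16-bit paragraph count, which
    models the 16-bit wrap-around when the load segment is added.
    """
    signed_seg = segment - 0x10000 if segment >= 0x8000 else segment
    return signed_seg * 0x10 + offset


# Relocation entries from the two executables above.
print(hex(image_offset(0xFFFE, 0x98E0)))  # 0x98c0
print(hex(image_offset(0x0000, 0x98C0)))  # 0x98c0

# Entry points (CS:IP) from the two headers above.
print(image_offset(0xFFFE, 0x0020))       # 0
print(image_offset(0xFFFE, 0x0000))       # -32
```

Under that assumption the two relocation entries refer to the same byte of the image, while the two entry points differ by 0x20 bytes, which is the size of the header.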

Thank you!

andrewbird avatar Jan 21 '22 20:01 andrewbird

Hello @andrewbird,

Look at bfd_i386_elf_get_paragraph_distance (...) in bfd/elf32-i386.c in binutils-ia16.

Thank you!

tkchia avatar Jan 21 '22 20:01 tkchia

Hello @tkchia, So it looks like my relocation segment became zero because I deleted the MZ header section. I'll try just emptying it and see if it's happy with that. Thank you!

andrewbird avatar Jan 21 '22 21:01 andrewbird