fdpp

support R_386_SEGRELATIVE relocations

Open stsp opened this issue 4 years ago • 26 comments

I queried the nasm list about R_386_COPY or R_386_SEGRELATIVE relocations: https://forum.nasm.us/index.php?topic=2747.0 but there was no reply.

Whether this means that we'll have to implement this in nasm on our own is unclear, but the relocations are the whole point of fdpp. Looking into yasm might be interesting too.

@tkchia what was the deal with R_386_SEGRELATIVE relocations? They are mentioned in your doc as a part of Anvin's segelf, but are not in nasm. Are they anywhere?

stsp avatar Mar 11 '21 12:03 stsp

Hello @stsp,

R_386_SEGRELATIVE is part of Anvin's segelf proposal (which is available). For some reason, it seems that Anvin has yet to fully implement the proposed relocations in nasm --- though he did create a separate Git branch to work on this a while back.

In the meantime, yes, my binutils-ia16 fork does have support for R_386_SEGRELATIVE and the other proposed segelf relocations. Among other things, ia16-elf-as knows how to output R_386_SEG16 relocations --- even if nasm does not yet do so --- and the gold linker ia16-elf-ld.gold knows how to transform them into R_386_SEGRELATIVE relocations.

Thank you!

tkchia avatar Mar 11 '21 13:03 tkchia

Hello @stsp,

Anyway, one difficulty in implementing the segelf relocation scheme --- whether in nasm or yasm --- is that the assembler needs to be able to automagically emit a bunch of additional symbols and segments (foo!, .data!, etc.), for the whole scheme to really work.

Thank you!

tkchia avatar Mar 11 '21 13:03 tkchia

Thanks for the info! That's really helpful. I built the experimental nasm but I don't know how to emit the relative relocation. readelf -r kernel.o doesn't show that either one is emitted. So how do I do that?

whole scheme to really work.

But I don't need the whole scheme because I have converted the linkage to tiny model (and re-link it back at run-time). All I need is to "unhardcode" those TGROUPs, turning them into either a copy or segrelative relocs. Can I do that?

stsp avatar Mar 11 '21 13:03 stsp

Hello @stsp,

I built the experimental nasm but I don't know how to emit the relative relocation. readelf -r kernel.o doesn't show that either one is emitted. So how do I do that?

I think Anvin's implementation is not yet complete. In any case, I think you will most probably have to check with him on how to proceed...

But I don't need the whole scheme because I have converted the linkage to tiny model (and re-link it back at run-time). All I need is to "unhardcode" those TGROUPs, turning them into either a copy or segrelative relocs. Can I do that?

What exactly are the kind of ELF files you want to build? How will they figure in the whole dosemu2 + fdpp system? Will you need to build them from nasm assembly sources?

Thank you!

tkchia avatar Mar 11 '21 14:03 tkchia

What exactly are the kind of ELF files you want to build?

I already do: I switched fdpp to ELF recently. It's just a normal ELF 32-bit LSB executable, Intel 80386, version 1 (SYSV), statically linked.

How will they figure in the whole dosemu2 + fdpp system?

By the custom elf loader: https://github.com/dosemu2/fdpp/blob/master/fdpp/elf.c

Will you need to build them from nasm assembly sources?

Yes, the asm sources are not supposed to be modified (except maybe the mods needed to emit the needed reloc).

stsp avatar Mar 11 '21 14:03 stsp

Hello @stsp,

I think a stop-gap solution --- until nasm really gets segelf support --- might be to just leave TGROUP as a "normal" kind of undefined symbol.

You can see that, when building the stock FreeDOS kernel with ia16-elf-gcc, the symbols TGROUP, DGROUP, etc. are simply normal external symbols (extern TGROUP etc.) which are defined in kernel.ld. If you do not define (!) these symbols, and if you do a partial link (-Wl,-r), then I expect that the references to the undefined TGROUP etc. will appear in the output .elf as good old R_386_16 relocations.

(And R_386_16 is well-supported by mainline nasm and binutils.)
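For reference, a minimal sketch of what resolving one R_386_16 relocation amounts to, assuming the usual i386 REL convention where the addend A is stored in the field being patched and the result is S + A truncated to 16 bits. The function name `apply_r_386_16` is hypothetical and not taken from fdpp or binutils; a real linker would also report overflow rather than silently mask it.

```python
import struct

def apply_r_386_16(image: bytearray, offset: int, sym_value: int) -> None:
    """Patch one 16-bit little-endian field in place: field = (S + A) mod 2**16."""
    # i386 uses REL-style relocations, so the addend A is read from the field itself.
    addend, = struct.unpack_from("<H", image, offset)
    # Masking is an assumption of this sketch; a linker would complain on overflow.
    struct.pack_into("<H", image, offset, (sym_value + addend) & 0xFFFF)
```

With TGROUP left undefined, it is exactly this patching step that stays pending until somebody supplies the value of S.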

This is probably not ideal, but it should be workable.

Thank you!

tkchia avatar Mar 11 '21 16:03 tkchia

I tried partial linking: it produces hundreds of relocations besides TGROUP. Not sure why that is the case. Also it seems to change the ELF type from executable to relocatable.
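The type change can be confirmed from the ELF header itself. A small sketch, assuming only the standard ELF header layout (`e_type` is the 16-bit field right after the 16-byte `e_ident`); the helper name `elf_type` is made up for illustration.

```python
import struct

# e_type values from the ELF specification
ELF_TYPES = {1: "ET_REL", 2: "ET_EXEC", 3: "ET_DYN"}

def elf_type(header: bytes) -> str:
    """Return the symbolic e_type of an ELF file given at least its first 18 bytes."""
    if header[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    # e_ident[5] (EI_DATA) is 1 for little-endian, 2 for big-endian
    endian = "<" if header[5] == 1 else ">"
    e_type, = struct.unpack_from(endian + "H", header, 16)
    return ELF_TYPES.get(e_type, f"unknown ({e_type})")
```

An `-r` link is expected to report ET_REL here, while a normal static link reports ET_EXEC.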

stsp avatar Mar 11 '21 18:03 stsp

Basically I think partial linking doesn't resolve any relocations at all. Which is then not what we need.

stsp avatar Mar 11 '21 18:03 stsp

until nasm really gets segelf support

What are the prerequisites for segrelative? I remember that HPA wanted something from ia16-binutils for the full thing, but just for segrelative I don't suppose anything special is needed? Or is it exactly that nasm can only emit R_386_SEG16 and it needs binutils to "translate" that to R_386_SEGRELATIVE? Why can't it just emit R_386_SEGRELATIVE?

stsp avatar Mar 11 '21 20:03 stsp

Hello @stsp,

I tried partial linking: it produces hundreds of relocations besides TGROUP. Not sure why that is the case.

Oops. It seems -r is much more conservative than I thought.

I remember that HPA wanted something from ia16-binutils for a full thing, but just for segrelative I don't suppose anything special is needed?

I guess you will need to clarify with Mr. Anvin about that --- as he is the maintainer of nasm.

Thank you!

tkchia avatar Mar 12 '21 13:03 tkchia

But the idea is not too bad. It seems I can arrange the link script for solib, which does essentially what you proposed.

stsp avatar Mar 12 '21 13:03 stsp

Hello @stsp,

It seems I can arrange the link script for solib, which does essentially what you proposed.

Thank you --- this is interesting. Does it actually produce relocations in the output file? What option(s) did you use? I think I also tried -shared instead of -r, but for some reason I do not see any relocations for TGROUP etc.

Thank you!

tkchia avatar Mar 12 '21 13:03 tkchia

No relocations - just an undefined symbol. I am not sure why that is so, but I think it doesn't matter for my needs. Perhaps an undefined symbol and the COPY relocation are the same thing? I really don't know the difference.

stsp avatar Mar 12 '21 13:03 stsp

Hello @stsp,

It is probably best to check that the output fdppkrnl.elf file does actually have the needed relocations --- readelf -r or readelf -d should give something.
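A rough scripted equivalent of the `readelf -r` check, as a sketch only: it assumes a little-endian ELF32 file and simply counts entries in SHT_REL/SHT_RELA sections using the standard header layouts. The function name `count_relocations` is hypothetical.

```python
import struct

SHT_RELA, SHT_REL = 4, 9  # section types that hold relocation entries

def count_relocations(elf: bytes) -> int:
    """Count entries in all relocation sections of a little-endian ELF32 image."""
    if elf[:4] != b"\x7fELF" or elf[4] != 1 or elf[5] != 1:
        raise ValueError("not a little-endian ELF32 file")
    e_shoff, = struct.unpack_from("<I", elf, 32)              # section header table offset
    e_shentsize, e_shnum = struct.unpack_from("<HH", elf, 46)  # entry size and count
    total = 0
    for i in range(e_shnum):
        # Elf32_Shdr is ten 32-bit words: name, type, flags, addr, offset,
        # size, link, info, addralign, entsize
        sh = struct.unpack_from("<10I", elf, e_shoff + i * e_shentsize)
        sh_type, sh_size, sh_entsize = sh[1], sh[5], sh[9]
        if sh_type in (SHT_REL, SHT_RELA) and sh_entsize:
            total += sh_size // sh_entsize
    return total
```

A result of 0 for fdppkrnl.elf would match the "no relocations" observation; it does not say whether the linker dropped them or the linker script discarded the sections.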

At least on my setup, it seems that the linker is just silently dropping the undefined symbols (and effectively giving them a value of 0), i.e. it is not working.

Also, I think it is best that you clarify with Mr. Anvin on what feature(s) you would like to see in nasm, since, again, he is the maintainer after all. I cannot speak for him.

Thank you!

tkchia avatar Mar 12 '21 13:03 tkchia

Hello @stsp,

Also, R_386_COPY is described here. It really means something very different from a good old R_386_16.

Thank you!

tkchia avatar Mar 12 '21 13:03 tkchia

It is probably best to check that the output fdppkrnl.elf file does actually have the needed relocations --- readelf -r or readelf -d should give something.

That unfortunately doesn't give anything, I already checked.

At least on my setup, it seems that the linker is just silently dropping the undefined symbols (and effectively giving them a value of 0), i.e. it is not working.

That's because (in case you tried on fdpp) the dynamic sections are dropped by the script. I now have a linker script suitable for solib, but it's not yet pushed. I will remove DISCARD completely, but special care needs to be taken for some sections to go before the PSP (which is what the linker does by default, so some adjustments were needed).

Also, I think it is best that you clarify with Mr. Anvin on what feature(s) you would like to see in nasm

He didn't reply on the forum. Of course it would be good to get segrelative working, but I think I can get very similar results with solib.

stsp avatar Mar 12 '21 14:03 stsp

OK, I pushed what I currently have, into a separate branch. You can see that solib is created properly, but writing a dynamic loader is a big task. :(

stsp avatar Mar 12 '21 14:03 stsp

Does anyone have a simple dlopen() example to steal the code from? :) I am not going to write a dynamic linker...

stsp avatar Mar 12 '21 14:03 stsp

My current theory of what's going on, is this:

  • No matter whether we link an exec or a solib, we will not see the relocations there. That's simply because R_386_16 (among others) is a link-time reloc, not a run-time one. So the linker will resolve them all, no matter what. In the case of a solib, it just produces the list of undefined symbols, and in the case of an exec that would result in an error.
  • If nasm emits some run-time reloc like R_386_SEGRELATIVE, then we will see it in exec elf. But not in solib: for solib we will simply get the "text relocation found, please recompile with -fPIC" error because solib would probably only support the GOT relocs.

So if we now go for solib and write a dynamic linker (who will?), then switching to segrelative will be difficult once it is there, as we will have to abandon solib and the dynamic linker altogether.

stsp avatar Mar 12 '21 15:03 stsp

I committed the linker script arrangements alone so that you can play around with solib now and it will work: you can boot dosemu from it, or you can make the DGROUP symbols undefined. But not both...

stsp avatar Mar 12 '21 15:03 stsp

And also I think the solib approach is actually broken. IIRC normally text relocations to undefined symbols are bound to local bss. Later you have a COPY reloc to move the global symbol from solib to bss, but the text itself is never "patched" when undefined symbols are resolved. So I suspect there is a bug in GNU ld: it silently resolves the text relocs to undefined symbols, writing zero there. But lld says this:

ld.lld: error: can't create dynamic relocation R_386_16 against local symbol in readonly segment; recompile object files with -fPIC or pass '-Wl,-z,notext' to allow text relocations in the output

And if you pass -z notext then it resolves no relocations at all, like ld -r does.

So... it seems the ability to create solib that way is just a GNU ld bug.

stsp avatar Mar 12 '21 16:03 stsp

I think it's a bug, because of 2 things:

  • It doesn't even warn about undefined global symbols in the text segment.
  • It doesn't seem to propagate the symbol size to symtab: I always see 0 for undefined symbols. Maybe it can't because we need not the size of the symbol but rather the size of the reference to symbol (which is 16 bits when the reloc was R_386_16). So maybe the size is stored somewhere else, I just need to find it in another section...

The correct lld error is this:

ld.lld: error: can't create dynamic relocation R_386_16 against symbol: IGROUP in readonly segment;
recompile object files with -fPIC or pass '-Wl,-z,notext' to allow text relocations in the output

GNU ld doesn't emit one, but if it stores the original reloc size somewhere, then at least this can work.
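To illustrate where the width lives, here is a sketch of the two standard ELF32 records involved, assuming little-endian layout. A symbol table entry (Elf32_Sym) only carries the size of the object, while the width of the reference is implied by the relocation type in Elf32_Rel's `r_info`, so once the relocation entry is gone the symtab alone cannot tell a 16-bit reference from a 32-bit one. The helper names are hypothetical.

```python
import struct

SHN_UNDEF = 0  # section index used for undefined symbols

def parse_sym(entry: bytes) -> dict:
    """Decode one 16-byte Elf32_Sym record."""
    st_name, st_value, st_size, st_info, st_other, st_shndx = struct.unpack("<IIIBBH", entry)
    return {
        "value": st_value,
        "size": st_size,              # size of the object, not of any reference to it
        "bind": st_info >> 4,         # 1 is STB_GLOBAL
        "type": st_info & 0xF,        # 0 is STT_NOTYPE
        "undefined": st_shndx == SHN_UNDEF,
    }

def parse_rel(entry: bytes) -> dict:
    """Decode one 8-byte Elf32_Rel record."""
    r_offset, r_info = struct.unpack("<II", entry)
    # The relocation type (e.g. 20 for R_386_16) is what implies the field width.
    return {"offset": r_offset, "sym": r_info >> 8, "type": r_info & 0xFF}
```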

stsp avatar Mar 12 '21 16:03 stsp

And if I link the x86_64 solib, then I (correctly) get this for all relocs smaller than the native word size:

ld: kernel.o: relocation R_X86_64_16 against `INIT_TEXT' can not be used when making a shared object;
recompile with -fPIC

So I really think it can't store the reloc size anywhere. And what it does on i386 arch is unclear.

stsp avatar Mar 12 '21 16:03 stsp

https://reviews.llvm.org/D63121 This seems to confirm my theory that you can't leave the "small" relocations to run-time in the form of undefined symbols. At least not on x86_64. Why GNU ld allows that for i386 is something to find out later.

stsp avatar Mar 12 '21 17:03 stsp

fdpp: we'll find a work-around everywhere. :) Instead of implementing the dynamic linker (that would depend on GNU ld misbehaviour and all that), I did a simple trick: I link the kernel twice with different segments, then compare the resulting binaries and create a "relocation table" of some sort. A trivial hack saves the world again. :)
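A minimal sketch of that double-link trick, not the actual fdpp build code. It assumes both links produce flat images of equal size, that every difference is a 16-bit little-endian segment word, and that the two segment values differ in their low byte (so the first differing byte is always the low byte of a relocated word). The names `build_reloc_table` and `apply_relocs` are hypothetical.

```python
import struct

def build_reloc_table(image_a: bytes, image_b: bytes, seg_a: int, seg_b: int) -> list:
    """Return offsets of 16-bit words that differ by exactly (seg_b - seg_a)."""
    if len(image_a) != len(image_b):
        raise ValueError("images must have identical size")
    delta = (seg_b - seg_a) & 0xFFFF
    if delta & 0xFF == 0:
        raise ValueError("segments must differ in their low byte for this scan")
    table, i, n = [], 0, len(image_a)
    while i < n:
        if image_a[i] == image_b[i]:
            i += 1
            continue
        if i + 2 > n:
            raise ValueError(f"truncated difference at offset {i:#x}")
        wa, = struct.unpack_from("<H", image_a, i)
        wb, = struct.unpack_from("<H", image_b, i)
        if (wb - wa) & 0xFFFF != delta:
            raise ValueError(f"unexpected difference at offset {i:#x}")
        table.append(i)
        i += 2  # skip the high byte of the word just recorded
    return table

def apply_relocs(image: bytearray, table: list, old_seg: int, new_seg: int) -> None:
    """Rebase every recorded word from old_seg to new_seg, in place."""
    for off in table:
        word, = struct.unpack_from("<H", image, off)
        struct.pack_into("<H", image, off, (word + new_seg - old_seg) & 0xFFFF)
```

The table built this way can then be applied at load time to move the image to whatever segment the loader picks.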

stsp avatar Mar 14 '21 22:03 stsp

Hi @hpax how difficult would it be to support R_386_SEGRELATIVE in nasm? Currently I have to create my own relocation table by linking the object twice and diffing the resulting binaries. Which is not the best hack in the world.

stsp avatar Mar 23 '21 13:03 stsp

Hi @hpax So I have downloaded your segelf binutils fork and your WIP nasm elf16 branch. It all works! I have ported my code to your elf16 branch without a single issue! The patches are referenced in this ticket. So... what's the deal? Why not upstream the code, if it all already works?

Note that, contrary to what is advertised, the ia16-elf-ld just segfaults on these new relocs. While your code works perfectly, even if you never said it should. :)

stsp avatar Oct 12 '23 19:10 stsp

Maybe let @tkchia know?

andrewbird avatar Oct 12 '23 20:10 andrewbird

It doesn't matter since I am not going to use ia16-elf-ld, and also nasm is the most important piece of the puzzle anyway.

If nothing else, I can make the nasm and binutils forks submodules of fdpp... Though it will take quite a while to build it then. :) And overall that would be a maintenance nightmare.

stsp avatar Oct 12 '23 20:10 stsp

I am not going to use ia16-elf-ld

... nor ld. We need this in llvm's lld, sigh. Another 10 years to wait. :(

stsp avatar Oct 12 '23 20:10 stsp