static linkers (lld and GNU ld) out of sync with aaelf64 for GOT relocations with addends.

Open smithp35 opened this issue 1 year ago • 2 comments

Raised due to LLVM issue https://github.com/llvm/llvm-project/issues/63418

Raising this as an ABI issue rather than as two separate toolchain issues, as it may be simpler to amend the ABI than to try to fix the tools.

When a :got: or :got_lo12: expression refers to a local symbol, the GNU and LLVM assemblers convert it to a GOT-generating relocation against the section symbol + addend.

For example:

        .global main
        .type   main, %function
main:
        adrp    x0, :got:x
        ldr     x0, [x0, #:got_lo12:x]
        adrp    x1, :got:y
        ldr     x1, [x1, #:got_lo12:y]
        adrp    x2, :got:z
        ldr     x2, [x2, #:got_lo12:z]
        add     x0, x1, x1
        add     x0, x0, x2
        ret
        .size   main, .-main
        .bss
        .align  3
        .size   x, 4
        .size   y, 8
        .size   z, 8
x:
        .zero   4
y:
        .zero 8
z:
        .zero 8

has the following relocations.

Relocation section '.rela.text' at offset 0x128 contains 6 entries:
    Offset             Info             Type               Symbol's Value  Symbol's Name + Addend
0000000000000000  0000000500000137 R_AARCH64_ADR_GOT_PAGE 0000000000000000 .bss + 0
0000000000000004  0000000500000138 R_AARCH64_LD64_GOT_LO12_NC 0000000000000000 .bss + 0
0000000000000008  0000000500000137 R_AARCH64_ADR_GOT_PAGE 0000000000000000 .bss + 4
000000000000000c  0000000500000138 R_AARCH64_LD64_GOT_LO12_NC 0000000000000000 .bss + 4
0000000000000010  0000000500000137 R_AARCH64_ADR_GOT_PAGE 0000000000000000 .bss + c
0000000000000014  0000000500000138 R_AARCH64_LD64_GOT_LO12_NC 0000000000000000 .bss + c
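
To make the listing easier to read, here is a minimal Python sketch that splits each Info value into its symbol index and relocation type, the way the ELF64_R_SYM and ELF64_R_TYPE macros do. The helper name decode_r_info is made up for illustration; the Info and addend values are copied from the listing above.

```python
# Relocation type numbers from aaelf64.
R_AARCH64_ADR_GOT_PAGE = 311        # 0x137
R_AARCH64_LD64_GOT_LO12_NC = 312    # 0x138

def decode_r_info(r_info):
    """Return (symbol_index, relocation_type) for an Elf64_Rela r_info value."""
    return r_info >> 32, r_info & 0xFFFFFFFF

# (Info, addend) pairs copied from the readelf output above.
entries = [
    (0x0000000500000137, 0x0),
    (0x0000000500000138, 0x0),
    (0x0000000500000137, 0x4),
    (0x0000000500000138, 0x4),
    (0x0000000500000137, 0xC),
    (0x0000000500000138, 0xC),
]

# Each decoded entry is (symbol_index, relocation_type, addend).
decoded = [decode_r_info(info) + (addend,) for info, addend in entries]

for sym, rtype, addend in decoded:
    print(f"sym={sym} type={rtype:#x} addend={addend:#x}")
```

All six relocations use the same symbol index (the .bss section symbol), so x, y and z are distinguished only by the addends 0, 4 and 0xc.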

The ABI (https://github.com/ARM-software/abi-aa/blob/main/aaelf64/aaelf64.rst#576static-aarch64-relocations; search for "GOT-relative instruction relocations") defines these to be:

  • Page(G(GDAT(S+A)))-Page(P)
  • G(GDAT(S+A))

With

GDAT(S+A) represents a pointer-sized entry in the GOT for address S+A. The entry will be relocated at run time with relocation R_<CLS>_GLOB_DAT(S+A).
G(expr) is the address of the GOT entry for the expression expr.
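
As a rough illustration of what these definitions imply, the Python sketch below models a toy GOT that hands out one 8-byte slot per distinct S+A value. The class name Got, the helper names, and all addresses (BSS, GOT_BASE, P) are assumptions chosen for the example, not values from any real link.

```python
PAGE_MASK = ~0xFFF  # Page(x) clears the low 12 bits

def page(addr):
    return addr & PAGE_MASK

class Got:
    """Toy GOT: one 8-byte slot per distinct target address (hypothetical model)."""
    def __init__(self, base):
        self.base = base
        self.slots = {}  # target address (S+A) -> address of its GOT slot

    def g_gdat(self, target):
        """Return G(GDAT(target)), allocating a slot on first use."""
        if target not in self.slots:
            self.slots[target] = self.base + 8 * len(self.slots)
        return self.slots[target]

def adr_got_page(got, s, a, p):
    # Page(G(GDAT(S+A))) - Page(P)
    return page(got.g_gdat(s + a)) - page(p)

def ld64_got_lo12_nc(got, s, a):
    # G(GDAT(S+A)); only the low 12 bits end up in the LDR immediate
    return got.g_gdat(s + a)

# Hypothetical addresses for illustration only.
BSS, GOT_BASE, P = 0x20000, 0x30000, 0x10000

got = Got(GOT_BASE)
slot_addrs = [ld64_got_lo12_nc(got, BSS, a) for a in (0x0, 0x4, 0xC)]
page_delta = adr_got_page(got, BSS, 0xC, P)
```

Under this reading, the example above needs three GOT entries, holding the addresses of x, y and z (.bss + 0, .bss + 4, .bss + 0xc), even though all three relocations name the same section symbol.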

GNU ld and lld both mishandle this. LLD uses something like:

  • Page(G(GDAT(S)) + A) - Page(P)
  • G(GDAT(S)) + A

GNU ld.bfd appears to ignore the addend A completely.
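
To show how the three interpretations diverge for the example, here is a small Python sketch that computes the address each LDR would load from. The addresses BSS and GOT_BASE are hypothetical, and the "lld-like" and "GNU-ld-like" formulas are only my reading of the behaviour described in this issue, not taken from either linker's source.

```python
# Hypothetical addresses for illustration only.
BSS = 0x20000       # S: address of the .bss section symbol
GOT_BASE = 0x30000  # address of the first GOT slot used here
ADDENDS = [0x0, 0x4, 0xC]  # addends for x, y, z from the relocation listing

def abi_slots(addends):
    """ABI reading: G(GDAT(S+A)), one slot per distinct S+A."""
    slots = {}
    result = []
    for a in addends:
        target = BSS + a
        if target not in slots:
            slots[target] = GOT_BASE + 8 * len(slots)
        result.append(slots[target])
    return result

def lld_like_slots(addends):
    """Reading of the lld behaviour: G(GDAT(S)) + A, a single slot for S."""
    return [GOT_BASE + a for a in addends]

def gnu_like_slots(addends):
    """Reading of the GNU ld behaviour: addend ignored, i.e. G(GDAT(S))."""
    return [GOT_BASE for _ in addends]

abi = abi_slots(ADDENDS)
lld = lld_like_slots(ADDENDS)
gnu = gnu_like_slots(ADDENDS)

for name, a, l, g in zip("xyz", abi, lld, gnu):
    print(f"{name}: abi={a:#x} lld-like={l:#x} gnu-like={g:#x}")
```

In this model the lld-like result for y and z is an address offset from the single slot for S (and not 8-byte aligned), so it does not name an entry holding S+A, while the GNU-ld-like result makes all three loads return the address of x.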

We could say that this is a pair of toolchain bugs. However, for LLD in particular, some work would need to be done to generate separate GOT slots for S+A rather than just S. This might prove difficult to get accepted for a corner case that applies only to AArch64. While I don't know for certain, I expect GNU ld will have a similar problem.

It may be worth altering the ABI to not permit addends, or to define them in a way that GNU ld and LLD can both implement. We could then alter the assemblers to not use the section symbol + addend form for a local symbol.

If we determine that the ABI is correct we can raise toolchain issues and close this as won't fix.

smithp35, Aug 22 '23 13:08