static linkers (lld and GNU ld) out of sync with aaelf64 for GOT relocations with addends.
Raised due to LLVM issue https://github.com/llvm/llvm-project/issues/63418
Raising this as an ABI issue rather than as two separate toolchain issues, as it may be simpler to amend the ABI than to try to fix the tools.
When doing a :got: and :got_lo12: expression to a local symbol, the GNU and LLVM assemblers convert this to a GOT generating relocation to the section symbol + addend.
For example:
.global main
.type main, %function
main:
adrp x0, :got:x
ldr x0, [x0, #:got_lo12:x]
adrp x1, :got:y
ldr x1, [x1, #:got_lo12:y]
adrp x2, :got:z
ldr x2, [x2, #:got_lo12:z]
add x0, x1, x1
add x0, x0, x2
ret
.size main, .-main
.bss
.align 3
.size x, 4
.size y, 8
.size z, 8
x:
.zero 4
y:
.zero 8
z:
.zero 8
This has the following relocations:
Relocation section '.rela.text' at offset 0x128 contains 6 entries:
Offset Info Type Symbol's Value Symbol's Name + Addend
0000000000000000 0000000500000137 R_AARCH64_ADR_GOT_PAGE 0000000000000000 .bss + 0
0000000000000004 0000000500000138 R_AARCH64_LD64_GOT_LO12_NC 0000000000000000 .bss + 0
0000000000000008 0000000500000137 R_AARCH64_ADR_GOT_PAGE 0000000000000000 .bss + 4
000000000000000c 0000000500000138 R_AARCH64_LD64_GOT_LO12_NC 0000000000000000 .bss + 4
0000000000000010 0000000500000137 R_AARCH64_ADR_GOT_PAGE 0000000000000000 .bss + c
0000000000000014 0000000500000138 R_AARCH64_LD64_GOT_LO12_NC 0000000000000000 .bss + c
The ABI (https://github.com/ARM-software/abi-aa/blob/main/aaelf64/aaelf64.rst#576static-aarch64-relocations, search for "GOT-relative instruction relocations") defines the operations for these two relocations as:
- Page(G(GDAT(S+A)))-Page(P)
- G(GDAT(S+A))
With
GDAT(S+A) represents a pointer-sized entry in the GOT for address S+A. The entry will be relocated at run time with relocation R_<CLS>_GLOB_DAT(S+A).
G(expr) is the address of the GOT entry for the expression expr.
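To make the two operations concrete, here is a minimal Python sketch of the ABI wording, assuming a toy GOT that allocates one 8-byte slot per distinct S+A value. The `GotTable` class, the helper names, and every address are hypothetical and not taken from any linker.

```python
PAGE_MASK = ~0xFFF  # ADRP addresses 4 KiB pages


def page(addr):
    """Page(expr): the address with its low 12 bits cleared."""
    return addr & PAGE_MASK


class GotTable:
    """Toy GOT (hypothetical model): one 8-byte slot per distinct target."""

    def __init__(self, base):
        self.base = base
        self.slots = {}  # target address -> address of its GOT slot

    def entry_for(self, target):
        """G(GDAT(target)): address of the GOT slot that holds `target`."""
        if target not in self.slots:
            self.slots[target] = self.base + 8 * len(self.slots)
        return self.slots[target]


def adr_got_page(got, s, a, p):
    """R_AARCH64_ADR_GOT_PAGE as written: Page(G(GDAT(S+A))) - Page(P)."""
    return page(got.entry_for(s + a)) - page(p)


def ld64_got_lo12_nc(got, s, a):
    """R_AARCH64_LD64_GOT_LO12_NC: G(GDAT(S+A)), keeping the low 12 bits."""
    return got.entry_for(s + a) & 0xFFF
```

Under this reading, the three relocations against .bss + 0, .bss + 4 and .bss + c from the example each get their own GOT slot.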
GNU ld and lld are mishandling this with LLD using something like:
- Page(G(GDAT(S)) + A) - Page(P)
- G(GDAT(S)) + A
GNU ld.bfd appears to ignore the addend A completely.
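The difference between the three behaviours can be illustrated with a small Python sketch applied to the (S, A) pairs from the relocation dump. This is only a model of the descriptions above ("something like", "appearing to"), not the linkers' actual code; `GOT_BASE`, `BSS` and `got_slot_addresses` are hypothetical.

```python
GOT_BASE = 0x20000  # hypothetical address of the GOT
BSS = 0x30000       # hypothetical address of the .bss section symbol


def got_slot_addresses(relocs, key):
    """Assign 8-byte GOT slots in first-use order, one per distinct key(S, A)."""
    slots = {}
    for s, a in relocs:
        slots.setdefault(key(s, a), GOT_BASE + 8 * len(slots))
    return slots


# (S, A) pairs from the .rela.text dump: .bss + 0, .bss + 4, .bss + c
relocs = [(BSS, 0), (BSS, 4), (BSS, 0xC)]

# ABI as written: one slot per S+A; the relocation resolves to that slot.
abi_slots = got_slot_addresses(relocs, lambda s, a: s + a)
abi = [abi_slots[s + a] for s, a in relocs]

# lld-like: one slot per S, then A is added to the slot's address.
sym_slots = got_slot_addresses(relocs, lambda s, a: s)
lld_like = [sym_slots[s] + a for s, a in relocs]

# GNU ld-like: one slot per S, A ignored.
bfd_like = [sym_slots[s] for s, a in relocs]

for name, values in (("ABI", abi), ("lld-like", lld_like), ("GNU ld-like", bfd_like)):
    print(name, [hex(v) for v in values])
```

In this model only the ABI reading yields three distinct slots for x, y and z. The lld-like form points 4 and 12 bytes past the single slot for .bss, and the GNU ld-like form resolves all three loads to that same slot.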
We could say that this is a pair of toolchain bugs. However, for LLD in particular, some work would be needed to generate separate GOT slots for S+A rather than just S. This might prove difficult to get accepted for a corner case that only affects AArch64. While I don't know for certain, I expect GNU ld will have a similar problem.
It may be worth altering the ABI to not permit addends, or have them implemented in a way that GNU ld and LLD can both implement. We could then alter the assembler to not use symbol + addend forms for a local symbol.
If we determine that the ABI is correct we can raise toolchain issues and close this as won't fix.
> When doing a :got: and :got_lo12: expression to a local symbol, the GNU and LLVM assemblers convert this to a GOT generating relocation to the section symbol + addend.
I consider these to be assembler issues specific to AArch64. For most other targets, the GNU assembler and the LLVM integrated assembler suppress the local symbol to STT_SECTION conversion for GOT relocations.
This has likely gone unnoticed because compilers don't emit such constructs. https://github.com/llvm/llvm-project/issues/63418 was found through a use in inline assembly.
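The conversion in question can be sketched as a small Python model. It is purely illustrative and does not reflect the structure of either assembler; `emit_reloc`, `suppress_for_got` and the symbol-table layout are hypothetical names, while the relocation type names are the AArch64 ones.

```python
GOT_RELOCS = {"R_AARCH64_ADR_GOT_PAGE", "R_AARCH64_LD64_GOT_LO12_NC"}


def emit_reloc(rtype, symbol, addend, sym_table, suppress_for_got):
    """Return the (symbol, addend) pair that would be written to .rela.text.

    sym_table maps a symbol name to (section, offset_in_section, is_local).
    """
    section, offset, is_local = sym_table[symbol]
    # Local symbols are normally rewritten to "section symbol + offset".
    # With suppress_for_got set, GOT-generating relocations keep the symbol.
    convert = is_local and not (suppress_for_got and rtype in GOT_RELOCS)
    if convert:
        return (section, offset + addend)
    return (symbol, addend)


# Local symbols from the example: x, y, z at .bss + 0, + 4, + 0xc
syms = {"x": (".bss", 0, True), "y": (".bss", 4, True), "z": (".bss", 0xC, True)}

# Behaviour reported in this issue: converted to .bss + 4
print(emit_reloc("R_AARCH64_ADR_GOT_PAGE", "y", 0, syms, suppress_for_got=False))
# With the conversion suppressed for GOT relocations: stays y + 0
print(emit_reloc("R_AARCH64_ADR_GOT_PAGE", "y", 0, syms, suppress_for_got=True))
```

With the conversion suppressed, every GOT relocation in the example carries a zero addend, so GDAT(S) and GDAT(S+A) select the same slot.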
> GNU ld and lld are mishandling this with LLD using something like:
Agreed that GNU ld and lld don't comply with the ABI when generating GOT entries. They create GOT entries based on the symbol alone, ignoring the addend.
If we fix GNU assembler and LLVM integrated assembler to suppress STT_SECTION conversion (https://reviews.llvm.org/D158577 and https://sourceware.org/bugzilla/show_bug.cgi?id=30788), I believe this ABI change (GDAT(S+A) vs GDAT(S)) is moot, as all reasonable use cases will be unaffected.
ldr x1, [x1, :got_lo12:x]
ldr x1, [x1, :got_lo12:y]
// affected by this change but I do not recommend this use case.
// x86-64 maintainer considers a similar case reasonable: https://sourceware.org/bugzilla/show_bug.cgi?id=26939
ldr x1, [x1, :got_lo12:(x+8)]
// affected by this change but I do not recommend this use case.
ldr x1, [x1, :got_lo12:(.data+0)]
ldr x1, [x1, :got_lo12:(.data+8)]
.data
x:
.zero 4
y:
.zero 8
> It may be worth altering the ABI to not permit addends, or have them implemented in a way that GNU ld and LLD can both implement. We could then alter the assembler to not use symbol + addend forms for a local symbol.
> If we determine that the ABI is correct we can raise toolchain issues and close this as won't fix.
Switching from GDAT(S) to GDAT(S+A) would indeed be disruptive for lld, so I hope that we use GDAT(S).
If we change assemblers as mentioned, whether or not the ABI requires GDAT(S) or GDAT(S+A) is largely moot for lld.
I've submitted https://github.com/ARM-software/abi-aa/pull/272 which removes A from the GDAT(S + A) relocation operation.