New static relocation type: R_AARCH64_INST32
R_AARCH64_INST32 is a static relocation type that will be used to implement deactivation symbols. The proposed semantics shall be as follows:
If S is a defined symbol, check that -2^31 <= X < 2^32 and write bits [31:0] of X at byte-aligned place P. If S is not defined, P is unmodified.
A proposed implementation in LLD is available here.
I am also proposing that the new relocation type always writes a little-endian value regardless of the target endianness, because A64 instructions are always little-endian.
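As a rough illustration of the proposed semantics (not the LLD implementation), the sketch below applies the relocation to a byte buffer. It assumes X = S + A, as for other AArch64 static relocations; the function name `apply_inst32` and the use of `None` to model an undefined symbol are made up for this example.

```python
import struct


def apply_inst32(section, offset, sym_value, addend=0):
    """Apply a hypothetical R_AARCH64_INST32 at section[offset].

    sym_value is the symbol value S, or None if S is undefined.
    Returns True if the place was written, False if it was left alone.
    """
    if sym_value is None:
        # Undefined symbol: the place P is left unmodified.
        return False
    x = sym_value + addend  # assumption: X = S + A
    if not (-2**31 <= x < 2**32):
        raise ValueError("R_AARCH64_INST32 value out of range")
    # Always little-endian, regardless of target data endianness.
    struct.pack_into("<I", section, offset, x & 0xFFFFFFFF)
    return True


# Usage: replace a BL placeholder with a NOP (encoding 0xD503201F).
code = bytearray(struct.pack("<I", 0x94000000))
apply_inst32(code, 0, 0xD503201F)
```

With the symbol undefined (`None`), the same call leaves the BL in place.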
Thanks for raising this. I've only skimmed the proposals on the LLVM Discourse so far and will need to read them in more detail to make sure I understand them. I think https://discourse.llvm.org/t/rfc-deactivation-symbols/85556 is the best place for me to comment on the non-AArch64-specific parts.
My initial reaction, will post on Discourse after reading in more detail:
- If the relocation is an instruction rather than data relocation, the little-endianness should drop out from that.
- Ideally this relocation would apply only to instructions without static relocations. Otherwise there would need to be restrictions on code-generators about ordering of relocations (currently a linker can't assume that relocations are in order of ascending r_offset, even if in practice they are). ELF also has some strange rules for composition of relocations at the same r_offset (although I've not seen any linker implement them) https://www.sco.com/developers/gabi/latest/ch4.reloc.html "multiple consecutive relocations"
- To avoid name clashes with user defined symbols, for compiler generated use cases we should standardise the symbol name.
- The relocation is more powerful/flexible than it needs to be for the use case. For example if it is to be used to replace an instruction with a NOP, then at least for AArch64 we don't need to use the symbol value. It could just be symbol defined then write NOP, otherwise ignore. Just thinking that someone is bound to (ab)use this in assembly and run into trouble if they do something the tools aren't expecting.
> If the relocation is an instruction rather than data relocation, the little-endianness should drop out from that.
Ack
> Ideally this relocation would apply only to instructions without static relocations. Otherwise there would need to be restrictions on code-generators about ordering of relocations (currently a linker can't assume that relocations are in order of ascending r_offset, even if in practice they are). ELF also has some strange rules for composition of relocations at the same r_offset (although I've not seen any linker implement them) https://www.sco.com/developers/gabi/latest/ch4.reloc.html "multiple consecutive relocations"
This does need to appear on instructions with a static relocation, namely a BL instruction that calls the Emulated PAC runtime function with a CALL26 relocation. I did need to ensure that INST32 appears last, otherwise the CALL26 would overwrite the NOP with a BL.
I note the following from the gABI:
> An ABI processor supplement may specify individual relocation types that always stop a composition sequence, or always start a new one.

Since the de facto state of the world is that all relocations are applied independently (in the order that they appear), I think we could specify that all relocation types always start and stop composition sequences.
Alternatively if we wanted to do something more targeted we could say that INST32 ignores the addend and then the composition rules don't matter as long as INST32 appears last.
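To make the ordering concern concrete, here is a small sketch of two relocations at the same r_offset being applied independently in the order they appear. The helper names and the `link` function are hypothetical; the CALL26 handler is assumed to patch only the imm26 field of whatever instruction is at the place, and INST32 is assumed to ignore the addend.

```python
import struct

NOP = 0xD503201F  # A64 NOP
BL = 0x94000000   # A64 BL with a zero immediate


def read32(buf, off):
    return struct.unpack_from("<I", buf, off)[0]


def write32(buf, off, value):
    struct.pack_into("<I", buf, off, value & 0xFFFFFFFF)


def apply_call26(buf, off, place, target):
    # Assumed behaviour: set the imm26 field to bits [27:2] of (S + A - P).
    x = target - place
    insn = read32(buf, off)
    write32(buf, off, (insn & 0xFC000000) | ((x >> 2) & 0x03FFFFFF))


def apply_inst32(buf, off, sym_value):
    # Only writes when the (deactivation) symbol is defined.
    if sym_value is not None:
        write32(buf, off, sym_value)


def link(relocs, place=0x1000):
    """Apply relocs, in list order, to a single BL placed at `place`."""
    buf = bytearray(4)
    write32(buf, 0, BL)
    for kind, value in relocs:
        if kind == "CALL26":
            apply_call26(buf, 0, place, value)
        else:
            apply_inst32(buf, 0, value)
    return read32(buf, 0)


good = link([("CALL26", 0x2000), ("INST32", NOP)])  # INST32 last
bad = link([("INST32", NOP), ("CALL26", 0x2000)])   # CALL26 last
```

In this model `good` is the NOP, while `bad` is a NOP with its low 26 bits clobbered by the call offset, which is why INST32 has to come last.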
> To avoid name clashes with user defined symbols, for compiler generated use cases we should standardise the symbol name.
In general, I would expect non-STB_LOCAL compiler generated symbol names to be prefixed with __ or _ followed by a capital letter (i.e. C's rule for reserved identifiers). That goes for not just this feature but e.g. for compiler generated function symbols. So I'm not sure that we need to specify anything regarding symbol names for this feature in particular.
FWIW, for the pointer field protection use case I'm using symbols named __pfp_ds_<Itanium encoded struct name>.<field name>.
> The relocation is more powerful/flexible than it needs to be for the use case. For example if it is to be used to replace an instruction with a NOP, then at least for AArch64 we don't need to use the symbol value. It could just be symbol defined then write NOP, otherwise ignore. Just thinking that someone is bound to (ab)use this in assembly and run into trouble if they do something the tools aren't expecting.
I'm not sure that we need to be so restrictive with this feature. People writing hand-written assembly are, for better or worse, generally assumed to know what they are doing, and incorrect use of this feature is just one of the many ways that they can mess things up. If we restrict to NOP and a future developer has a legitimate non-NOP use case they would need to create yet another relocation type and change all the linkers again and their object files would only be compatible with new linkers, whereas if we allow any instruction there's at least a chance that they'll be able to make their use case work on older linkers.
> Just thinking that someone is bound to (ab)use this in assembly
I was already trying to imagine what else you could use this kind of feature for!
A use case that springs to mind is controlling which thread pointer register a multithreaded libc uses, without having to either recompile it from source or indirect all TP accesses via a replaceable subroutine (incurring a function call overhead).
With the full-powered version of this relocation, you could generate code in which every access to a thread register looks like, say, mrs x5, tpidr_el0 (or whatever register you want to have as the default), plus an INST32 relocation to a symbol name __load_tp_x5. (There'd have to be a family of those symbols, one for each destination register of the mrs.) Then a client application could rewrite all the instructions at once to talk to some other thread register, by defining all those symbols appropriately.
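For illustration, the value a client would give such a symbol is just the 32-bit encoding of the replacement instruction. The sketch below computes MRS encodings from the system register fields; `encode_mrs` and the table of registers are made up for this example, and how the symbol would actually be defined (for instance as an absolute symbol) is an assumption rather than part of the proposal.

```python
def encode_mrs(rt, op0, op1, crn, crm, op2):
    """Encode `mrs x<rt>, <sysreg>` from the register's encoding fields."""
    if not 0 <= rt <= 30:
        raise ValueError("rt must name x0..x30")
    return (0xD5300000
            | ((op0 - 2) << 19)  # o0 bit: op0 is 2 or 3
            | (op1 << 16)
            | (crn << 12)
            | (crm << 8)
            | (op2 << 5)
            | rt)


# (op0, op1, CRn, CRm, op2) for two EL0 thread ID registers.
TPIDR_EL0 = (3, 3, 13, 0, 2)
TPIDRRO_EL0 = (3, 3, 13, 0, 3)

# Default instruction emitted by the compiler: mrs x5, tpidr_el0
default_insn = encode_mrs(5, *TPIDR_EL0)
# Hypothetical value for __load_tp_x5 to switch to tpidrro_el0.
load_tp_x5 = encode_mrs(5, *TPIDRRO_EL0)
```

Leaving `__load_tp_x5` undefined keeps `default_insn` in place; defining it with the value `load_tp_x5` would rewrite every relocated site to read the other register.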
> A use case that springs to mind is controlling which thread pointer register a multithreaded libc uses, without having to either recompile it from source or indirect all TP accesses via a replaceable subroutine (incurring a function call overhead).
My first thought on the relocation name was to suggest R_AARCH64_PATCHINST
My concern about a powerful relocation is that it might end up constraining, or at least making more complex, some linker transformations. For example, the relocation optimization in https://github.com/ARM-software/abi-aa/blob/main/aaelf64/aaelf64.rst#relocation-optimization and, even worse, things like the erratum patch fixes in https://github.com/llvm/llvm-project/blob/main/lld/ELF/AArch64ErrataFix.cpp
Assuming the correctness of the program depended on the R_AARCH64_INST32 relocation, the linker would have to account for the presence of these relocations before making any change. While I'd need to think it through, a narrower relocation with a specific meaning does give the linker more of a chance to safely apply or not apply the transformation.
If we do end up with a powerful relocation, then we'll most likely have to think about what the contract between the person using the relocation and the static linker is. At one extreme it could be "use this relocation at your own risk outside of known use cases like deactivation symbols".
For now I think it is fine to mentally reserve the relocation code used in the patch. When the upstream discussion gets to the point of accepting the RFC and transitioning to reviewing the patches, I can reserve the relocation code and start a PR with the description.
> Assuming the correctness of the program depended on the R_AARCH64_INST32 relocation, the linker would have to account for the presence of these relocations before making any change. While I'd need to think it through, a narrower relocation with a specific meaning does give the linker more of a chance to safely apply or not apply the transformation.
>
> If we do end up with a powerful relocation, then we'll most likely have to think about what the contract between the person using the relocation and the static linker is. At one extreme it could be "use this relocation at your own risk outside of known use cases like deactivation symbols".
Thinking about it more, this makes sense to me. We could make it so that the restriction to known use cases only exists in the spec and linkers would not be expected to enforce it. That way, supporting new use cases would only involve a spec change and existing linkers would likely support them retroactively. If an erratum fix is needed, the fix would only need to account for use cases that are blessed by the spec.
@smithp35 would you like to upload the PR for the psABI changes here? I could do that if not.
I'll make a PR tomorrow, I don't think I'll get it done today.
To follow up on a comment in https://github.com/ARM-software/abi-aa/pull/336 I'm thinking it would be useful to make a new section and table like https://github.com/ARM-software/abi-aa/blob/main/aaelf64/aaelf64.rst#5712relocations-for-pauth-abi-extension with a title like Relocations for structure protection extension. That way I can mark the section as Alpha so we've got scope to make breaking changes, and alleviate any potential concerns from the GNU team about binutils support.
As a PSA I'll be out of the office on holiday for a couple of weeks + 1 day starting on Friday. I'll likely be monitoring email through some of that, but I may be a lot slower to respond.
https://github.com/ARM-software/abi-aa/pull/340 to cover both new relocations. Hopefully managed to capture the discussion so far.
Added in #340 (as R_AARCH64_PATCHINST).