RFC: patchable-function-entry
This RFC proposes to add support for the patchable-function-entry family of flags and attributes from clang and gcc in order to support instrumentation and patching as done in the Linux kernel.
I have a candidate implementation and a MCP to cover just the compiler flag.
How much do you want to allow users to depend on this? I.e. is this a guarantee, or just a hint?
Asking because there might potentially be backends that can't/don't support this and it would be useful to document if they should ignore this, or error.
How much do you want to allow users to depend on this? I.e. is this a guarantee, or just a hint?
Asking because there might potentially be backends that can't/don't support this and it would be useful to document if they should ignore this, or error.
I think that probably the behavior should be to error if the backend doesn't support it and a non-zero padding has been specified. This would allow users to set #[patchable_function_entry()] or #[patchable_function_entry(prefix(0), entry(0))] on a function to indicate it shouldn't have extra nop padding when built with a -C patchable-function-entry, but still have it compile when -C patchable-function-entry is not passed but using a backend that doesn't support it without needing to cfg throughout the code.
Thanks for the suggestion; I've added it to the design.
Seems useful!
One thing I'd mention is that seeing #[patchable_function_entry()] on a function at first glance seems to me like it should be enabling "patchability", but since the defaults are prefix(0), entry(0), #[patchable_function_entry()] without either specified actually either does nothing, or disables "patchability" (depending if the compiler flag is in use) . Thus I might suggest that the attribute should require at least one argument to be passed; users can still explicitly specify #[patchable_function_entry(prefix(0), entry(0))] as you mentioned.
First of all, :+1: for adding this functionality.
Having both the command-line option and the attribute take two numeric parameters, but with different meanings, seems error-prone. I do appreciate that projects might not want to have to compute different numbers in their build systems for rustc, though.
Given that, I do think we should use some syntax that requires the codegen options to specify which parameters they're defining, as suggested in your alternatives section (but without allowing any bare numeric versions), allowing prefix, entry, and total, and requiring that the user specify at most two of those (or just one of prefix and entry if they want the other to be zero). I also think you should avoid having the word "entry" duplicated.
As for the attribute, bikeshedding slightly, I had the same concern as @zachs18 that the attribute with no parameters feels like it should mean patchable rather than unpatchable. I also think it seems awkward to duplicate the word "entry" in patchable_function_entry(entry(...)). And also, based on precedent in other attributes, I think these should be key-value pairs with = rather than using inner parens. Given all that, I think we should spell this:
#[patchable(prefix = N, entry = M)], where at least one ofprefixorentryis required#[unpatchable]as an alias for#[patchable(prefix = 0, entry = 0)]. (Both for simplicity and because I think projects will want an abbreviated version badly enough that they'll end up defining it as a macro if we don't offer one.)
@rfcbot merge
This seems like a reasonable use case to support, so I think we should do it.
I agree with @joshtriplett that the differing meanings between the CLI flag and the attribute will create confusion. We should do something to mitigate that.
@rfcbot concern difference between flag and attribute count meanings is confusing
Team member @tmandry has proposed to merge this. The next step is review by the rest of the tagged team members:
- [x] @Aaron1011
- [ ] @cjgillot
- [x] @compiler-errors
- [x] @davidtwco
- [x] @eddyb
- [x] @estebank
- [x] @jackh726
- [x] @joshtriplett
- [x] @lcnr
- [x] @matthewjasper
- [x] @michaelwoerister
- [x] @nagisa
- [x] @nikomatsakis
- [x] @oli-obk
- [x] @petrochenkov
- [x] @pnkfelix
- [ ] @scottmcm
- [x] @tmandry
- [x] @wesleywiser
Concerns:
- ~~difference between flag and attribute count meanings is confusing~~ resolved by https://github.com/rust-lang/rfcs/pull/3543#issuecomment-1918126510
Once a majority of reviewers approve (and at most 2 approvals are outstanding), this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up!
cc @rust-lang/lang-advisors: FCP proposed for lang, please feel free to register concerns. See this document for info about what commands tagged team members can give me.
@rustbot labels -I-lang-nominated
We discussed this in the T-lang triage meeting on 2023-12-20 and that resulted in the proposed FCP merge and also the concern that divergent meanings between the CLI flag and the attribute may create confusion.
Please renominate if there are questions or progress on addressing that concern that should be discussed by T-lang.
How does this interact with inlining? Isn't this entirely nonsensical if a function gets inlined? I'd a user expected to use #[inline(never)] strategically in the places they care about? Or what's the story here?
Updated the RFC to address some of the concerns mentioned. Main changes:
- Flag now requires explicit parameter naming
MetaListinstead ofNameValue#[patchable]/#[unpatchable]attribute suggestion taken#[patchable]must specify at least one key now.- Inlining is discussed (no change to the proposal on what to do though, other than a brief note in the documentation).
Since prefix and offset are the same, I think we should pick one term and use it in both places. Otherwise I would have to consult the docs every time I wanted to make sure I remembered the relationship between the two ways of specifying patchable entry.
I do agree that #[unpatchable] (or a similar spelling) is better than the normal attribute with no arguments to mean the same thing, though I would also be comfortable with a version that kept one attribute but required specifying explicit zeroes.
I'm not sure how I feel about #[patchable], and wish I'd read @joshtriplett's comment more closely the first time to respond to it. It seems like we are over-rotating somewhat on the ergonomics of a niche feature, and I would prefer to align the naming of the flag and attribute more closely.
One way of framing this is: "What do we call this feature?" "Patchable" seems pretty vague and high-level. It doesn't provide many "memory hooks" as to what it actually does. "Patchable function entry" is much better in that regard. "Patchable entry" is almost as good, given the context of being applied to a function.
On that note, I think the name nop_count provides a lot more of a memory hook than prefix and entry. I propose that something like prefix_nops and entry_nops would be better. In fact, I think those would be so clear that it would completely remove my concern with #[patchable].
Before I update the RFC again, let me put some responses inline to check things.
Since
prefixandoffsetare the same, I think we should pick one term and use it in both places. Otherwise I would have to consult the docs every time I wanted to make sure I remembered the relationship between the two ways of specifying patchable entry.
The logic here at the moment is that it is referred to as the offset in C/C++ toolchains, but offset would make no sense in the prefix/entry world where there is no nop_count.
I do agree that
#[unpatchable](or a similar spelling) is better than the normal attribute with no arguments to mean the same thing, though I would also be comfortable with a version that kept one attribute but required specifying explicit zeroes.I'm not sure how I feel about
#[patchable], and wish I'd read @joshtriplett's comment more closely the first time to respond to it. It seems like we are over-rotating somewhat on the ergonomics of a niche feature, and I would prefer to align the naming of the flag and attribute more closely.
I'm fine naming this either way. I don't think it will be written commonly, so I don't have strong feelings about an ergonomic name.
One way of framing this is: "What do we call this feature?" "Patchable" seems pretty vague and high-level. It doesn't provide many "memory hooks" as to what it actually does. "Patchable function entry" is much better in that regard. "Patchable entry" is almost as good, given the context of being applied to a function.
I called it patchable_function_entry to name it the identical thing as the C/C++ feature. The goal is that when someone searches "fpatchable-function-entry rust" they'll get documentation explaining how to make Rust match up with their existing C/C++ instrumentation.
On that note, I think the name
nop_countprovides a lot more of a memory hook thanprefixandentry. I propose that something likeprefix_nopsandentry_nopswould be better. In fact, I think those would be so clear that it would completely remove my concern with#[patchable]. Again, fine on bikeshedding names to just about anything. My expectation is that the primary workflow for a user of this feature is going to be:
- My C/C++ code uses
-fpatchable-function-entry, and the Rust functions aren't getting built right. - Ah, the flag in Rust is called
-C patchable-function-entry=floozle=m,glorble=n - Oh, I want to put a Rust function into my instrumentation, how do I annotate it? Oh,
#[unpatchable].
I think that even for users for whom this feature is extremely important, they will write each of these keywords somewhere between 1 and 3 times.
I think you're doing a good job of weighing the viewpoint of the user who writes the annotation (or flag), @maurer. I'm considering things more from the perspective of a reader of the code. The code will be read many more times than it is written.
Anyway, to that end, I'll make a concrete proposal from the current state of the RFC:
- Rename
prefixandentryin the attribute toprefix_nopsandentry_nops - Rename
offsetin the CLI flag toprefix_nops
The reason being that these are more self-explanatory names. We lose the property that offset is the exact name used in C/C++, but the fact that nop_count is already the same plus the self-explanatory (IMO) nature of prefix_nops means that it should hopefully be easy to map.
@rfcbot reviewed
I am generally aligned with @tmandry here, and I concur with @wesleywiser that we should try not to make monomorphization behavior hard-coded. I'm signalling as reviewed because I trust the thread to reach reasonable conclusions.
My general preferences are:
- more explicit (and fewer) names = good for niche features
- align with gcc/clang = good, people should be able to copy/paste, at least for the values if not necessarily the names of the fields
I'm of mixed minds about #[unpatchable], it seems mildly better to have a single attribute with explicit zeros or something like that, but I'm trying to put my finger on why. I guess it seems easier to document, grep for, etc. (Another option would be #[patchable_function_entry(no)] or some similar opt-out.)
Re-reading, I see @joshtriplett wrote...
(Both for simplicity and because I think projects will want an abbreviated version badly enough that they'll end up defining it as a macro if we don't offer one.)
That makes sense to me, it'd be useful to know how often "unpatchable" functions arise in practice in existing codebases. I had imagined this being a fairly rare thing.
Re-reading, I see @joshtriplett wrote...
(Both for simplicity and because I think projects will want an abbreviated version badly enough that they'll end up defining it as a macro if we don't offer one.)
That makes sense to me, it'd be useful to know how often "unpatchable" functions arise in practice in existing codebases. I had imagined this being a fairly rare thing.
To give you an idea of frequency, the way to do this in Linux in C is spelled notrace:
#define notrace __attribute__((patchable_function_entry(0, 0)))
notrace occurs 416 times in the latest Linux source (with the tools/ directory filtered out, which references it a number of times as a field name, command name, or in documentation rather than referencing the macro, and spaces added to avoid situations where notrace is included in the name of another symbol - actual query rg --stats ' notrace ' -g '!tools' -q)
We could definitely use a more specific name than unpatchable to avoid polluting the global namespace, but I do think Josh is right w/regards to projects defining one of these for themselves if we don't provide one - Linux already does it in C after all!
That said, I don't think it would be a show-stopper to only provide #[patchable_function_entry()] and require someone to specify both zeros, as users could create a macro if it deeply bothered them, and it's still not that common.
I too am generally a fan of fewer names, but there are realistic downsides to doing this with a macro today. You would have to use an attribute proc macro, which is annoying to implement and makes IDE integration worse, or wrap your function definitions inside of a macro_rules! macro, which is annoying to implement and use and makes IDE integration worse.
Plan for next draft edits, in case someone thinks something's missing or wants to object:
- Define behavior if patchability is specified differently for multiple crates to be a. Functions may end up with the patchability configuration specified by any crate in the compilation graph. b. The compiler may produce an error if it detects multiple patchability configurations in the crate graph.
- Explicitly call out that for some platforms (PPC), only some values of entry/prefix may be legal, and the compiler may error.
entryrenamedentry_nops,prefixrenamedprefix_nops,nop_countrenamed tototal_nops,offsetrenamed toprefix_nops- Docs note explaining the equivalence of
prefix_nopsandtotal_nopstooffsetandnop_countingcc/clang patchablerenamed back topatchable_function_entryunpatchablemoved to future work- Docs updated to suggest
#[patchable_function_entry(entry_nops = 0, prefix_nops = 0)]to disable patchability
I've made the edits I said I'd make in the previous comment.
Notably:
- Applied the latest (hopefully final?) round of name bikeshedding.
- Punted
#[unpatchable]to future work if users of this feature turn out to need it. - Punted compiler behavior around multiple defaults in the crate graph to future work. The compiler is now allowed to throw an error or use any default specified in the graph for any function, which encompasses every behavior we considered being the right choice.
I'm happy with those changes. Thanks @maurer!
@rfcbot resolve difference between flag and attribute count meanings is confusing
What is needed to move this forward?
:bell: This is now entering its final comment period, as per the review above. :bell:
The final comment period, with a disposition to merge, as per the review above, is now complete.
As the automated representative of the governance process, I would like to thank the author for their work and everyone else who contributed.
This will be merged soon.
This RFC has been accepted by the teams and has now been merged. To track further progress on this, follow the tracking issue:
- https://github.com/rust-lang/rust/issues/123115
Thanks to @maurer for authoring this RFC and for pushing it forward and to all those who reviewed it and provided useful feedback.