rfcs icon indicating copy to clipboard operation
rfcs copied to clipboard

RFC: Add additional `inline` intents

Open scottmcm opened this issue 10 months ago • 11 comments

This proposes adding #[inline(trampoline)] and #[inline(rarely)] to hint the compiler additional information about why you're marking the function for inlining, to help it hopefully make better choices without the overly-strong hammers of always or never.

Rendered

scottmcm avatar Feb 23 '25 01:02 scottmcm

I like this but am kind of mentally conflicted about the name "trampoline." I can't tell if it's a term that's already in use, or if it's something that was made up to match the metaphor.

Ultimately would be fine with the name and am not going to block this on bikeshedding, but, kind of wanted to express my thoughts in case someone else feels more strongly and isn't sure where others stand.

clarfonthey avatar Feb 23 '25 06:02 clarfonthey

IIUC after this RFC we would have these 8 inline levels (including the rustc_* ones):

Attribute LLVM function attribute MIR inliner effect
#[rustc_force_inline = "reason"] alwaysinline compile error if cannot inline
#[inline(always)] alwaysinline always inline
#[inline] inlinehint -Zinline-mir-hint-threshold=100
- - -Zinline-mir-threshold=50
#[inline(trampoline)] -? forces caller_is_inline_forwarder?
#[inline(rarely)] -? an even lower threshold?
#[rustc_no_mir_inline] - never inline
#[inline(never)] noinline never inline

kennytm avatar Feb 23 '25 07:02 kennytm

I like this but am kind of mentally conflicted about the name "trampoline." I can't tell if it's a term that's already in use, or if it's something that was made up to match the metaphor.

See https://en.wikipedia.org/wiki/Trampoline_(computing) -- it's used for lots of things. Probably the closest meaning to this one is the calling convention one, where a trampoline is a function with calling convention A that rearranges the stack/registers then calls another function with calling convention B.

If there's a better name, though, I'd be happy to switch it.


@kennytm There might be another if you count implicitly cross-crate-inline as different from #[inline].

We might be able to replace rustc_no_mir_inline with inline(rarely), though -- block it in mir inlining unless all the arguments are Operand::Const, say. That'd be enough for the two cases on my machine in library right now.

(Also, both rustc_no_mir_inlines are also marked #[inline], so I think they're actually LLVM inlinehint.)

scottmcm avatar Feb 23 '25 07:02 scottmcm

Perhaps inline should work like the diagnostic:: namespace, such that an unrecognized intent is a warning & no-op instead of an error?

Jules-Bertholet avatar Feb 23 '25 20:02 Jules-Bertholet

I feel like the In Combination section could do with a code example - I'm finding it hard to track exactly which attribute would go on what function. Also NonZero::new seems to come out of nowhere.

FHTMitchell avatar Feb 24 '25 12:02 FHTMitchell

I find myself thinking about the mention of analysis paralysis as a downside. I'd really like to understand 1) to what extent you expect to see these used commonly in codebases, 2) to what extent these work better than e.g. our existing heuristics for recognizing trampolines, 3) some kind of examples of these helping with codegen or compilation time or similar.

Or, to put this a different way: I think I would not want to approve this without first seeing the results of some kind of compiler experiment showing what would improve if we marked a bunch of functions with this, and weighing that against the very real cost of having multiple new variations of inline for people to consider.

joshtriplett avatar Feb 26 '25 19:02 joshtriplett

The current trampoline detection is extremely simple, as it's focused only on avoiding MIR bloat. It's essentially "if this is only two blocks -- the first being a call that returns to the second -- then this is a trampoline and cannot inline anything that makes that no longer true". That means that it it's not detected even in "simple" cases like "well we assert! something first", or where there's a dropped generic parameter, because that means more blocks which might increase the MIR size.

The place I think inline(trampoline) would be most useful is things like

https://github.com/rust-lang/rust/blob/ac91805f3179fc2225c60e8ccf5a1daa09d43f3d/library/core/src/slice/mod.rs#L3861-L3869

where it would be useful to encourage inlining the assert into the caller, even against the "but that might lead to MIR bloat" heuristic, rather than the normal behaviour of first trying (subject to other heuristics still) to inline ptr::swap_nonoverlapping into [T]::swap_with_slice even if that makes it too big to then inline into callers.

Similarly, https://github.com/rust-lang/rust/pull/136718 was experimenting with making slice::Iter::next be something like if T::IS_ZST { self.next_zst() } else { self.next_not_zst() }, which doesn't really work today because both next_zst and next_not_zst are both small enough that we end up inlining them, but then the combined thing is too big to inline, making it hard to optimize out that IS_ZST check. So it'd be nice to put a inline(trampoline) on that to make the ZST check nearly-certain to inline into the caller, which can then optimize it out and make it obviously good to inline the one next implementation it actually cared about. (TBH, could even try putting inline(discouraged) on the next_zst version because iterating over slices of things certain to be ZSTs is rare enough that it's probably not worth inlining that in generic MIR ever, but it would still be good for LLVM to inline it.)

Could we write a bunch of more complex heuristics as we identify more of these cases? Maybe. But it's be nice to just say what I mean directly in a bunch of them.

I think I would not want to approve this without first seeing the results of some kind of compiler experiment

I'd be happy to take the feedback in this thread as a "tentative interest; please make an experiment and we'll see".

scottmcm avatar Feb 26 '25 20:02 scottmcm

(total novice here)

It seems confusing to be offered one possible reason to encourage inlining and no others. Why is trampoline chosen as a reason that is worth distinguishing and not others? Also, I tend to think that the "trampoline cases" should be the more easy cases for the compiler to decide to inline without hints. But if that were an inline hint blessed by the language, I might feel less inclined to trust the compiler by default. More broadly, I wonder if this feature encourages folks to add inline based on unverified intuition.

Also, I kinda worry about the gray area in defining trampoline - How many lines of code leading up to the "trampoline call" is too many? Some inline hints might not neatly map to a single reason, but this feature kinda suggests that they should.

Just a note that this vaguely reminds me of #[allow(..), reason=".."].

camsteffen avatar Mar 01 '25 15:03 camsteffen

Alternative bikeshed colors for trampoline: wrapper, forwarding, delegate, delegating

jdahlstrom avatar Mar 04 '25 21:03 jdahlstrom

Alternative bikeshed colors for trampoline: wrapper, forwarding, delegate, delegating

Another trampoline bikeshed: #[inline(shallow)]?

edwloef avatar Mar 04 '25 22:03 edwloef

Also, I tend to think that the "trampoline cases" should be the more easy cases for the compiler to decide to inline without hints.

The trick with the trampoline cases is not inlining callees inside the trampoline that might otherwise have been inlined. Not unlike marking the callee as #[cold], but helpful for things that aren't exactly cold.

scottmcm avatar Nov 02 '25 09:11 scottmcm