rfcs icon indicating copy to clipboard operation
rfcs copied to clipboard

Explicit Tail Calls

Open phi-go opened this issue 2 years ago • 231 comments

This RFC proposes a feature to provide a guarantee that function calls are tail-call eliminated via the become keyword. If this guarantee can not be provided an error is generated instead.

Rendered

For reference, previous RFCs https://github.com/rust-lang/rfcs/pull/81 and https://github.com/rust-lang/rfcs/pull/1888, as well as an earlier issue https://github.com/rust-lang/rfcs/issues/271, and the currently active issue https://github.com/rust-lang/rfcs/issues/2691.

Rust tracking issue: https://github.com/rust-lang/rust/issues/112788

phi-go avatar Apr 06 '23 12:04 phi-go

thanks a ton @phi-go for all the work you put into writing the RFC so far! Really appreciated! 🎉

Robbepop avatar Apr 06 '23 15:04 Robbepop

While I personally know the abbreviation TCO, I think that it would be helpful to expand the acronym in the issue title for folks who might not know it at first glance.

clarfonthey avatar Apr 06 '23 18:04 clarfonthey

An alternative is to mark a function (like in OCaml), not a return keyword:

recursive fn x() {
    let a = Box::new(());
    let b = Box::new(());
    y(a);
}

VitWW avatar Apr 06 '23 21:04 VitWW

An alternative is to mark a function (like in OCaml), not a return keyword:

recursive fn x() {
    let a = Box::new(());
    let b = Box::new(());
    y(a);
}

The RFC already highlights an alternative design with markers on function declarations and states that tail calls are a property of the function call and not a property of a function declaration since there are use cases where the same function is used in a normal call and a tail call.

Robbepop avatar Apr 06 '23 21:04 Robbepop

Note: this may be suitable either as a comment on #2691 or here. I'm assuming interested parties are watching both anyway.

The restriction on caller and callee having the exact same signature sounds quite restrictive in practice. Comparing it with [Pre-RFC] Safe goto with value (which does a similar thing to the become keyword but for intra-function control flow instead of across functions), the reason that proposal doesn't have any requirements on labels having the same arguments is because all parameters in all labels are part of the same call frame, and when a local is used by different label than the current one it is still there in memory, just uninitialized.

If we translate it to the become approach, that basically means that each function involved in a mutual recursion would (internally) take the union of all the parameters of all of the functions, and any parameters not used by the current function would just be uninitialized. There are two limitations to this approach:

  • It changes the calling convention of the function (since you have to pad out the arguments with these uninitialized locals)
  • The calling convention depends on what other functions are called in the body

I don't think this should be handled by an actual calling convention label though. Calling these "rusttail" functions for discussion purposes, we have the following limitations:

  • You can't get a function pointer with "rusttail" convention
  • You can't use a "rusttail" function cross-crate

The user doesn't have to see either of these limitations directly though; the compiler can just generate a shim with the usual (or specified) calling convention which calls the "rusttail" function if required. Instead, they just observe the following restrictions, which are strictly weaker than the one in the RFC:

  • If you become a function in another crate, the arguments of caller and callee have to match exactly
  • If you only become a function in the same crate, there are no restrictions
  • (If f tail-calls a cross-crate function g and also an internal function h, then f,g,h must all have the same arguments)

digama0 avatar Apr 06 '23 22:04 digama0

digama0 Your assesment (https://github.com/rust-lang/rfcs/pull/3407#issuecomment-1499677722) is correct for the case that you tail call a known set of functions. However, for the common case of indirect tail calls this does not work since it is impossible/impractical to know upfront which functions are going to be called indirectly.

The RFC mentions another alternative design with function groups with which it is possible to model such a scenario by restricting the group of tail callable functions. However, this alternative design has the major flaw in that the property of tail calls is not a property of a function declaration but the property of a function call.

This is kind of like semantical differentiation between your "Safe goto with value" RFC and tail calls similar to polymorphism and enum dispatch.

Robbepop avatar Apr 07 '23 08:04 Robbepop

To add to the response by @Robbepop. There is also the possibility of padding out functions using something like macros as outlined here: https://github.com/phi-go/rfcs/blob/guaranteed-tco/text/0000-guaranteed-tco.md#helpers. Admittedly, this is not as ergonomic but would be more manageable to implement on the compiler side.

phi-go avatar Apr 07 '23 08:04 phi-go

To add to the response by @Robbepop. There is also the possibility of padding out functions using something like macros as outlined here: https://github.com/phi-go/rfcs/blob/guaranteed-tco/text/0000-guaranteed-tco.md#helpers. Admittedly, this is not as ergonomic but would be more manageable to implement on the compiler side.

I am sure that this could be done by the compiler for known tail calls as future work to extend Rust tail calls once the MVP is implemented. No macros needed. The problem with this though is that we always need to view the transitive chain of tail calls concerning this relaxation.

Robbepop avatar Apr 07 '23 08:04 Robbepop

@digama0

  • You can't get a function pointer with "rusttail" convention
  • You can't use a "rusttail" function cross-crate

This feels unnecessary restricting. For example the Parsing Protobuf at 2+GB/s: How I Learned To Love Tail Calls in C article mentioned in the RFC uses a table of function pointers in some examples:

MUSTTAIL return op_table[op](ARGS);

This seems like a very useful pattern and disallowing it with the convention specifically designed for tail-calls seems very unfortunate.

WaffleLapkin avatar Apr 07 '23 13:04 WaffleLapkin

part of how a fast interpreter is likely implemented is to specifically use function pointers:

#[derive(Copy, Clone)]
#[repr(transparent)]
struct ProgramPos<'a> {
    ptr: *const Instruction<'a>,
    _phantom: PhantomData<&'a [Instruction<'a>]>,
}

impl<'a> ProgramPos<'a> {
    fn get(self) -> Instruction<'a> {
        unsafe { *self.ptr }
    }
    // static analysis ensures pc stays valid
    unsafe fn next(mut self) -> Self {
        self.ptr = self.ptr.add(1);
        self
    }
}

struct StackRef<'a> {
    ptr: *mut u64,
    pub top_value: u64,
    _phantom: PhantomData<&'a mut [u64]>,
}

impl StackRef<'_> {
    // static analysis ensures stack doesn't overflow/underflow
    unsafe fn push(&mut self, v: u64) {
        self.ptr = self.ptr.add(1);
        *self.ptr = self.top_value;
        self.top_value = v;
    }
    unsafe fn pop(&mut self) -> u64 {
        let retval = self.top_value;
        self.top_value = *self.ptr;
        self.ptr = self.ptr.sub(1);
        retval
    }
}

type Func = unsafe fn(
    pc: ProgramPos<'_>,
    env: &mut Env,
    stack: StackRef<'_>,
) -> ExitReason;

union Arg<'a> {
    branch_target: ProgramPos<'a>,
    int: u64,
}

struct Instruction<'a> {
    func: Func,
    arg: Arg<'a>,
}

unsafe fn int_add(
    pc: ProgramPos<'_>,
    env: &mut Env,
    mut stack: StackRef<'_>,
) -> ExitReason {
    let v = stack.pop();
    stack.top_value = v.wrapping_add(stack.top_value);
    let pc = pc.next();
    let f = pc.get().func;
    become f(pc, env, stack);
}

unsafe fn push_const(
    pc: ProgramPos<'_>,
    env: &mut Env,
    mut stack: StackRef<'_>,
) -> ExitReason {
    stack.push(pc.get().arg.int);
    let pc = pc.next();
    let f = pc.get().func;
    become f(pc, env, stack);
}

unsafe fn branch_if(
    pc: ProgramPos<'_>,
    env: &mut Env,
    mut stack: StackRef<'_>,
) -> ExitReason {
    let pc = if stack.pop() != 0 {
        pc.get().arg.branch_target;
    } else {
        pc.next()
    };
    let f = pc.get().func;
    become f(pc, env, stack);
}

// more instructions...

programmerjake avatar Apr 07 '23 20:04 programmerjake

@digama0

  • You can't get a function pointer with "rusttail" convention
  • You can't use a "rusttail" function cross-crate

This feels unnecessary restricting. For example the Parsing Protobuf at 2+GB/s: How I Learned To Love Tail Calls in C article mentioned in the RFC uses a table of function pointers in some examples:

MUSTTAIL return op_table[op](ARGS);

This seems like a very useful pattern and disallowing it with the convention specifically designed for tail-calls seems very unfortunate.

It's not disallowed. The "rusttail" convention is not exposed to user code at all, the compiler transparently decides how to use functions with that calling convention but one constraint is that "rusttail" functions can't be used in function pointers. You can become both "rusttail" functions and regular functions, but regular functions have the same-arguments restriction while rusttail functions have no argument restrictions. So your op_table example would be fine, those would be regular functions in the table and the caller would have to have the same arguments as ARGS.

digama0 Your assesment (#3407 (comment)) is correct for the case that you tail call a known set of functions. However, for the common case of indirect tail calls this does not work since it is impossible/impractical to know upfront which functions are going to be called indirectly.

Indirect tail calls would be regular functions, and hence would need to obey the same argument restriction.

The RFC mentions another alternative design with function groups with which it is possible to model such a scenario by restricting the group of tail callable functions. However, this alternative design has the major flaw in that the property of tail calls is not a property of a function declaration but the property of a function call.

My proposal is basically a hybrid between this alternative and the one in the RFC, which takes the least restrictive aspects of both strategies. So the user only gets an error if they simultaneously violate the same-argument restriction and the same-crate, no pointers restriction - if they use pointers or cross-crate calls then this implies regular calling convention (an unknown set of callers), and so there is no wiggle room for the compiler to work around the the same-argument restriction.

This is kind of like semantical differentiation between your "Safe goto with value" RFC and tail calls similar to polymorphism and enum dispatch.

I agree. My main point is that the compiler is able to handle this particular situation in a completely transparent way - the only user-visible effect is a lifting of the restrictions in some situations. Syntactically the only thing you have to do is use become in appropriate places, just as in the RFC version.

@programmerjake If you are using function pointers like that then you will already need to satisfy the same-argument restriction, simply because all the function pointers in the table need to have the same type. The case I was thinking about where you don't satisfy the same argument restriction is for state machines where the jumps are to statically known functions with different arguments specific to the operation performed. My proposal is for the compiler to consider both options (using "rusttail" functions or regular functions) and accept the code if either one works, since each approach has different restrictions.

digama0 avatar Apr 07 '23 20:04 digama0

working example of fn ptr based interpreter: https://play.rust-lang.org/?version=stable&mode=release&edition=2021&gist=fad6c3f2c35131012cc811ccdaab5228

programmerjake avatar Apr 07 '23 21:04 programmerjake

I recommend that you call this "Proper Tail Calls" and not "Guaranteed TCO".

  • The term "proper tail call" (PTC) is a standard tech term for the feature.
  • The purpose of PTC is to generate a jump to another function, instead of pushing a new stack frame and calling it. This ensures that a chain of tail calls consumes constant stack space. The constant stack space guarantee is the purpose of this feature. The purpose is not "optimization".
  • PTC is not always an optimization. In some cases, the code for jumping to a function can be more expensive than creating a new stack frame, because setting up new arguments on the stack while clearing out old stack arguments might be more expensive than just pushing the new arguments. There might be extra copying involved.
  • Using the word "optimization" may cause confusion. Compiler optimizations are distinct from language features that provide semantic guarantees, which is what PTC is (a guaranteed bound on space consumption). The fact that PTC is sometimes slower may cause confusion. People may point out that your "guaranteed optimization" is not always an optimization.

doug-moen avatar Apr 09 '23 03:04 doug-moen

@doug-moen the feature name is also discussed in this thread. I'll quote your comment there so that everything is in one place.

phi-go avatar Apr 09 '23 06:04 phi-go

@digama0 thank you for clarifying (https://github.com/rust-lang/rfcs/pull/3407#issuecomment-1500620309). I think this would be great to have from a usability aspect. As the implementation does not sound trivial I'll add this suggestion to the open questions.

phi-go avatar Apr 11 '23 09:04 phi-go

other stuff that needs to be changed to TCE: PR title, RFC file's name, feature name

programmerjake avatar Apr 12 '23 06:04 programmerjake

@programmerjake you are right! I was waiting for some feedback on the feature name (https://github.com/rust-lang/rfcs/pull/3407#discussion_r1162478235). Though, I take it your vote is in favor of TCE as the feature name.

phi-go avatar Apr 12 '23 07:04 phi-go

Though, I take it your vote is in favor of TCE as the feature name.

tbh I don't particularly care as long as it's not named <something> optimization, since it's not an optimization -- the language guarantees that tons of tail calls won't cause a stack overflow if become is used.

programmerjake avatar Apr 12 '23 09:04 programmerjake

Though, I take it your vote is in favor of TCE as the feature name.

tbh I don't particularly care as long as it's not named optimization, since it's not an optimization -- the language guarantees that tons of tail calls won't cause a stack overflow if become is used.

Fair enough!

phi-go avatar Apr 12 '23 11:04 phi-go

It seems the feature name will come down to popularity, if anyone wants you can vote here: https://github.com/rust-lang/rfcs/pull/3407#discussion_r1166645590.

phi-go avatar Apr 14 '23 10:04 phi-go

I suggest that, since this feature is niche, that the #[become] (or similar) annotation alternative be used instead, since it the extra characters are unlikely to be typed by a large number of users, and this then keeps become free for future use.

Also, the annotation allow for a more descriptive name rather than become, e.g. #[tail_call].

Fishrock123 avatar Apr 15 '23 03:04 Fishrock123

@Fishrock123 From what I remember, become is reserved specifically because it was always intended to be for this use.

Most of the early Rust documents I ran across were before I started diligently bookmarking things, but Rust's design is a history of starting out wanting more or less every feature you'd find in something like Haskell and then figuring out how much can be attained in a coherent and sensible way without things like currying or garbage collection.

Unless I'm mis-remembering, one of the previous attempts stalled out because LLVM's support for it wasn't good enough yet.

ssokolow avatar Apr 15 '23 03:04 ssokolow

@Fishrock123 From what I remember, become is reserved specifically because it was always intended to be for this use.

Yes, become is reserved exactly for this use case, see here https://rust-lang.github.io/rfcs/0601-replace-be-with-become.html.

Unless I'm mis-remembering, one of the previous attempts stalled out because LLVM's support for it wasn't good enough yet.

The previous attempts were deemed too much effort at the time, LLVM support was one reason. You can see this of the previous RFC section (2017) for a bit more information. Support improved a lot since then, which is why I hope this attempt will go through.

phi-go avatar Apr 15 '23 09:04 phi-go

According to my count all raised concerns should be addressed now.

phi-go avatar Apr 20 '23 12:04 phi-go

About concerns of "Alternating become and return calls" and "Operators are not supported". I propose, that become keyword works without warnings only with functions, which are marked with #[recursive]/#[must_become] or #[may_become] attribute. And #[must_become] functions must use at least one become. #[may_become] functions has no becomes.

#[must_become] functions are self-recursive or cross-recursive functions. #[may_become] for chain-calls functions, as Nil for List : b1(b2(b3(..bN( mb_return(data) )))) Unsafe code must ignore these attributes.

this additional attribute solves automatically "Operators are not supported" and "forgotten" become.

// ****** Self-recursive ******

#[must_become]
pub fn fibonacci(n: u64) -> u64 {
    if n < 2 {
        return n
    }
    become fibonacci(n - 2) + fibonacci(n - 1)
    // ^ ~~ warning: '+' is not '#[must/may_become]'
}

// ****** Cross-recursive ******

#[must_become]
fn foo(n: i32) {
    become bar(n);
    // ^ ~~ warning: 'bar' is not '#[must/may_become]'
}

// oops, we forgot '#[must_become]' 
fn bar(n: i32) {
    // oops, we forgot 'become'
    return foo(n);
   // It is Ok to not-recursive function 'return' recursive one
}

#[must_become]
fn baz(n: i32) {
    // oops, we forgot 'become'
    return foo(n);
   // ^ ~~ warning: 'baz' has no any 'become'
}

// oops, we forgot '#[must_become]' 
fn baw(n: i32) {
    become foo(n);
   // ^ ~~ warning: 'baw' is not '#[must/may_become]'
}

// ****** Chain-calls ******

#[must_become]
fn bas1(n: i32) {
    become bas2(n);
}

#[must_become]
fn bas2(n: i32) {
    become bas3(n);
}

#[must_become]
fn bas3(n: i32) {
    become basMB(n);
}

#[may_become]
fn basMB(n: i32) {
    n
}

EDIT: For those, who votes down, could you either

  1. write at least 1 counter-example: when #[recursive] is needed, but become not
  2. write at least 1 counter-example: when become is needed, but #[recursive] not
  3. write how without #[recursive] attribute understand if using become is wrong (except operators)

UPDATE: I forgot about chain-calls functions. Then function should be marked by either #[must_become] or #[may_become]

VitWW avatar Apr 20 '23 13:04 VitWW

I can think of scenarios in numeric computing where non-recursive functions would benefit from tail call optimization to avoid copying matrices between different function stacks. Limiting become to recursive functions only seems overly restrictive.

benkay86 avatar Apr 20 '23 13:04 benkay86

This comment shows an example of using become without recursion in an interpreter that uses tail calls to prevent overflowing the stack.

wackbyte avatar Apr 20 '23 13:04 wackbyte

About concerns of "Alternating become and return calls" and "Operators are not supported". I propose, that become keyword works only with functions, which are marked with #[recursive] attribute. And #[recursive] functions must use at least one become. #[recursive] functions are not necessary self-recursive. They could be cross-recursive.

this additional attribute solves automatically "Operators are not supported" and "forgotten" become.

#[recursive]
fn foo(n: i32) {
    become bar(n);
    // ^ ~~ error: 'bar' is not '#recursive'
}

// oops, we forgot '#[recursive]' 
fn bar(n: i32) {
    // oops, we forgot 'become'
    return foo(n);
   // It is Ok to not-recursive function 'return' recursive one
}

#[recursive]
fn baz(n: i32) {
    // oops, we forgot 'become'
    return foo(n);
   // ^ ~~ error: 'baz' has no any 'become'
}

// oops, we forgot '#[recursive]' 
fn baw(n: i32) {
    become foo(n);
   // ^ ~~ error: 'baw' is not '#recursive'
}

#[recursive]
pub fn fibonacci(n: u64) -> u64 {
    if n < 2 {
        return n
    }
    become fibonacci(n - 2) + fibonacci(n - 1)
    // ^ ~~ error: '+' is not '#recursive'
}

EDIT: For those, who votes down, could you either

1. write at least 1 counter-example: when `#[recursive]` is needed, but `become ` not

2. write at least 1 counter-example: when `become` is needed, but `#[recursive] ` not

3. write how without `#[recursive] ` attribute understand if using `become` is wrong (except operators)

I downvoted because I don't see how the #[recursive] attribute solves any problems that cannot be solved with what has already been proposed. Maybe it would add some layer of transparency at the huge cost of having to attribute all functions while tail calls are a property of the call and not of functions.

Furthermore this would seriously impact tail calling closures that would otherwise work out of the box. So we'd need special rules for them to work without attributes etc.

And as @benkay86 already mentioned we already show-cased many examples (e.g. interpreters) that are not inherently recursive but still rely on tail calls for their codegen.

Robbepop avatar Apr 20 '23 13:04 Robbepop

Maybe it would add some layer of transparency at the huge cost of having to attribute all functions while tail calls are a property of the call and not of functions.

This attribute protect safe code from footguns and bugs.

Furthermore this would seriously impact tail calling closures that would otherwise work out of the box. So we'd need special rules for them to work without attributes etc.

Agree, closures are exceptions. Anyway unsafe code must ignore this attribute.

VitWW avatar Apr 20 '23 14:04 VitWW

Nothing about this feature is unsafe or even meaningfully interacts with unsafe parts of the language. What footguns are you referring to? The worst that can happen if you mess something up here is a compile error.

digama0 avatar Apr 20 '23 14:04 digama0