tailcall icon indicating copy to clipboard operation
tailcall copied to clipboard

Inline module for zero-cost tail call among multiple functions (mutral recursion)

Open SichangHe opened this issue 1 year ago • 5 comments

It would be similar to https://github.com/PyO3/pyo3/pull/3815 and look like this:

#[tailcall::tailcall_module]
pub mod oddity {
    #[tailcall::multi_tailcall]
    pub fn is_even(x: u64) -> bool {
        if x > 0 {
            is_odd(x as u32 - 1)
        } else {
            true
        }
    }

    #[tailcall::multi_tailcall]
    pub fn is_odd(y: u32) -> bool {
        if y > 0 {
            is_even(y as u64 - 1)
        } else {
            false
        }
    }
}

And expand to something like:

pub mod oddity {
    enum __TailCallState {
        is_even(u64),
        is_odd(u32),
    }
    #[inline(never)]
    fn __multi_tailcall(mut tailcall_state: __TailCallState) -> bool {
        'tailcall_loop: loop {
            return match tailcall_state {
                __TailCallState::is_even(x) => {
                    if x > 0 {
                        tailcall_state = __TailCallState::is_odd(x as u32 - 1);
                        continue 'tailcall_loop;
                    } else {
                        true
                    }
                }
                __TailCallState::is_odd(y) => {
                    if y > 0 {
                        tailcall_state = __TailCallState::is_even(y as u64 - 1);
                        continue 'tailcall_loop;
                    } else {
                        false
                    }
                }
            }
        }
    }

    #[inline(never)]
    pub fn is_even(x: u64) -> bool {
        let tailcall_state = __TailCallState::is_even(x);
        __multi_tailcall(tailcall_state)
    }

    #[inline(never)]
    pub fn is_odd(y: u32) -> bool {
        let tailcall_state = __TailCallState::is_odd(y);
        __multi_tailcall(tailcall_state)
    }
}
  1. The whole inline module's AST (in this case the oddity module) would be fed into the tailcall_module proc macro.
  2. The proc macro finds all the functions ${f_i}=$[is_even, is_odd] annotated with tailcall::multi_tailcall.
  3. Validate all the $f_i$ have the same return type. Users should make another module if they want functions with a different return type.
  4. Make an enum __TailCallState representing the arguments for each function.
  5. Create an internal __multi_tailcall function that takes a __TailCallState and runs the loop, inlining each tail-call function's content with modification just like how it is done currently for single recursion.
  6. Rewrite each $f_i$ to create a __TailCallState and simply call the internal function __multi_tailcall with that state.

What do you think about this idea?

SichangHe avatar May 25 '24 04:05 SichangHe

Yes, I do think that would work. I suggested something similar here: https://github.com/alecdotninja/tailcall/issues/3#issuecomment-987297431

I need to push up my branch, but I am am privately working on a different solution based on stack allocated thunks which would not require a module-level annotation. The main problem that approach has is that since Rust does not yet support DSTs on the stack, a concervatively large size (and alignment) has to be chosen for the thunk data.

It would be interesting to benchmark both approaches. In the enum case, you are still going to allocate as much stack space as the largest variant, so I am not sure which will be better.

alecdotninja avatar May 28 '24 04:05 alecdotninja

Hey @alecdotninja, thanks for getting back.

I hope this idea above would be relatively easy to implement. Sorry for not seeing your prior similar suggestion.

Please don't worry about the stack space allocation. It is only done once for the whole recursion, and allocation/deallocation take only one instruction regardless of the size, if I understand correctly: see compiler output line 147. This allocation is not that big unless you are crazy about your function arguments, and probably lives in the L1 cache which is plenty fast.

SichangHe avatar May 28 '24 04:05 SichangHe

There is no need to be sorry, @SichangHe ! I appreciate your thought in this space.

I've created a new draft PR with my in progress work on the new Thunk-based runtime system.

If you have any ideas or thoughts on the current implementation, I would welcome any feedback there.

alecdotninja avatar May 28 '24 17:05 alecdotninja

Intuitively, #21 adds the overhead of copying the closure and it vtable pointer (16 bytes) doing vtable lookup (one memory load) every call.

SichangHe avatar May 30 '24 07:05 SichangHe

https://github.com/alecdotninja/tailcall/pull/21#issuecomment-2143696759:

It's not easy to take a functions arguments and represent them as enum variants in Rust, especially when generics or a receiver are involved (and remember all references are generic in at least their lifetime). 😵

There is an easy way to work around this, but it duplicates code: https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=c0f7b792bb666cff0e327fce10215d72 Just declare the enum as generic over all its variants, then inline all the function calls to create the enum using the argument tuples and let Rustc infer the types.

SichangHe avatar Jun 02 '24 09:06 SichangHe