[RFC] Static Function Argument Unpacking

Open miikkas opened this issue 1 year ago • 31 comments

Summary

This RFC adds call-site unpacking of tuples, tuple structs, and fixed-size arrays, using ...expr within the function call's parentheses as a shorthand for passing arguments. The full contents of these collections with known sizes are unpacked directly as the next arguments of a function call, desugaring into the corresponding element accesses during compilation.
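
As a rough illustration of the desugaring described in the summary, here is what the equivalent element accesses look like in today's Rust. The `...expr` form appears only in comments because it is proposed syntax; `mix`, `Rgb`, and the three helper functions are hypothetical names made up for this sketch.

```rust
// Hypothetical three-argument function used only for illustration.
fn mix(r: u8, g: u8, b: u8) -> u32 {
    (r as u32) << 16 | (g as u32) << 8 | (b as u32)
}

// Hypothetical tuple struct.
struct Rgb(u8, u8, u8);

fn from_tuple(t: (u8, u8, u8)) -> u32 {
    // Proposed: mix(...t)
    mix(t.0, t.1, t.2)
}

fn from_tuple_struct(c: Rgb) -> u32 {
    // Proposed: mix(...c)
    mix(c.0, c.1, c.2)
}

fn from_array(a: [u8; 3]) -> u32 {
    // Proposed: mix(...a)
    mix(a[0], a[1], a[2])
}

fn main() {
    println!("{:#08x}", from_tuple((1, 2, 3)));
    println!("{:#08x}", from_tuple_struct(Rgb(1, 2, 3)));
    println!("{:#08x}", from_array([1, 2, 3]));
}
```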


Rendered

miikkas avatar Oct 30 '24 19:10 miikkas

This isn't exactly variadics but it is related. Cc @Jules-Bertholet who I believe has done some design work in that area.

(Some recent discussion in that area https://rust-lang.zulipchat.com/#narrow/channel/213817-t-lang/topic/Variadic.20generics.20experiment)

tgross35 avatar Oct 30 '24 19:10 tgross35

A place where I found myself wanting something similar is when passing function pointers instead of closures:

 // Compiles
 [10].into_iter().map(usize::count_ones);
 // Does not
 std::iter::zip([10], [11]).map(usize::min);

The second iterator will error by stating that this map expects a function that takes a 2-tuple as an argument, not 2 distinct arguments. This is solved by destructuring the tuple manually: .map(|(x, y)| usize::min(x, y)), which seems unnecessary. The proposal currently does not address this, but conceptually, it feels like there could be a solution that addresses this papercut as well.
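
For reference, a minimal sketch of the manual workaround in stable Rust, plus one way to factor it out. `splat2` is a hypothetical helper invented for this example, not something the RFC or the standard library provides.

```rust
// Hypothetical helper: adapts a two-argument function into one taking a 2-tuple.
fn splat2<A, B, R>(mut f: impl FnMut(A, B) -> R) -> impl FnMut((A, B)) -> R {
    move |(a, b): (A, B)| f(a, b)
}

fn mins_with_closure() -> Vec<usize> {
    // Destructure the tuple by hand in a closure.
    std::iter::zip([10usize, 3], [11usize, 2])
        .map(|(x, y)| usize::min(x, y))
        .collect()
}

fn mins_with_helper() -> Vec<usize> {
    // Same result, with the destructuring hidden inside `splat2`.
    std::iter::zip([10usize, 3], [11usize, 2])
        .map(splat2(usize::min))
        .collect()
}

fn main() {
    println!("{:?}", mins_with_closure());
    println!("{:?}", mins_with_helper());
}
```

The helper only covers pairs; each other arity would need its own adapter, which is part of why this remains a papercut.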

Victoronz avatar Oct 31 '24 03:10 Victoronz

std::iter::zip([10], [11]).map(usize::min)

when we get variadic generics, you could probably just have:

pub trait Iterator {
    // assumes variadic generics are built on tuples
    pub fn splatted_map<F: FnMut(...Self::Item) -> R, R>(self, f: F) -> SplattedMap<Self, F>
    where
        Self::Item: Tuple,
    {
        todo!()
    }
    // ... all existing trait methods
}

programmerjake avatar Oct 31 '24 04:10 programmerjake

A place where I found myself wanting something similar is when passing function pointers instead of closures:

 // Compiles
 [10].into_iter().map(usize::count_ones);
 // Does not
 std::iter::zip([10], [11]).map(usize::min);

The second iterator will error by stating that this map expects a function that takes a 2-tuple as an argument, not 2 distinct arguments. This is solved by destructuring the tuple manually: .map(|(x, y)| usize::min(x, y)), which seems unnecessary. The proposal currently does not address this, but conceptually, it feels like there could be a solution that addresses this papercut as well.

This papercut is definitely related, and I've collected some links related to the problem under the "Prior Art" subchapter "Using Tuples in Place of Argument Lists" (possibly the Zulip thread there was started by you?). Unfortunately, I couldn't come up with a nice design leveraging the syntax proposed here to address this.

miikkas avatar Oct 31 '24 15:10 miikkas

std::iter::zip([10], [11]).map(usize::min)

when we get variadic generics, you could probably just have:

pub trait Iterator {
    // assumes variadic generics are built on tuples
    pub fn splatted_map<F: FnMut(...Self::Item) -> R, R>(self, f: F) -> SplattedMap<Self, F>
    where
        Self::Item: Tuple,
    {
        todo!()
    }
    // ... all existing trait methods
}

Are you suggesting this as a change to map, or as a separate method? If the former, would this be a breaking change?

Victoronz avatar Oct 31 '24 17:10 Victoronz

A place where I found myself wanting something similar is when passing function pointers instead of closures:

 // Compiles
 [10].into_iter().map(usize::count_ones);
 // Does not
 std::iter::zip([10], [11]).map(usize::min);

The second iterator will error by stating that this map expects a function that takes a 2-tuple as an argument, not 2 distinct arguments. This is solved by destructuring the tuple manually: .map(|(x, y)| usize::min(x, y)), which seems unnecessary. The proposal currently does not address this, but conceptually, it feels like there could be a solution that addresses this papercut as well.

This papercut is definitely related, and I've collected some links related to the problem under the "Prior Art" subchapter "Using Tuples in Place of Argument Lists" (possibly the Zulip thread there was started by you?). Unfortunately, I couldn't come up with a nice design leveraging the syntax proposed here to address this.

yeah, that zulip thread was me! I didn't pursue it any further back then, but it seemed to me that adjusting the coercion of function items into closures to support this could be an option, but that would be distinct enough to be separate from this proposal I think

Victoronz avatar Oct 31 '24 18:10 Victoronz

Are you suggesting this as a change to map, or as a separate method? If the former, would this be a breaking change?

splatted_map would have to be a separate method because it would be a breaking change and actually make map much less usable, since you can currently use map with any type, but if you changed it to splat then it could only be used with splat-able types (e.g. (1..10).map(|v| v * v) would become a type error)
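
A separate method along these lines can be approximated today for pairs only. The sketch below is an assumption-laden stand-in, not the variadic design quoted above: `SplattedMap` and `SplattedMapExt` are hypothetical names, and the adapter is hard-coded to 2-tuples because stable Rust has no variadic generics.

```rust
// Hypothetical adapter that calls `f(a, b)` for every `(a, b)` item.
struct SplattedMap<I, F> {
    iter: I,
    f: F,
}

impl<I, F, A, B, R> Iterator for SplattedMap<I, F>
where
    I: Iterator<Item = (A, B)>,
    F: FnMut(A, B) -> R,
{
    type Item = R;

    fn next(&mut self) -> Option<R> {
        let (a, b) = self.iter.next()?;
        Some((self.f)(a, b))
    }
}

// Hypothetical extension trait; `splatted_map` is only callable on iterators of pairs.
trait SplattedMapExt: Iterator + Sized {
    fn splatted_map<A, B, R, F>(self, f: F) -> SplattedMap<Self, F>
    where
        Self: Iterator<Item = (A, B)>,
        F: FnMut(A, B) -> R,
    {
        SplattedMap { iter: self, f }
    }
}

impl<I: Iterator> SplattedMapExt for I {}

fn pair_mins() -> Vec<usize> {
    std::iter::zip([10usize, 3], [11usize, 2])
        .splatted_map(usize::min)
        .collect()
}

fn squares() -> Vec<i32> {
    // Ordinary `map` keeps working for non-tuple items;
    // `(1..4).splatted_map(...)` would be rejected by the type checker.
    (1..4).map(|v| v * v).collect()
}

fn main() {
    println!("{:?} {:?}", pair_mins(), squares());
}
```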

programmerjake avatar Oct 31 '24 18:10 programmerjake

Are you suggesting this as a change to map, or as a separate method? If the former, would this be a breaking change?

splatted_map would have to be a separate method because it would be a breaking change and actually make map much less usable, since you can currently use map with any type, but if you changed it to splat then it could only be used with splat-able types (e.g. (1..10).map(|v| v * v) would become a type error)

That would sadly not work as a fix then, since this papercut is not map specific, rather all methods that take a function parameter and can pass a tuple to that function are affected. This means not only a wide swath of iterator adapters, but also types like Option, Result, or functionality like what the tap crate offers.

Victoronz avatar Oct 31 '24 18:10 Victoronz

That would sadly not work as a fix then

I don't see why we can't add splatted variants for those functions most likely to have tuples, e.g. Iterator::map...we can just say that for less common callbacks (e.g. Cell::update) you have to manually splat using a lambda function

programmerjake avatar Oct 31 '24 18:10 programmerjake

That would sadly not work as a fix then

I don't see why we can't add splatted variants for those functions most likely to have tuples, e.g. Iterator::map...we can just say that for less common callbacks (e.g. Cell::update) you have to manually splat using a lambda function

I'd call this a workaround, not a fix, because it doesn't really address the issue itself. There are about 28 (stable) methods on Iterator alone that take function parameters, which I think is too many to add a new flavor for. The problematic case is when an iterator iterates over tuples, or Option/Result holds a tuple. In this case, all such methods work with tuples; it is not that some specific methods are more likely to have them.

All in all, this is a papercut, and on its own I don't think it is enough justification for adding new dedicated methods, especially when it could feasibly be addressed by the language itself/some other change later on.

Victoronz avatar Oct 31 '24 19:10 Victoronz

While it is not spelled out, I suppose unpacking ...expr into a surrounding call whose argument count is not uniquely determined is still allowed when the type of expr is known, right?

Example 1 with variadic function call:

use std::ffi::c_char;

unsafe extern "C" {
    unsafe fn printf(fmt: *const c_char, ...);
}

fn main() {
    let expr = (1, 2, 3);
    unsafe {
        printf(c"%d %d %d\n".as_ptr(), ...expr);
    }
}

Example 2 with overloaded unboxed closure:

#![feature(fn_traits, unboxed_closures)]

#[derive(Copy, Clone)]
struct F;

impl FnOnce<(u8, u8)> for F {
    type Output = ();
    extern "rust-call" fn call_once(self, args: (u8, u8)) {
        println!("2 args: {args:?}");
    }
}

impl FnOnce<(u8, u8, u8)> for F {
    type Output = ();
    extern "rust-call" fn call_once(self, args: (u8, u8, u8)) {
        println!("3 args: {args:?}");
    }
}

fn main() {
    let f = F;
    f(1, 2);
    f(3, 4, 5);
    
    let expr = (6, 7);
    f(...expr);
}

I ask because a proposal above suggests foo(...buf.try_into().unwrap());, which requires inferring that foo() can only take 5 parameters. So I'd suppose the following parse but fail to type-check:

// error[E0284]: type annotations needed
printf(c"%d %d %d\n".as_ptr(), ...buf.try_into().unwrap());

// error[E0284]: type annotations needed
f(...buf.try_into().unwrap());
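
To make the inference problem concrete in stable Rust: `try_into` can target arrays of any length, so something has to pin the length down. The sketch below does it with an explicit annotation and then passes the elements by hand. `sum3` and `call_with_slice` are hypothetical names, and the `...args` form appears only in a comment.

```rust
// In the prelude since edition 2021; imported explicitly for older editions.
use std::convert::TryInto;

// Hypothetical fixed-arity callee.
fn sum3(a: u8, b: u8, c: u8) -> u32 {
    a as u32 + b as u32 + c as u32
}

fn call_with_slice(buf: &[u8]) -> u32 {
    // The annotation selects `[u8; 3]`; without it the target length is ambiguous.
    let args: [u8; 3] = buf.try_into().unwrap();
    // Proposed: sum3(...args)
    sum3(args[0], args[1], args[2])
}

fn main() {
    println!("{}", call_with_slice(&[1, 2, 3]));
}
```

With a variadic or overloaded callee there is no single arity to infer from, so an annotation like the one on `args` would presumably stay mandatory.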

kennytm avatar Nov 01 '24 15:11 kennytm

While it is not spelled out, I suppose unpacking ...expr into a surrounding call whose argument count is not uniquely determined is still allowed when the type of expr is known, right?

Yes, I think so. I hadn't thought of this, but it makes sense. Thanks for the examples – I'll need to do some further thinking w.r.t. calling variadic functions and update the text accordingly!

miikkas avatar Nov 02 '24 13:11 miikkas

It should be noted that, if you can unpack into a closure, you can also unpack into a tuple struct, since tuple struct constructors implement the Fn* traits:

struct Foo(i32, i32);
Foo(...tup2)

It would also make sense to allow unpacking into tuples:

let tup2 = (1, 2);
let tup3 = (...tup2, 3);
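
In today's Rust, the two snippets above would have to be spelled out as element accesses. A small sketch, with the proposed forms shown only in comments (`build_foo`, `extend`, and `ctor_is_a_function` are hypothetical names):

```rust
#[derive(Debug, PartialEq)]
struct Foo(i32, i32);

fn build_foo(tup2: (i32, i32)) -> Foo {
    // Proposed: Foo(...tup2)
    Foo(tup2.0, tup2.1)
}

fn extend(tup2: (i32, i32)) -> (i32, i32, i32) {
    // Proposed: (...tup2, 3)
    (tup2.0, tup2.1, 3)
}

fn ctor_is_a_function() -> Foo {
    // A tuple struct's constructor can be used as an ordinary function value,
    // which is why call-site unpacking would naturally extend to it.
    let ctor: fn(i32, i32) -> Foo = Foo;
    ctor(1, 2)
}

fn main() {
    println!("{:?} {:?} {:?}", build_foo((1, 2)), extend((1, 2)), ctor_is_a_function());
}
```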

Also note that array/slice patterns already have a counterpart to unpacking:

match some_slice {
    [1, 2, rest @ ..] => {}
    // Catch-all arm so the match is exhaustive
    _ => {}
}

So we could use the @ .. syntax for unpacking in expressions as well. EDIT: This means Expression @ .., of course.

Aloso avatar Nov 04 '24 19:11 Aloso

The rest pattern is just ..: https://doc.rust-lang.org/reference/patterns.html#rest-patterns

identifier @ .. is the rest pattern inside an identifier pattern.

@ .. on its own makes no sense.

teohhanhui avatar Nov 05 '24 04:11 teohhanhui

The rest pattern is just ..: https://doc.rust-lang.org/reference/patterns.html#rest-patterns

identifier @ .. is the rest pattern inside an identifier pattern.

@ .. on its own makes no sense.

You're being nitpicky, but I updated my comment. The idea is to use Expression @ .., to make the unpacking expression symmetric with the equivalent pattern.

.. without the binding just discards the values, like an arbitrary number of _ patterns.
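
The difference between the two pattern forms can be seen in stable Rust today. A short sketch with hypothetical function names:

```rust
fn first_only(s: &[i32]) -> Option<i32> {
    match s {
        // Bare `..` matches the remaining elements and discards them.
        [first, ..] => Some(*first),
        [] => None,
    }
}

fn split_first_two(s: &[i32]) -> Option<(i32, i32, &[i32])> {
    match s {
        // `rest @ ..` binds the remaining elements as a subslice.
        [a, b, rest @ ..] => Some((*a, *b, rest)),
        _ => None,
    }
}

fn main() {
    println!("{:?}", first_only(&[1, 2, 3]));
    println!("{:?}", split_first_two(&[1, 2, 3, 4]));
}
```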

Aloso avatar Nov 05 '24 17:11 Aloso

I was not being nitpicky. Perhaps you could clarify what you mean by this:

So we could use the identifier @ .. syntax for unpacking in expressions as well.

teohhanhui avatar Nov 06 '24 08:11 teohhanhui

I understand this RFC focuses on function calls; however, I do appreciate the shout-out at the end to Functional Record Update. Specifically, a natural extension of Functional Record Update would be to allow unpacking a different struct, with public fields which happen to match, something like:

struct Large { a: i8, b: u16, c: i32, d: u64 }

struct Small { b: u16, d: u64 }

fn foo(small: Small) -> Large {
    Large { a: 8, c: 32, ...small }
}

In which case indeed the same "unpack" syntax should be used (deprecating .. for the usecase in edition 2027, for example).
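
For comparison, a sketch of what this looks like in stable Rust: the cross-struct case has to copy the matching fields by hand, while the existing `..base` form only accepts a value of the same struct type. `same_type_update` is a hypothetical name added for the contrast.

```rust
#[derive(Debug, PartialEq)]
struct Large { a: i8, b: u16, c: i32, d: u64 }

struct Small { b: u16, d: u64 }

fn foo(small: Small) -> Large {
    // Suggested extension: Large { a: 8, c: 32, ...small }
    Large { a: 8, c: 32, b: small.b, d: small.d }
}

fn same_type_update(base: Large) -> Large {
    // Existing functional record update: `base` must itself be a `Large`.
    Large { a: 1, ..base }
}

fn main() {
    let large = foo(Small { b: 16, d: 64 });
    println!("{:?}", same_type_update(large));
}
```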

matthieu-m avatar Nov 07 '24 18:11 matthieu-m

It seems odd to allow use for constructing tuple like structs, but not for anonymous tuples.

Is there any reason not to support unpacking in tuples?

tmccombs avatar Nov 09 '24 17:11 tmccombs

It seems odd to allow use for constructing tuple like structs, but not for anonymous tuples.

yes, I think we should avoid adding more ways that tuples can't be used like tuple structs, since we already have enough pain from that for macro authors (e.g. you can't write type T2<A, B> = (A, B); T2 { 0: 123, 1: "abc" } even though that works fine with the equivalent tuple struct)
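
A small sketch of the asymmetry mentioned here, using a hypothetical `Pair` tuple struct. The braced form with numeric field names compiles for the tuple struct; the same expression through a type alias to a plain tuple is rejected, so it appears only as a comment.

```rust
#[derive(Debug, PartialEq)]
struct Pair<A, B>(A, B);

fn braced_tuple_struct() -> Pair<i32, &'static str> {
    // Tuple structs accept braced construction with numeric field names.
    Pair { 0: 123, 1: "abc" }
}

// The equivalent for an anonymous tuple does not compile:
// type T2<A, B> = (A, B);
// let t = T2 { 0: 123, 1: "abc" };

fn main() {
    println!("{:?}", braced_tuple_struct());
}
```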

programmerjake avatar Nov 10 '24 10:11 programmerjake

So we could use the @ .. syntax for unpacking in expressions as well. EDIT: This means Expression @ .., of course.

Thanks for the suggestion. Having more alternatives from Rust-specific prior art helps, and so I've listed this in Table 1 in commit 857517a54447ec44e781e17560c7f74e015eabe0. It's actually a bit similar to Scala's syntax, which I added in commit e06e70a2212d6ec9e410df827e35ea21349530ac.

miikkas avatar Nov 10 '24 11:11 miikkas

It seems odd to allow use for constructing tuple like structs, but not for anonymous tuples.

Is there any reason not to support unpacking in tuples?

It seems odd to allow use for constructing tuple like structs, but not for anonymous tuples.

yes, I think we should avoid adding more ways that tuples can't be used like tuple structs, since we already have enough pain from that for macro authors (e.g. you can't write type T2<A, B> = (A, B); T2 { 0: 123, 1: "abc" } even though that works fine with the equivalent tuple struct)

Personally, I see zero technical impediments and am totally in favor of having these as well.

There's some discussion about human reasons under this comment thread: https://github.com/rust-lang/rfcs/pull/3723#pullrequestreview-2406711936

@tmccombs, @programmerjake, what's your view on these?

miikkas avatar Nov 10 '24 11:11 miikkas

Personally, I see zero technical impediments and am totally in favor of having these as well.

There's some discussion about human reasons under this comment thread: #3723 (review)

@tmccombs, @programmerjake, what's your view on these?

I think that it should work in any reasonable future implementation of variadics to be able to create a tuple or array with syntax like (...v, a, ...r) -- as long as the lengths of v and r are known constants and not const generics. Const generic lengths run into problems of basically being an alternate way to write const generic expressions (e.g. f([...a, ...a]) allows you to effectively call f with type [T; N * 2] where N is a const generic), which the Rust compiler doesn't have a complete implementation of due to a bunch of implementation challenges (e.g. we want to avoid needing a full SMT theorem prover in the compiler's type system).
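
To illustrate the distinction, here is a sketch of the known-length case written out in stable Rust, with the proposed forms in comments. The function names are hypothetical. The generic variant is left as a comment because its return type would need a const expression that stable Rust cannot express.

```rust
fn doubled(a: [u8; 2]) -> [u8; 4] {
    // Proposed: [...a, ...a]
    [a[0], a[1], a[0], a[1]]
}

fn around(v: (u8, u8), a: u8, r: [u8; 1]) -> (u8, u8, u8, u8) {
    // Proposed: (...v, a, ...r)
    (v.0, v.1, a, r[0])
}

// A generic version would need a return type of `[T; N * 2]`,
// which is not expressible on stable Rust:
// fn doubled_generic<T: Copy, const N: usize>(a: [T; N]) -> [T; N * 2] { ... }

fn main() {
    println!("{:?} {:?}", doubled([1, 2]), around((1, 2), 3, [4]));
}
```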

programmerjake avatar Nov 10 '24 18:11 programmerjake

Here are a few comments on motivation, largely based on my personal experience and feeling instead of facts, but others might agree as well. Some data-driven evidence might be helpful here.

Thanks for the critical feedback! Anecdotally, I ran into the need for argument unpacking myself in a situation very similar to the one under "Guide-Level Explanation", where functions I was using were in different crates and I was merely passing the returned tuple from one as the conventional arguments in another. This is actually what prompted me to write the RFC. :)

I've added a bit of conjecture under the "Motivation" chapter in commit 31eb2293f6d8f60079d823ec8594fda517a30e05, but you're right in that data-driven evidence would be great.

Here are a couple of experiments I've had in mind that I could try to run if I have the time:

  1. Scan several large existing codebases in the languages in Table 2 – is argument unpacking actually used? How much?

  2. Implement a proof-of-concept of the following lint in a Rust compiler or clippy fork:

    • Lint: When directly unpacking arguments from an expression could be done instead of exhaustively using temporary variables that are not used elsewhere or accessing the elements/fields by hand.
      • Suggest refactor: Use unpacking instead.
    • Run it on several large existing Rust codebases to see if argument unpacking would actually be useful.

miikkas avatar Nov 11 '24 17:11 miikkas

This example does not convince me. My first thought is that the code is clearer if you define a struct to contain r, g, and b. The unpacking feature makes it more convenient to use the code as written, which is actually a net negative because it removes pressure to refactor based on a struct.

The rusty approach to APIs tends to avoid long sequences of arguments of the same type because it's super hard to read that code and understand what is happening. This encourages long sequences of arguments of the same type.

I agree that encouraging long sequences of arguments is one possible result of this feature, and this concern is actually listed as the first point under the Drawbacks chapter.

However, I'd argue that the Rusty way also involves relying on crates made by others, and refactoring those to exactly suit the function caller's needs is often not worth the effort. That said, I could try to come up with more convincing examples. One problem with those is that we don't yet have variadic functions which combine nicely with argument unpacking as shown upthread and also in the backlinked issue just recently.

miikkas avatar Jan 17 '25 18:01 miikkas

One problem with those is that we don't yet have variadic functions which combine nicely with argument unpacking as shown upthread and also in the backlinked issue just recently.

Variadic arguments and argument unpacking aren't exactly the same thing. In many languages that support varargs (e.g. Java and Go), argument unpacking only works on the vararg, which effectively does not "unpack" anything but just removes the syntactic sugar brought by varargs.

SOF3 avatar Feb 13 '25 10:02 SOF3

While I have no issue with the feature itself, I strongly believe the proposed syntax would be a mistake, due to its similarity to the range syntax. Rust had a ... operator before, and it was removed for being too similar. Here are some alternatives I think are more compelling:

  1. Magic macro. I think this is the most compelling as a temporary syntax, since things like ptr::addr_of and try have been done before.
  2. ~~Soft keyword. The existence of the &raw syntax shows that we can have soft keywords in value contexts. This could be unpack, similar to the lua syntax.~~ (I misremembered the &raw syntax, if we want a soft keyword it would have to be used in combination with a hard keyword)
  3. Look harder for an existing alternative syntax. While the syntax of several other languages is considered, there's a chance there's something else not considered that would work in rust.
  4. Invent something new for Rust. I think a postfix [*] would be a decent choice, if you think of unpacking as a special case of list indexing where you get all of the values.

lolbinarycat avatar Mar 15 '25 19:03 lolbinarycat

@lolbinarycat, thanks! I'll elaborate on the suggestions you've given in the RFC text. In my view, especially the unpack soft keyword idea seems nice, but I don't really hold a strong opinion on which syntax we should ultimately choose.

miikkas avatar Mar 16 '25 06:03 miikkas

if you have unpack as a soft keyword, what happens with [unpack * a] since that already means something?

programmerjake avatar Mar 16 '25 07:03 programmerjake

if you have unpack as a soft keyword, what happens with [unpack * a] since that already means something?

Or [unpack (a)], for that matter.

bluebear94 avatar Mar 16 '25 14:03 bluebear94

I think raw only works as a soft keyword because it's followed by const/mut which makes it unambiguous. if we did something like [do unpack *a] then that would be unambiguous since do is a hard keyword.
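
Both ambiguous forms raised above are valid Rust today, which is the crux of the problem with a bare `unpack` soft keyword. A minimal sketch (the function names are hypothetical):

```rust
fn multiply_case() -> [i32; 1] {
    let unpack = 2;
    let a = 3;
    // Parses today as the product of two variables.
    [unpack * a]
}

fn call_case() -> [i32; 1] {
    fn unpack(x: i32) -> i32 {
        x + 1
    }
    let a = 3;
    // Parses today as a call to a function named `unpack`.
    [unpack (a)]
}

fn main() {
    println!("{:?} {:?}", multiply_case(), call_case());
}
```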

programmerjake avatar Mar 16 '25 17:03 programmerjake