rust icon indicating copy to clipboard operation
rust copied to clipboard

Tracking Issue for RFC 3086: macro metavariable expressions

Open nikomatsakis opened this issue 4 years ago • 29 comments

This is a tracking issue for the RFC "3086" (rust-lang/rfcs#3086). The feature gate for the issue is #![feature(macro_metavar_expr)].

About tracking issues

Tracking issues are used to record the overall progress of implementation. They are also used as hubs connecting to other relevant issues, e.g., bugs or open design questions. A tracking issue is however not meant for large scale discussion, questions, or bug reports about a feature. Instead, open a dedicated issue for the specific matter and add the relevant feature gate label.

Steps

Unresolved questions and bugs

Implementation history

nikomatsakis avatar Mar 26 '21 19:03 nikomatsakis

There is a working prototype on the markbt/rust/metavariable_expressions branch. This needs feature gating, and there are a couple of TODOs to resolve, but it's otherwise in reasonable shape. I'm planning to work on it over the coming weeks.

markbt avatar Mar 27 '21 12:03 markbt

Update (2021-04-07)

I've not yet started work on productionizing the prototype on the markbt/rust/metavariable_expressions branch. I plan to start later this month, free time permitting.

markbt avatar Apr 07 '21 08:04 markbt

Update (2021-05-09)

I still haven't started on this yet as some stuff came up last month that prevented from having the time to work on it. It's still in my plan to work on it, and hopefully I'll have some time soon.

markbt avatar May 09 '21 09:05 markbt

Update (2021-06-29)

Still no progress, as I haven't had any spare time to work on this project. I'm still planning to work on it, and hopefully will get some time soon.

markbt avatar Jun 29 '21 08:06 markbt

@markbt If you don't expect to find the bandwidth in the near future, would you potentially be interested in seeking help in the form of another owner for this initiative? If you're still interested in driving this, that's fine, but if you'd like us to switch gears from pinging you to broadcasting that the project could use help, we'd be happy to do that.

joshtriplett avatar Jul 07 '21 17:07 joshtriplett

I'd be happy with any help if there's someone available. I still plan to work on it, but personal stuff is getting in the way at the moment. Sorry about that.

To recap: I have a working prototype on my branch at https://github.com/markbt/rust/tree/metavariable_expressions . The next steps are to rebase that onto the latest master, and then polish it up so that it's ready for inclusion. Then there's also the doc work to make sure the new feature is documented well. Help with any of this would be appreciated.

markbt avatar Jul 09 '21 08:07 markbt

@markbt Are you still working on this feature? If not, then I can pick up from where you stopped

c410-f3r avatar Jan 29 '22 11:01 c410-f3r

I haven't had a chance to work on it for a while. I'm still interested in it, so happy to help out if you're picking it up, it just had to take a back burner relative to some personal things.I have a branch with a prototype implementation on my github fork. It's likely very out of date, so will need rebasing up to something more recent. Or perhaps you can just use it for inspiration and start from scratch. Let me know if I can help at all, although I don't have a lot of free time at the moment.

markbt avatar Jan 29 '22 16:01 markbt

Thank you @markbt

I will take a look at the branch and then get in touch if any questions arise

c410-f3r avatar Jan 30 '22 00:01 c410-f3r

The implementation has been merged so please try testing the feature as much as possible to find any potential bugs

c410-f3r avatar Mar 15 '22 12:03 c410-f3r

Thanks @c410-f3r for all your hard work!

From my limited experience with this feature, I have some concerns about the design:

  • The depth arguments are confusing because some of the functions count up from the most nested depth and some count down from the outermost level.
  • Also, indexing does not universally start at 0, and an index of 1 means different things for different meta-variable expressions.
  • I find the semantics of count confusing and hard to keep track of. Which level is it counting? Does depth 1 mean count everything or count the 2nd nested loop? Does depth 1 mean count everything or count only the outermost loop? Maybe a name like total_count would be clearer? Maybe the depth option should not be optional? Maybe count should be removed, and people can just sum the output of length?

mark-i-m avatar Mar 15 '22 14:03 mark-i-m

Thank you @mark-i-m for reviewing https://github.com/rust-lang/rust/pull/93545 and for all your contributions to the compiler.

Hehehehe... You, me and @petrochenkov had trouble trying to understand this feature

  • The depth arguments are confusing because some of the functions count up from the most nested depth and some count down from the outermost level.

Indeed! It is much easier cognitively to keep everything with the same direction instead of having count going from outer-to-inner and index/length from inner-to-outer. Although I am not sure if semantic would be impacted.

  • Also, indexing does not universally start at 0, and an index of 1 means different things for different meta-variable expressions.
  • I find the semantics of count confusing and hard to keep track of. Which level is it counting? Does depth 1 mean count everything or count the 2nd nested loop? Does depth 1 mean count everything or count only the outermost loop? Maybe a name like total_count would be clearer? Maybe the depth option should not be optional? Maybe count should be removed, and people can just sum the output of length?

Yeap, someone that wrote ${count(foo)} will probably expect that ${count(foo, 1)} will return some value related to the amount of foos instead of the actual number of outer repetitions.

If repetitions are nested, then an optional depth parameter can be used to limit the number of nested repetitions that are counted. For example, a macro expansion like:

${count(x, 1)} ${count(x, 2)} ${count(x, 3)} $( a $( b $( $x )* )* )*

The three values this expands to are the number of outer-most repetitions (the number of times a would be generated), the sum of the number of middle repetitions (the number of times b would be generated), and the total number of repetitions of $x.

https://github.com/markbt/rfcs/blob/macro_metavar_expr/text/0000-macro-metavar-expr.md#count

And as you said, throwing nested loops into the equation will alter indexing making understanding even harder. Not to mention mixing other meta-variable expressions like length 🙁.

#![feature(macro_metavar_expr)]

fn main() {
    macro_rules! mac {
        ( $( [ $( $i:ident )* ] )* ) => {{
            // ***** No loop *****
            
            println!("{}", ${count(i)}); // 5
            println!("{}", ${count(i, 0)}); // 2
            
            // Same as ${count(i)}
            //println!("{}", ${count(i, 1)});
            
            // Fobirdden. Index out of bounds
            //println!("{}", ${count(i, 2)});
            
            // ***** Outer-most loop *****
            
            $(
                println!("{}", ${count(i)}); // 3 and 2
                
                // Same as ${count(i)}
                //println!("{}", ${count(i, 0)});
                
                // Fobirdden. Index out of bounds
                //println!("{}", ${count(i, 1)});
            )*

            // ***** Outer-most and inner-most loops *****
            
            $(
                $(
                    ${ignore(i)}

                    // Forbidden. Can't be placed inside the inner-most repetition
                    //println!("{}", ${count(i)});
                )*
            )*
        }};
    }
    
    mac!([a b c] [d e]);
}

Maybe total_count and a mandatory depth can be nice modifications but I am also not sure about the removal of count (Useful for nested stuff). Overall, I think that users will have a hard time even with a good set of documentation.

As for myself, I learned to accept the things as they are currently defined in RFC, hehehehe 😁

c410-f3r avatar Mar 15 '22 22:03 c410-f3r

Any thoughts @markbt?

c410-f3r avatar Mar 15 '22 22:03 c410-f3r

  • The depth arguments are confusing because some of the functions count up from the most nested depth and some count down from the outermost level.

+1. Being inconsistent here seems out of the question. I would vote for "distance from innermost", agreeing with the rationale in the RFC:

The meaning of the depth parameter in index and count originally counted inwards from the outer-most nesting. This was changed to count outwards from the inner-most nesting so that expressions can be copied to a different nesting depth without needing to change them.

Adding to this, having the depth parameters represent "distance from outermost" will lead to the following (unfortunate) user story:

I have a macro build_foo!. I want to refactor it to build_many_foos!, so I wrap the macro definition with $(...),*. Now all of the depth parameters are off by one so I increment them.

The only bad thing is that the English meaning of "depth" lends to "distance from outermost". But this is less important IMO.

We could support both by having negative numbers (e.g. ${index(-1)}) work as "distance from innermost", but that is probably not necessary. You'd also have to decide what to do with 0.


Maybe count should be removed, and people can just sum the output of length?

~Yes this might be a "less is more" scenario. You could potentially get what you need using expressions of index and length, like ${length(1)+index(2)} or ${length(1)*length(2)}, but we'd have to support math within ${..}.~ Edit: this doesn't work

Position Total
Current repetition index length
All repetitions count

Has it been considered to make theses values bind-able within the macro pattern? This would take depth out of the picture completely since the variables are syntactically attached to a repetition. Inspired by ngFor in Angular.

macro_rules! mac {
    ($[i=index, l=length]($word:ident),*) => {
        $(println!("{}/{}: {}", $i, $l, stringify!($word));)*
    };
}

camsteffen avatar Mar 16 '22 14:03 camsteffen

It also seems odd to me that index doesn't support an ident parameter but all the other functions do.

camsteffen avatar Mar 16 '22 14:03 camsteffen

First off, thanks to @c410-f3r for implementing this. Really excited to see it merged, and I'm going to try to find some time to try it out soon.

Thanks for all the feedback, too. I'm going to try to explain my thinking about this, as I spent some time thinking through these points when I was drafting the RFC. This was quite some time ago, so my memory is a little rusty (excuse the pun). Overall I think the RFC as it is currently is still the best way to do it, but perhaps we need to document it better to make it clearer how things work (especially count vs length).

It also seems odd to me that index doesn't support an ident parameter but all the other functions do.

Both index and length do not take ident parameters. This is because macro repetitions happen in lockstep, and so there is only one index and count for the current repetition, no matter which ident is involved. Contrast this with count where it does matter what ident it involved, as different idents can repeat different amounts in nested repetitions.

I find the semantics of count confusing and hard to keep track of. Which level is it counting? Does depth 1 mean count everything or count the 2nd nested loop? Does depth 1 mean count everything or count only the outermost loop? Maybe a name like total_count would be clearer? Maybe the depth option should not be optional? Maybe count should be removed, and people can just sum the output of length?

count was pretty much the reason I started this RFC in the first place. Part of the reason for metavar expressions is to avoid constructing large constant expressions that the compiler has to fold down, as these have a performance impacts and limitations with large numbers of repetitions which can be entirely avoided as the compiler already knows the answer.

It may be we need to improve documentation, as teaching things can sometimes be difficult. count can be thought of as "count the number of times this ident will appear". For most people there will be no need to use a depth parameter - the default of counting all depths will be what they want. The "limit to a particular number of depths" is likely only needed for special cases, when it will be worth learning the details of how it works. If we want to ensure values start at 0, I think ${count(indent, 0)} could be legitimately made to always evaluate to 1: i.e. how often does it repeat with no repetition, which is vacuously once, and shift everything down by one. We would need to distinguish "not provided" from "0" (perhaps effectively Option<usize>, where None means "all depths", although syntactically I think we would want to avoid having to write Some(2)).

We could support both by having negative numbers (e.g. ${index(-1)}) work as "distance from innermost", but that is probably not necessary. You'd also have to decide what to do with 0.

Negative numbers could work if that makes it more understandable. 0 would still mean "this depth", which is the default. In this case positive numbers >=1 would be meaningless, so the minus sign effectively becomes mandatory. Maybe another way would be to name it something different to make it clear it goes the other way (height, surround, level, ...).

The depth arguments are confusing because some of the functions count up from the most nested depth and some count down from the outermost level.

They all count from the current repetition level. count counts inwards, as it is concerned with the number of times the ident repeats within the repetition we're currently inside of. The default is "all of them", as that's the most common case. index and length count outwards as they are concerned with the place of the repetition we are currently inside of relative to the other repetitions it is itself part of. The default is the inner-most one, as, again, it's the most common case. I think this confusion perhaps lends some weight to using a different name for the depth parameters of index and length, but I don't have a good suggestion to hand.

Has it been considered to make theses values bind-able within the macro pattern?

That's really neat, but I do find it a little harder to understand (it's a bit un-Rust-lish). I'm also not sure how it would solve the count case or the ignore case. It's also quite a different design so we'd presumably need to spin through the RFC process again.

Yeap, someone that wrote ${count(foo)} will probably expect that ${count(foo, 1)} will return some value related to the amount of foos instead of the actual number of outer repetitions.

It is related to the number of foos - specifically it's the number of next-level repetitions that include foo, which might be different to the number of repetitions that include bar in the case of expressions like $( [ $( $( $foo ),* );* / $( $bar ),* ] )

markbt avatar Mar 16 '22 21:03 markbt

Okay I now understand how count is categorically different from index and length. count is for inner repetitions and index and length are for outer repetitions, relative to the current expansion context. An ident is necessary for specifying an inner repetition since there could be multiple (and multiple within those etc.). But outer repetitions only need a relative depth. With this distinction in mind, it makes sense to me now that depth can point in two directions. (sorry for lagging) I would either rename depth to distance, or use a different word for "going outward" cases like level.

With regard to my alternate syntax idea, that would only work for the "outward looking" functions index and length. To that end, I am even more convinced that this would be good. It resolves the discrepancy with depth so that it now has its intuitive meaning. Having a different syntax for inward and outward looking values lends to a not-conflated mental model. To be clear, I am suggesting to keep count as is, but replace index and length as described earlier.

Another discrepancy I found in my understanding is that I assumed that count(x) would give the number of occurrences of $x in the macro input, but rather it is the number of times $x is expanded? This is usually but not always the same, and I think input count would be generally more useful and less surprising.

In any case, pointing out the inward/outward distinction upfront as it applies to the functions would resolve confusion. I think I assumed that all functions can be used in both directions (which doesn't make sense now).

camsteffen avatar Mar 17 '22 04:03 camsteffen

For count, depth could be "outward distance from $x" rather than "inward distance from count(..)". I think that would be more consistent and intuitive.


Hope you don't mind one more alternate idea. Named repetitions. This allows for referencing repetitions that do not otherwise have an ident.

macro_rules! mac {
    ($h(hello)*) => {
        println!("hello {} times", ${count(h)});
    };
}
mac!(hello hello hello);

camsteffen avatar Mar 17 '22 06:03 camsteffen

A little pseudo illustration. Hope it helps

meta

// Another example

#![feature(macro_metavar_expr)]

macro_rules! mac {
    ( $( [ $( ( $($i:ident)* ) )* ] )* ) => {
        [
            // ****** 6 `ident` repetitions *****
            //
            // 6 (a, b, c, d, e f)
            ${count(i)},
            
            // ****** 3 `[...]` repetitions *****
            //
            // 2 (a, b)
            // 4 (c d e f)
            // 0
            $( ${count(i)}, )*
            
            // ****** 5 `(...)` repetitions *****
            //
            // 2 (a, b)
            // 0
            // 1 (c)
            // 0
            // 3 (d e f)
            $( $( ${count(i)}, )* )*
        ]
    }
}

fn main() {
    let array = mac!([(a b) ()] [(c) () (d e f)] []);
    dbg!(array);
}

c410-f3r avatar Mar 17 '22 12:03 c410-f3r

Issue: literal tokens emitted by meta-variable expression need to have correct hygiene (SyntaxContext).

For that the span assigned to them needs to be processed by Marker (see fn transcribe for examples).

petrochenkov avatar Mar 21 '22 05:03 petrochenkov

@petrochenkov that sounds like a blocker for stabilization, right? (If so, I'll add to the OP)

nikomatsakis avatar Mar 21 '22 21:03 nikomatsakis

@nikomatsakis Yes, it is a blocker.

mark-i-m avatar Mar 22 '22 01:03 mark-i-m

If you guys don't mind, I would like to at least try stabilizing the only two things that weren't part of a discussion: ${ignore} and $$. They open a range of new possibilities, don't have the inner/outer dilemma and are IMO as useful as any counting method.

c410-f3r avatar Apr 01 '22 14:04 c410-f3r

A stabilization attempt is available at https://github.com/rust-lang/rust/pull/95860

c410-f3r avatar Apr 09 '22 18:04 c410-f3r

$$crate may be surprising: #99035

CAD97 avatar Jul 08 '22 01:07 CAD97

Labeling as design concerns due to various discussions around semantics of $$ (e.g. $$crate), as well as questions about the bits not yet stabilized.

joshtriplett avatar Jul 20 '22 17:07 joshtriplett

when is this stabilized? it's so important

cybersoulK avatar Oct 15 '22 23:10 cybersoulK

I will probably review the concerns and try stabilization again in the next months

c410-f3r avatar Oct 15 '22 23:10 c410-f3r

The main blocking issue is determining the exact behavior we want out of $$crate and ensuring that's what's implemented. This is what reverted the first stabilization attempt, anyway.

This question is essentially whether $$crate should be equivalent to writing $crate in the crate where the macro which wrote $$crate or the crate which expanded that macro. https://github.com/rust-lang/rust/issues/99035

Linking some relevant comments:

  • @petrochenkov dislikes $$ impacting hygiene https://github.com/rust-lang/rust/pull/99445#issuecomment-1272302890
  • I don't model this as $$ impacting hygiene https://github.com/rust-lang/rust/pull/99445#issuecomment-1281172402
  • How to replace the $crate pseudo-token with a binder expansion, which will make things clearer https://github.com/rust-lang/rust/pull/99447#issuecomment-1204120473

TL;DR $$ cannot just be stabilized as-is without addressing $crate somehow.


The second question is how the metafunction interface should work; I don't think there's been a decision here. TL;DR is it ${ignore(binder)} or ${ignore($binder)}? (What "kind" level do the metafunctions operate at? It would be nice if this same syntax could be used for other metafunctionality in a somewhat uniform way; the big one is perhaps eager macro expansion.)

With a decision there, ${ignore} could be stabilized without $$.

CAD97 avatar Oct 17 '22 17:10 CAD97

Hello. Is there any progress on this issue? Maybe a min-stabilisation for index (is it possible to do it without solving other issues?)? As a user, I am dealing with a horrible super complex macro with 30 internal rules combining a push-down accumulator, a counter and quite a bunch of TT munching, to generate some description structures that are not that convoluted after all. Having the index variable would allow to get rid of a significant part of the complexity.

stephanemagnenat avatar Dec 16 '22 17:12 stephanemagnenat