
`format_macro_matchers` unusual behavior with commas

Open tgross35 opened this issue 2 years ago • 3 comments

For a struct-like macro:

macro_rules! foo {
    (
        foo: $foo:literal,
        bar: $bar:literal,
        baz: $baz:literal,
        qux: $qux:literal,
        quux: $quux:literal,
        corge: $corge:literal,
    ) => {};
}

Formatting with format_macro_matchers = true produces weird results:

macro_rules! foo {
    (
        foo:
        $foo:literal,bar:
        $bar:literal,baz:
        $baz:literal,qux:
        $qux:literal,quux:
        $quux:literal,corge:
        $corge:literal,
    ) => {};
}

It seems like it always removes whitespace after a comma rather than using it as a possible break location. Single-line macros have similarly weird results:

macro_rules! foo {
    (foo: $foo:ident,bar: $bar:ident,baz: $baz:ident,qux: $qux:ident,) => {};
}

rustfmt 1.7.0-nightly (f704f3b9 2023-12-19)

tgross35, Dec 21 '23 04:12

Linking the tracking issue for format_macro_matchers (https://github.com/rust-lang/rustfmt/issues/3354)

ytmimi, Dec 21 '23 17:12

Problem

This seems to be caused by the intersection of two things.

First, an assumption in the macro parsing code: we always insert a 'separator' before parsing a meta-variable ($foo). See https://github.com/rust-lang/rustfmt/blob/728939191e4218e2c1296c7ba3eb36590cbcb9bd/src/macros.rs#L907-L910. This 'forces' the formatter to try to keep the $idents at the start of each line.

Second, what seems to me like a bug in the token rules. Running the provided snippet through the macro parser gives the following tokens:

[ParsedMacroArg { kind: Delimited(Parenthesis, [
    ParsedMacroArg { kind: Separator("foo:", "") },
    ParsedMacroArg { kind: MetaVariable("literal", "foo") },
    ParsedMacroArg { kind: Separator(", bar:", "") },
    ParsedMacroArg { kind: MetaVariable("literal", "bar") },
// the tokens below just repeat the same pattern...
    ParsedMacroArg { kind: Separator(", baz:", "") },
    ParsedMacroArg { kind: MetaVariable("literal", "baz") },
    ParsedMacroArg { kind: Separator(", qux:", "") },
    ParsedMacroArg { kind: MetaVariable("literal", "qux") },
    ParsedMacroArg { kind: Separator(", quux:", "") },
    ParsedMacroArg { kind: MetaVariable("literal", "quux") },
    ParsedMacroArg { kind: Separator(", corge:", "") },
    ParsedMacroArg { kind: MetaVariable("literal", "corge") },
    ParsedMacroArg { kind: Other(",", "") }
]) }]

What Happens

I believe entries like this one are the root of the problem: ParsedMacroArg { kind: Separator(", bar:", "") },

This expression (, bar:) should be parsed into a Separator(", ") and an Other("bar:"), but it isn't. Because of this, the entire thing gets treated as if it were just a comma.

This is why, when rewriting the expression, it emits $foo:literal,bar: because, as far as it knows, it is only writing $foo:literal followed by a comma.

The solution (Part of it at least)

I'm going to try to update the token rules so that , bar: gets parsed as suggested above: into a Separator(", ") and an Other("bar:").
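To make the intended split concrete, here is a rough, standalone sketch of the idea. It is a toy model, not rustfmt's code: the `Piece` enum and the `split_separator` function are hypothetical names invented for illustration, and the real `ParsedMacroArg` / token rules in src/macros.rs look different. It also assumes that text with no leading comma (like the initial foo:) would end up as an Other.

```rust
// Toy model only: `Piece` and `split_separator` are hypothetical and are
// NOT part of rustfmt. They just illustrate the proposed split.
#[derive(Debug, PartialEq)]
enum Piece {
    /// A real separator, normalized to a comma plus one space.
    Separator(String),
    /// Any other literal text in the matcher, e.g. `bar:`.
    Other(String),
}

/// Splits raw separator text such as ", bar:" into a comma separator
/// followed by the remaining text, instead of keeping it as one unit.
fn split_separator(raw: &str) -> Vec<Piece> {
    let mut pieces = Vec::new();
    let rest = match raw.strip_prefix(',') {
        Some(after_comma) => {
            pieces.push(Piece::Separator(", ".to_string()));
            // Drop the whitespace that followed the comma; the
            // normalized separator already carries one space.
            after_comma.trim_start()
        }
        // No leading comma: treat the whole text as ordinary tokens
        // (an assumption made for this sketch).
        None => raw,
    };
    if !rest.is_empty() {
        pieces.push(Piece::Other(rest.to_string()));
    }
    pieces
}

fn main() {
    for raw in [", bar:", "foo:", ","] {
        println!("{:?} -> {:?}", raw, split_separator(raw));
    }
}
```

With the comma as its own piece, a formatter could in principle treat the space after it as a break point, rather than gluing bar: to the preceding $foo:literal,.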

InsertCreativityHere, Apr 15 '24 14:04

@InsertCreativityHere thanks for taking the time to look into this, and for the clear explanation of the issue

ytmimi, Apr 15 '24 15:04