rust-analyzer icon indicating copy to clipboard operation
rust-analyzer copied to clipboard

Incomplete syntax handling for proc-macros produces invalid `TokenStream` with `{`/`}` chars as a `Punct` rather than a `Group`

Open Veetaha opened this issue 1 year ago • 4 comments
trafficstars

rust-analyzer version: rust-analyzer version: 0.3.2129-standalon

rustc version: rustc 1.81.0 (eeb90cda1 2024-09-04)

editor or extension: VSCode v0.3.2129

I've stumbled on a bug when using my proc-macro #[bon::builder].

The minimal reproduction of this is:

#[my_lovely_macro]
fn example() {
    if 1
}

where my_lovely_macro is defined as this

use proc_macro::{TokenStream, TokenTree};

#[proc_macro_attribute]
pub fn my_lovely_macro(_params: TokenStream, item: TokenStream) -> TokenStream {
    let tokens: Vec<_> = item.clone().into_iter().collect();
    let TokenTree::Group(fn_block) = &tokens[3] else {
        unreachable!();
    };
    let fn_block_tokens: Vec<_> = fn_block.stream().into_iter().collect();

    if fn_block_tokens.len() > 2 {
        // This is the case where RA inserted a dummy {} block
        panic!("{fn_block_tokens:#?}");
    }

    item
}

If you take a look at the panic message from RA hints

it looks like this:

proc-macro panicked: [
    Ident {
        ident: "if",
        span: SpanData { range: 38..40, anchor: SpanAnchor(FileId(16777216), 2), ctx: SyntaxContextId(0) },
    },
    Literal {
        kind: Integer,
        symbol: "1",
        suffix: None,
        span: SpanData { range: 41..42, anchor: SpanAnchor(FileId(16777216), 2), ctx: SyntaxContextId(0) },
    },
    Punct {
        ch: '{',
        spacing: Alone,
        span: SpanData { range: 0..0, anchor: SpanAnchor(FileId(16777216), 4294967294), ctx: SyntaxContextId(0) },
    },
    Punct {
        ch: '}',
        spacing: Alone,
        span: SpanData { range: 0..0, anchor: SpanAnchor(FileId(16777216), 4294967294), ctx: SyntaxContextId(0) },
    },
]

Notice how there are two Punct items for { and } braces in this list? This is an invalid token tree. syn fails to parse this syntax, because curly braces are never meant to be Punct characters, they must always be balanced and they are only expected as part of a Group token tree.

This produced a panic in my bon::builder macro visible to the user writing the code if they have an incomplete if somewhere while writing the function body (the entire function is underlined in red and no IDE hints are provided). I'll fix that panic in bon separately to make my macro more bug-resilient to situations like this, but RA should also fix this.

Veetaha avatar Oct 05 '24 16:10 Veetaha

This probably comes from parser recovery. We need to either make it affect macro tt well, or better, not insert non-existing tokens.

ChayimFriedman2 avatar Oct 05 '24 16:10 ChayimFriedman2

It's not parser recovery; it's this: https://github.com/rust-lang/rust-analyzer/blob/5982d9c420d0dc90739171829f0d2e9c80d98979/crates/hir-expand/src/fixup.rs#L151-L161 (Original code by me, FIXME added by @Veykril and he's of course correct ;) )

flodiebold avatar Oct 05 '24 17:10 flodiebold

Some more context.

While digging a bit more I also uncovered a bug in proc_macro2 crate (https://github.com/dtolnay/proc-macro2/issues/470).

The standard library's proc_macro::Punct::new() explicitly panics if you pass an incorrect character to that method (see the code here). I suppose RA creates the proc_macro::Punct somehow unsafely by just relying on the Punct ABI layout, otherwise it's impossible to create a Punct with { and } characters.

The problem is worsened by the bug in proc_macro2 (https://github.com/dtolnay/proc-macro2/issues/470), which doesn't panic early when it encounters an invalid Punct.

Veetaha avatar Oct 05 '24 17:10 Veetaha

@rustbot release-assignment

This turned out way more nightmare-y than I thought and I have other things to do.

If anybody is interested in taking my work, my almost complete branch is on GitHub: https://github.com/ChayimFriedman2/rust-analyzer/tree/punct-brace. It passes all tests, but has a bug as the commit message explains. A simple fix may be to never omit a delimiter (and thus make bugs like this still possible, just more rare).

ChayimFriedman2 avatar Oct 06 '24 05:10 ChayimFriedman2

Encountered this myself (I think?). Is there any work around? It is unfortunate to have code completion not work with if statements, when a function has an attribute macro on it.

coolcatcoder avatar Feb 02 '25 01:02 coolcatcoder

@coolcatcoder

Is there any work around?

Yes. If you have code like this:

#[some_macro]
fn main () {
  // actual content
}

Then replace it with this code:

#[cfg_attr(not(rust_analyzer), some_macro)]
fn main () {
  // actual content
}

Also, you can just move actual function content to another function (this may work or not to work depending on macro). I. e. you can replace this:

#[some_macro]
fn main () {
  // actual content
}

With this:

fn real_main() {
  // actual content
}

#[some_macro]
fn main() {
  real_main()
}

safinaskar avatar Feb 02 '25 02:02 safinaskar

Thanks!

coolcatcoder avatar Feb 02 '25 03:02 coolcatcoder

You can just add the macro to the list of ignored macros, no need to cfg it.

ChayimFriedman2 avatar Feb 02 '25 09:02 ChayimFriedman2

@Veykril, when these changes will appear in nightly?

safinaskar avatar Mar 13 '25 11:03 safinaskar