rnix-parser icon indicating copy to clipboard operation
rnix-parser copied to clipboard

Handle /*-style comments in AST

Open nmattia opened this issue 3 years ago • 11 comments

This changes the Comment ast definition to handle comments like /* ... */, whereas before only # ...-style comments were handled without panicking. The implementation makes some assumptions and may break on non-conforming code, but well-formed comments should at least be handled without errors. A comment was added to flake.nix for testing (given the other comments, I assumed it was used for testing...).

To test this (before/after):

$ cargo run --quiet --example list-fns ./flake.nix

nmattia avatar Aug 23 '22 20:08 nmattia

The comments in the flake were put there by accident. You should create a test instead.

oberblastmeister avatar Aug 23 '22 21:08 oberblastmeister

Can you guide me a little here? I'm not very experienced with Rowan and at this point am just turning the knobs until something works.

This is what I have so far:

#[cfg(test)]
mod tests {
    use crate::Root;
    use crate::ast;
    use crate::match_ast;

    use super::*;
    use rowan::ast::AstNode;

    #[test]
    fn comment() {

        let inp = r#"# the comment"#;

        // fails here because no expression
        let expr = Root::parse(inp).ok().unwrap().expr().unwrap();
        let token = expr.syntax().first_token().unwrap();

        match_ast! {
            match token {
                ast::Comment(_comm) => {

                },
                _ => unreachable!(),
            }
        }
    }
}

how can I either just parse a "token", or alternatively parse a full expression and fish out the "token" without having to iterate through siblings as in list-fns?

nmattia avatar Aug 24 '22 08:08 nmattia

You can get the comments like this:

    let s = "# comment bruh
/* this is a multiline comment wow
asdfasdf
asdfasdf */
1 + 1
";
    let comments: Vec<_> = Root::parse(s)
        .ok()
        .unwrap()
        .syntax()
        .children_with_tokens()
        .filter_map(|e| match e {
            rowan::NodeOrToken::Token(token) => match_ast! { match token {
                ast::Comment(it) => Some(it.text().to_string()),
                _ => None,
            }},
            rowan::NodeOrToken::Node(_) => None,
        })
        .collect();

Some notes:

  • we use a useless placeholder expression so it parses
  • there are no siblings if it is the root expression, so use children_with_tokens

oberblastmeister avatar Aug 24 '22 15:08 oberblastmeister

The implementation makes some assumptions and may break on non-conforming code, but well-formed comments should at least be handled without errors.

Tokens must always be well formed or it is an internal bug in our tokenizer/parser.

oberblastmeister avatar Aug 24 '22 15:08 oberblastmeister

@oberblastmeister thanks a lot! I've added the test, let me know if this works for you.

nmattia avatar Aug 25 '22 09:08 nmattia

@nmattia Please address my comment

oberblastmeister avatar Aug 25 '22 14:08 oberblastmeister

@oberblastmeister which comment do you mean?

Tokens must always be well formed or it is an internal bug in our tokenizer/parser.

I've updated the comment in the code already

nmattia avatar Aug 25 '22 16:08 nmattia

@nmattia My review. I'm nitpicking but at least it removes an unwrap

oberblastmeister avatar Aug 25 '22 16:08 oberblastmeister

@oberblastmeister I'm sorry, I must really be blind today. I don't see a review submitted, nor any other comment than those:

  • The comments in the flake were put there by accident. You should create a test instead. https://github.com/nix-community/rnix-parser/pull/141#issuecomment-1224902486
  • You can get the comments like this: ... https://github.com/nix-community/rnix-parser/pull/141#issuecomment-1225847003
  • Tokens must always be well formed or it is an internal bug in our tokenizer/parser. https://github.com/nix-community/rnix-parser/pull/141#issuecomment-1225852589
  • Please address my comment
  • My review. I'm nitpicking but at least it removes an unwrap

I'm sure I'll facepalm as soon as you point out which one!

Re. the details, which unwrap do you want me to remove? Do you mean something like this?

{
        ...
        let foo = if text.starts_with("#") {
            text.strip_prefix('#')
        } else {
            text.strip_prefix(r#"/*"#).unwrap().strip_suffix(r#"*/"#)
        };

        foo.unwrap()
}

nmattia avatar Aug 25 '22 16:08 nmattia

@nmattia It's my fault. The review was pending and I forgot to submit it. You should see it now.

oberblastmeister avatar Aug 25 '22 17:08 oberblastmeister

Re. the details, which unwrap do you want me to remove? Do you mean something like this?

if s.starts_with("#") { s.strip_prefix("#").unwrap() } else { ... }

can be replaced by

match s.strip_prefix("#") { Some(s) => s, None => ... }

oberblastmeister avatar Aug 25 '22 17:08 oberblastmeister

@oberblastmeister there you go!

nmattia avatar Aug 26 '22 08:08 nmattia