Do we get comments from ohm?

Open fwx5618177 opened this issue 1 year ago • 7 comments

I'm trying to get the comments in the source code, but it looks like the CST doesn't include them.

fwx5618177 avatar May 23 '24 14:05 fwx5618177

Comments should definitely be included in your CST. Unfortunately, without seeing at least a snippet from your grammar, I can't tell you what's going on.

A common thing for people to do is to extend the space rule with whatever syntax they have for comments. As an example:

  space
   += comment

  comment
    = "/*" (~"*/" any)* "*/"  -- multiLine
    | "//" (~"\n" any)*       -- singleLine

Here, you'd be able to get to the comments by writing semantic actions for comment_multiLine and comment_singleLine.

Hope that helps!

alexwarth avatar May 23 '24 19:05 alexwarth

Now I've set up the grammar like this:

    ProgramItem = Struct
                | Contract
                | Primitive
                | StaticFunction
                | NativeFunction
                | ProgramImport
                | Trait
                | Constant
                | Comment
                
    Statement = StatementLet
              | StatementBlock
              | StatementReturn
              | StatementExpression
              | StatementAssign
              | StatementAugmentedAssign
              | StatementCondition
              | StatementWhile
              | StatementRepeat
              | StatementUntil
              | StatementTry
              | StatementForEach
              | Comment
    // Comments
    lineTerminator = "\n" | "\r\n" | "\r" | "\u2028" | "\u2029"
    Comment = MultiLineComment | SingleLineComment | SingleLineDocComment | SingleLineImportantComment
    MultiLineComment = "/*" (~"*/" any)* "*/"
    SingleLineComment = "//" ~(("!" | "/") any) (~lineTerminator any)*
    SingleLineDocComment = "///" (~lineTerminator any)*
    SingleLineImportantComment = "//!" (~lineTerminator any)*

Then I added these semantic actions:

semantics.addOperation<ASTNode>('resolve_program_item', {
   MultiLineComment(_open, commentText, _close) {
       return createNode({
           kind: 'multiLineComment',
           value: commentText.sourceString.trim(),
           ref: createRef(this),
       });
   },
   SingleLineComment(_open, commentText) {
       return createNode({
           kind: 'singleLineComment',
           value: commentText.sourceString,
           ref: createRef(this),
       });
   },
   SingleLineDocComment(_open, commentText) {
       return createNode({
           kind: 'singleLineDocComment',
           value: commentText.sourceString.trim(),
           ref: createRef(this),
       });
   },
   SingleLineImportantComment(_open, commentText) {
       return createNode({
           kind: 'singleLineImportantComment',
           value: commentText.sourceString.trim(),
           ref: createRef(this),
       });
   },

But this is the result I got:

{
 id: 3,
 kind: 'program',
 entries: [
   {
     id: 1,
     kind: 'multiLineComment',
     value: 'This is a multi-line comment',
     ref: ASTRef {}
   },
   {
     id: 2,
     kind: 'singleLineComment',
     value: 'This is a single line comment\n' +
       '                    //! This is a single line important comment\n' +
       '                    /// This is a single line doc comment\n' +
       '            fun testFunc(a: Int): Int {\n' +
       '                let b: Int = a == 123 ? 1 : 2;\n' +
       '                return b;\n' +
       '            }',
     ref: ASTRef {}
   }
 ]
}

fwx5618177 avatar May 24 '24 03:05 fwx5618177

image

fwx5618177 avatar May 24 '24 03:05 fwx5618177

Please take a look at this page, which discusses the difference between syntactic and lexical rules: https://ohmjs.org/docs/syntax-reference#syntactic-lexical

Your comment rules are syntactic (their names begin w/ a capital letter) which means they're implicitly skipping spaces. I don't think that's what you want.

(The idiom that I showed you in my first message, where you extend the space rule is a good one to use for this sort of thing.)

alexwarth avatar May 24 '24 06:05 alexwarth
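
To illustrate the point about implicit space skipping, here is a toy model of how `(~lineTerminator any)*` behaves in a lexical rule versus a syntactic one. It is a sketch only and does not reflect Ohm's actual implementation: `skipSpaces` and `matchCommentBody` are hypothetical helpers, and `lineTerminator` is reduced to `"\n"` for brevity.

```javascript
// Toy model only: NOT Ohm's implementation, just an illustration of how
// implicit space skipping changes what `(~lineTerminator any)*` consumes.

const isSpace = (ch) => /\s/.test(ch);

// Hypothetical helper: advance past whitespace, the way a syntactic rule
// does before each terminal or rule application.
function skipSpaces(input, pos) {
  while (pos < input.length && isSpace(input[pos])) pos++;
  return pos;
}

// Models `(~lineTerminator any)*` starting at `start`.
// `syntactic` toggles the implicit space skipping.
function matchCommentBody(input, start, syntactic) {
  let pos = start;
  while (true) {
    // ~lineTerminator: look ahead without consuming input.
    const look = syntactic ? skipSpaces(input, pos) : pos;
    if (look < input.length && input[look] === "\n") break;
    // any: consume one character (after skipping spaces if syntactic).
    const next = syntactic ? skipSpaces(input, pos) : pos;
    if (next >= input.length) break;
    pos = next + 1;
  }
  return input.slice(start, pos);
}

const input = "// first\n/// second\nfun f() {}";
console.log(JSON.stringify(matchCommentBody(input, 2, false)));
// → " first"
console.log(JSON.stringify(matchCommentBody(input, 2, true)));
// → " first\n/// second\nfun f() {}"
```

In the syntactic case the newline is skipped as whitespace before the lookahead ever inspects it, so the negative lookahead never fails and the comment body runs to the end of the input. That matches the `singleLineComment` value in the result posted above.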

Thanks, buddy. I'm currently writing a Prettier plugin for Tact.

fwx5618177 avatar May 24 '24 07:05 fwx5618177

I use this:

Comment {
    space += comment
    comment = "//" (~lineTerminator any)* -- singleLine
        | "/*" (~"*/" any)* "*/"  -- multiLine
    lineTerminator = "\n" | "\r\n" | "\r" | "\u2028" | "\u2029"
}

But it still throws an error:

const grammar = rawGrammar.createSemantics();
const semantics = grammar.addOperation('extractComment', {
    comment(arg0) {
        return arg0.sourceString;
    },
});
const matchResult = rawGrammar.match(`// This is a single line comment
1211
            const a = 1;
            `);
if (matchResult.failed()) {
    console.log('Error:', matchResult.message, matchResult.shortMessage);
}
const comment = semantics(matchResult).extractComment();

fwx5618177 avatar May 24 '24 07:05 fwx5618177

I'm trying to process comments too, and I'm a bit baffled by the comments from @alexwarth: as far as I can tell, anything under space is skipped, as per the comment in #448; certainly, whatever I do I can't seem to get a rule for a comment that is itself part of space to trigger.

That is, comments are available in a MatchResult that you get back from grammar.match, but not in FormatterOperations which is returned by a semantics. So you can't write semantic actions for them.

rrthomas avatar Aug 27 '24 20:08 rrthomas

We used to support this, but we temporarily disabled the feature at some point and it was never properly added back. We should definitely do it though. Tracking that here: #520.

pdubroy avatar Jul 29 '25 11:07 pdubroy
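
Until semantic actions can reach comments that are matched as part of `space`, one possible stopgap is to collect comments straight from the source text, independently of Ohm. The following is a minimal sketch under assumptions: `collectComments` is a hypothetical helper (not part of ohm-js), it mirrors the `//` and `/* ... */` forms from the grammar above, it does not recognize comment markers inside string literals, and it treats an unterminated `/*` as running to the end of the input.

```javascript
// Stopgap sketch: scan the raw source for comments and record their offsets.
// `collectComments` is a hypothetical helper, not an ohm-js API.
function collectComments(source) {
  const lineTerminators = "\n\r\u2028\u2029";
  const comments = [];
  let i = 0;
  while (i < source.length) {
    if (source.startsWith("//", i)) {
      // Single-line comment: runs up to (not including) the line terminator.
      let end = i;
      while (end < source.length && !lineTerminators.includes(source[end])) end++;
      comments.push({ kind: "singleLine", start: i, end, text: source.slice(i, end) });
      i = end;
    } else if (source.startsWith("/*", i)) {
      // Multi-line comment: runs through the closing "*/" if there is one.
      const close = source.indexOf("*/", i + 2);
      const end = close === -1 ? source.length : close + 2;
      comments.push({ kind: "multiLine", start: i, end, text: source.slice(i, end) });
      i = end;
    } else {
      i++;
    }
  }
  return comments;
}

const found = collectComments("/* a */\nlet x = 1; // b\n");
console.log(found.map((c) => `${c.kind}: ${c.text}`));
```

The recorded `start` and `end` offsets could then be used to associate each comment with nearby nodes, for example in a formatter that needs to reattach comments to the tree.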