Sample for parsing comments
Hi,
I was looking for an example of parsing comments.
Single line // And this will be a comment
And multiline
/*
Comment
/*
Still a comment
*/
Closing of the second comment, but still in the first comment
*/
This might require some "counting" and a custom parser context to maintain the counter.
Example in Fluid: https://github.com/sebastienros/fluid/blob/main/Fluid/FluidParser.cs#L226-L247
This is the context class: https://github.com/sebastienros/fluid/blob/main/Fluid/Parser/FluidParseContext.cs
The idea is to increment a counter when you find /* and decrement it when you find */. The comment only ends when the counter is back to zero; until then the parser is still inside a comment. If you reach EOF while the counter is still positive, you can either treat everything as a comment or return an error.
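For illustration, here is a minimal standalone sketch of the counter idea. It is not Fluid's or Parlot's actual code: the class and method names are made up, the counter is a local variable rather than a field on a custom parse context, and an unterminated comment is reported as an error (-1) rather than treated as a comment.

```csharp
using System;

public static class NestedCommentSketch
{
    // Hypothetical helper: returns the index just past the comment that starts
    // at `start`, or -1 if `start` is not on "/*" or the comment never closes.
    public static int SkipNestedComment(string text, int start)
    {
        if (start < 0 || start + 1 >= text.Length ||
            text[start] != '/' || text[start + 1] != '*')
        {
            return -1; // not positioned on a comment opener
        }

        int depth = 0;
        int i = start;

        while (i < text.Length)
        {
            bool hasNext = i + 1 < text.Length;

            if (hasNext && text[i] == '/' && text[i + 1] == '*')
            {
                depth++;   // entering a (possibly nested) comment
                i += 2;
            }
            else if (hasNext && text[i] == '*' && text[i + 1] == '/')
            {
                depth--;   // leaving one nesting level
                i += 2;
                if (depth == 0)
                {
                    return i; // counter is back to zero: outermost comment closed
                }
            }
            else
            {
                i++;       // ordinary character inside the comment
            }
        }

        return -1; // EOF while the counter is still positive
    }
}
```

With the input `/* a /* b */ c */ x` and `start = 0`, the inner `*/` only brings the counter down to 1, so scanning continues until the second `*/`. Returning -1 at EOF is one of the two options mentioned above; returning `text.Length` instead would treat the rest of the input as a comment.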
@jorgeleo I implemented a simple, configurable comment parser:
// Last argument true: nested /* ... */ comments are allowed.
WhiteSpaceWithCommentsParser = Literals.ExtendedWhiteSpace(
    s => s.SkipWhiteSpaceOrNewLine(),
    s => s.SkipSingleLineComment("--"),
    s => s.SkipMultiLineComment("/*", "*/", true)
).Then(static x => x.ToString());

// Last argument false: no nesting, as the parser name says.
WhiteSpaceWithCommentsParserNoNesting = Literals.ExtendedWhiteSpace(
    s => s.SkipWhiteSpaceOrNewLine(),
    s => s.SkipSingleLineComment("--"),
    s => s.SkipMultiLineComment("/*", "*/", false)
).Then(static x => x.ToString());
https://github.com/lampersky/UsefulParlotParsers/blob/main/src/Lampersky.UsefulParlotParsers.Tests/ExtendedWhiteSpaceParserTests.cs
/cc @sebastienros
I added this so we can set any parser to handle WS/comments.
Next is to have these parsers (@lampersky) available to pass to this new extension. I was thinking of making independent parsers that are optimized for parsing:
- using a vector search on \n|EOF, reusing NoneOf('\n'), but maybe a dedicated one that takes a whole string instead of chars
- anything before this text (*/), again using the same new parser
Then, with the new parser, we could add custom helpers for building standard comment support in the extensions.
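As a rough sketch of what such search-based helpers could look like outside Parlot: the class and method names below are hypothetical, and the only library calls are the standard `ReadOnlySpan<char>.IndexOf` overloads, which, as far as I know, recent .NET runtimes vectorize where the hardware allows. This only covers the non-nested case.

```csharp
using System;

public static class CommentSearchSketch
{
    // Length of a single-line comment body: everything up to, but not
    // including, the next '\n', or up to the end of input (EOF).
    public static int SingleLineLength(ReadOnlySpan<char> input)
    {
        int newline = input.IndexOf('\n');
        return newline < 0 ? input.Length : newline;
    }

    // Length of the text before `terminator` (for example "*/"),
    // or -1 when the terminator does not occur in the input.
    public static int LengthBefore(ReadOnlySpan<char> input, ReadOnlySpan<char> terminator)
    {
        return input.IndexOf(terminator, StringComparison.Ordinal);
    }
}
```

Taking the terminator as a whole string rather than a set of chars means the same helper can serve `*/` or any other closing token.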
@sebastienros WithWhiteSpaceParser works great! If you have time, please check whether the approach of putting the comment-parsing logic inside the scanner (and parser) is OK, or whether you have something else in mind.
Here, all comment styles known from SQL are tested.
When will Parlot 1.5.2 with those latest extensions be shipped?
Are there any languages that support nested comments? I tried C#, JS, and SQL with no luck.
This is a screenshot from MS SQL Server Management Studio:
As you can see, you can nest a multi-line comment inside another one; in C# and JS it won't work.
Copilot tells me otherwise. Maybe it's just a feature of SQL Server Management Studio.
In SQL, multi-line comments are typically enclosed between /* and */. However, nested multi-line comments (placing one multi-line comment inside another) are not supported in most SQL implementations, including SQL Server.
The docs say it is supported: https://learn.microsoft.com/en-us/sql/t-sql/language-elements/slash-star-comment-transact-sql?view=sql-server-ver16#remarks