grammars-v4
[PlSql] "REM", "REMARK", "PRO", "PROMPT" cannot be used as identifiers
REM, REMARK, PRO, PROMPT
These words cannot be used as identifiers. For example:
SELECT REMARK FROM T1
Antlr fails to parse this SQL, but it is valid SQL in Oracle. I checked PlSqlLexer.g4, and these words are not defined as keywords there. So what is going on?
It seems this code is the cause:
// https://docs.oracle.com/cd/E11882_01/server.112/e16604/ch_twelve034.htm#SQPUG054
REMARK_COMMENT: 'REM' {this.IsNewlineAtPos(-4)}? 'ARK'? (' ' ~('\r' | '\n')*)? NEWLINE_EOF -> channel(HIDDEN);
// https://docs.oracle.com/cd/E11882_01/server.112/e16604/ch_twelve032.htm#SQPUG052
PROMPT_MESSAGE: 'PRO' {this.IsNewlineAtPos(-4)}? 'MPT'? (' ' ~('\r' | '\n')*)? NEWLINE_EOF;
line 2450 in PlSqlLexer.g4
CREATE TABLE AGMT (LIMIT_FLAG VARCHAR2(16), REMARK VARCHAR2(512))
This is OK, but
CREATE TABLE AGMT (LIMIT_FLAG VARCHAR2(16),
REMARK VARCHAR2(512))
this is not.
The only difference is the \n before REMARK in the second statement.
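To make the difference concrete, here is a rough Python approximation of what the REMARK_COMMENT rule appears to match. It is only a sketch: it assumes that IsNewlineAtPos(-4) is true when the character before 'REM' is a newline or the start of the input, and that NEWLINE_EOF means a line break or end of input. The names REMARK_COMMENT_APPROX and swallowed_as_remark are made up for this illustration and are not part of the grammar.

```python
import re

# Approximation (assumption) of the lexer rule:
#   'REM' {newline before}? 'ARK'? (' ' ~('\r' | '\n')*)? NEWLINE_EOF
# '^' with re.MULTILINE stands in for the IsNewlineAtPos(-4) predicate.
REMARK_COMMENT_APPROX = re.compile(
    r"^REM(?:ARK)?(?: [^\r\n]*)?(?:\r?\n|\Z)",
    re.MULTILINE,
)


def swallowed_as_remark(sql):
    """Return the text this approximation would hide as a REMARK comment."""
    return [m.group(0).rstrip("\r\n") for m in REMARK_COMMENT_APPROX.finditer(sql)]


one_line = "CREATE TABLE AGMT (LIMIT_FLAG VARCHAR2(16), REMARK VARCHAR2(512))"
two_lines = "CREATE TABLE AGMT (LIMIT_FLAG VARCHAR2(16),\nREMARK VARCHAR2(512))"

print(swallowed_as_remark(one_line))   # nothing is hidden
print(swallowed_as_remark(two_lines))  # the whole second line is hidden
```

If the assumptions hold, the second statement loses the REMARK column definition and the closing parenthesis to the hidden channel, which would explain the parse error.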
@kaby76 Hello, I see that you modified this code about 3 months ago. Could you please help me figure this out in your spare time? Thank you so much!
The change I made is unrelated to this problem. All I did was rename self. to this. in those two rules in order to put the grammar into "target agnostic format".
REMARK_COMMENT was added long before, first here: https://github.com/antlr/grammars-v4/commit/3f0150f57505dde0792739e79c0030a8c912e425#diff-f0a9ac045571f25a6a3533b5c47d6287b7096503195911cb1e1e927a6a5a12c9R2324
A predicate was then added the same day here: https://github.com/antlr/grammars-v4/commit/356f3ea19e3c62fa92e1f3c7997daa8ec7711ad9#diff-f0a9ac045571f25a6a3533b5c47d6287b7096503195911cb1e1e927a6a5a12c9R2328
Then, it was changed again the day after to what it was until I changed it: https://github.com/antlr/grammars-v4/commit/356f3ea19e3c62fa92e1f3c7997daa8ec7711ad9#diff-f0a9ac045571f25a6a3533b5c47d6287b7096503195911cb1e1e927a6a5a12c9R2328
I will look at it over the weekend. My first impression, though, is that REM and PRO require parser-state-aware lexing: it is not enough to look for a preceding newline character; you also have to verify that the word is not part of a statement. This is one of the things Antlr does not do well at all.
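As a rough illustration of what "state-aware" could mean here, the Python sketch below classifies script lines while tracking whether a statement is still open. It is a heuristic under stated assumptions, not how the grammar or SQL*Plus works: a statement is assumed to end at a line ending in ";" or at a line containing only "/", matching is case-insensitive, and string literals and PL/SQL blocks are ignored. The names COMMAND and classify_lines are hypothetical.

```python
import re

# Hypothetical pattern for a line that starts with REM, REMARK, PRO, or PROMPT
# followed by a space or the end of the line.
COMMAND = re.compile(r"^(REM(ARK)?|PRO(MPT)?)( |$)", re.IGNORECASE)


def classify_lines(script):
    """Label each line 'command' or 'sql', tracking whether a statement is open."""
    in_statement = False
    result = []
    for line in script.splitlines():
        stripped = line.strip()
        # Only treat REM/PRO as a command when no statement is in progress.
        if not in_statement and COMMAND.match(line):
            result.append(("command", line))
            continue
        result.append(("sql", line))
        if stripped:
            # Assumption: ';' at end of line or a lone '/' terminates a statement.
            in_statement = not (stripped.endswith(";") or stripped == "/")
    return result


script = (
    "REM setup\n"
    "CREATE TABLE AGMT (LIMIT_FLAG VARCHAR2(16),\n"
    "REMARK VARCHAR2(512));\n"
    "PROMPT done"
)
for kind, text in classify_lines(script):
    print(kind, "|", text)
```

With this tracking, the REMARK line inside the open CREATE TABLE statement is kept as SQL, while the REM and PROMPT lines outside any statement are treated as commands. An Antlr lexer rule has no such statement state available by default, which is the difficulty described above.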
@KvanTTT OK, I get it. Thanks for your reply. My current workaround is to temporarily remove these two lexer rules.
I think the problem here is that REMARK and PROMPT should be considered commands. They're not really comments. So, I think you're right, the rules should not be there in the lexer.