grammars-v4

Added TypeScript PhpLexerBase for php grammar

Open tomaspiaggio opened this issue 1 year ago • 3 comments

I added the TypeScript base class (PhpLexerBase) needed to use the PHP grammar with the TypeScript target. It's very similar to the JavaScript one.

tomaspiaggio avatar Feb 29 '24 18:02 tomaspiaggio

This code doesn't compile, and even after fixing the compilation errors, it does not work.

Please make the following changes:

  • Add "TypeScript" to <targets> in desc.xml.
diff --git a/php/desc.xml b/php/desc.xml
index 67714256..2f5d45a0 100644
--- a/php/desc.xml
+++ b/php/desc.xml
@@ -1,5 +1,5 @@
 <?xml version="1.0" encoding="UTF-8" ?>
 <desc xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="../_scripts/desc.xsd">
    <antlr-version>^4.10</antlr-version>
-   <targets>CSharp;Java;Python3</targets>
+   <targets>CSharp;Java;Python3;TypeScript</targets>
 </desc>
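  • Modify the declaration of PhpLexerBase so that it is an abstract default export. In the diff below, the "<" line is the corrected declaration and the ">" line is the one from your change.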
$ diff Generated-TypeScript/PhpLexerBase.ts ./TypeScript/
6c6
< export default abstract class PhpLexerBase extends Lexer {
---
> export class PhpLexerBase extends Lexer {
  • Fix "here docs" (heredoc and nowdoc strings). They are not lexed correctly. This is what the TypeScript lexer produces:
$ bash run.sh ../examples/heredoc.php -tokens
[@0,0:4='<?php',<4>,channel=4,1:0]
[@1,5:6='\n\n',<39>,channel=4,1:5]
[@2,7:9='foo',<225>,3:0]
[@3,10:10='(',<212>,3:3]
[@4,11:21='<<< HEREDOC',<235>,3:4]
[@5,22:22='\n',<243>,3:15]
[@6,23:38='Heredoc line 1.\n',<243>,4:0]
[@7,39:54='Heredoc line 2.\n',<243>,5:0]
[@8,63:63='HEREDOC\n;',<213>,7:0]
[@9,64:64=';',<220>,7:1]
[@10,65:66='\n\n',<39>,channel=4,7:2]
[@11,67:69='foo',<225>,9:0]
[@12,70:70='(',<212>,9:3]
[@13,71:82='<<< 'NOWDOC'',<234>,9:4]
[@14,83:83='\n',<243>,9:16]
[@15,84:98='Nowdoc line 1.\n',<243>,10:0]
[@16,99:113='Nowdoc line 2.\n',<243>,11:0]
[@17,114:120='NOWDOC\n',<243>,12:0]
[@18,121:123=');\n',<243>,13:0]
[@19,124:124='\n',<243>,14:0]
[@20,125:139='$str = "asdf";\n',<243>,15:0]
[@21,140:140='\n',<243>,16:0]
[@22,141:160='$str1 = <<<HEREDOC1\n',<243>,17:0]
[@23,161:173='Hello world!\n',<243>,18:0]
[@24,174:183='HEREDOC1;\n',<243>,19:0]
[@25,184:184='\n',<243>,20:0]
line 21:0 token recognition error at: '?>'
[@26,187:186='<EOF>',<-1>,21:2]
TypeScript 0 ../examples/heredoc.php fail 0.021
Total Time: 0.043
03/01-17:16:12 ~/issues/g4-3994/php/Generated-TypeScript

Here is what it should be:

03/01-17:17:03 ~/issues/g4-3994/php/Generated-CSharp
$ bash run.sh ../examples/heredoc.php -tokens
[@0,0:4='<?php',<4>,channel=4,1:0]
[@1,5:6='\n\n',<39>,channel=4,1:5]
[@2,7:9='foo',<225>,3:0]
[@3,10:10='(',<212>,3:3]
[@4,11:21='<<< HEREDOC',<235>,3:4]
[@5,22:22='\n',<243>,3:15]
[@6,23:38='Heredoc line 1.\n',<243>,4:0]
[@7,39:54='Heredoc line 2.\n',<243>,5:0]
[@8,63:63='HEREDOC\n;',<213>,7:0]
[@9,64:64=';',<220>,7:1]
[@10,65:66='\n\n',<39>,channel=4,7:2]
[@11,67:69='foo',<225>,9:0]
[@12,70:70='(',<212>,9:3]
[@13,71:82='<<< 'NOWDOC'',<234>,9:4]
[@14,83:83='\n',<243>,9:16]
[@15,84:98='Nowdoc line 1.\n',<243>,10:0]
[@16,99:113='Nowdoc line 2.\n',<243>,11:0]
[@17,121:121='NOWDOC\n;',<213>,13:0]
[@18,122:122=';',<220>,13:1]
[@19,123:124='\n\n',<39>,channel=4,13:2]
[@20,125:128='$str',<224>,15:0]
[@21,129:129=' ',<39>,channel=4,15:4]
[@22,130:130='=',<221>,15:5]
[@23,131:131=' ',<39>,channel=4,15:6]
[@24,132:132='"',<233>,15:7]
[@25,133:136='asdf',<239>,15:8]
[@26,137:137='"',<233>,15:12]
[@27,138:138=';',<220>,15:13]
[@28,139:140='\n\n',<39>,channel=4,15:14]
[@29,141:145='$str1',<224>,17:0]
[@30,146:146=' ',<39>,channel=4,17:5]
[@31,147:147='=',<221>,17:6]
[@32,148:148=' ',<39>,channel=4,17:7]
[@33,149:159='<<<HEREDOC1',<235>,17:8]
[@34,160:160='\n',<243>,17:19]
[@35,161:173='Hello world!\n',<243>,18:0]
[@36,174:183='HEREDOC1;\n',<220>,19:0]
[@37,184:184='\n',<39>,channel=4,20:0]
[@38,185:186='?>',<220>,21:0]
[@39,187:186='<EOF>',<-1>,21:2]

CSharp 0 ../examples/heredoc.php success 0.0923946
Total Time: 0.2273878
03/01-17:17:16 ~/issues/g4-3994/php/Generated-CSharp

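Note the large number of missing tokens in your TypeScript port: from token 17 onward it emits whole lines as single <243> tokens instead of the individual tokens the CSharp target produces, and it fails to recognize the closing '?>'.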

NB: There is a bug in the other targets with the token text for the closing parentheses. That should be fixed, but you don't have to fix that.
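To locate where two token dumps like the ones above first disagree, a small script can help. The following is a minimal sketch, not part of the repository: the names `parseTokenLine`, `parseDump`, and `firstDivergence` are made up for illustration, and the regular expression assumes the `-tokens` line format shown in the dumps above.

```typescript
// Parsed form of one line of `-tokens` output, e.g.
// [@17,121:121='NOWDOC\n;',<213>,13:0]
interface TokenLine {
  index: number;
  start: number;
  stop: number;
  text: string;
  type: number;
  channel: number; // 0 when the dump omits "channel="
  line: number;
  column: number;
}

// Assumed format: [@index,start:stop='text',<type>(,channel=N)?,line:column]
const TOKEN_RE =
  /^\[@(\d+),(-?\d+):(-?\d+)='(.*)',<(-?\d+)>(?:,channel=(\d+))?,(\d+):(\d+)\]$/;

// Returns null for anything that is not a token line
// (shell prompts, error messages, timing lines).
function parseTokenLine(raw: string): TokenLine | null {
  const m = TOKEN_RE.exec(raw.trim());
  if (m === null) return null;
  return {
    index: Number(m[1]),
    start: Number(m[2]),
    stop: Number(m[3]),
    text: String(m[4]),
    type: Number(m[5]),
    channel: m[6] === undefined ? 0 : Number(m[6]),
    line: Number(m[7]),
    column: Number(m[8]),
  };
}

function parseDump(dump: string): TokenLine[] {
  return dump
    .split(/\r?\n/)
    .map((l) => parseTokenLine(l))
    .filter((t): t is TokenLine => t !== null);
}

// Position of the first token whose type, span, or channel differs,
// or -1 when both dumps agree token for token.
function firstDivergence(actual: TokenLine[], expected: TokenLine[]): number {
  const i = actual.findIndex((a, k) => {
    const e = expected[k];
    return (
      e === undefined ||
      a.type !== e.type ||
      a.start !== e.start ||
      a.stop !== e.stop ||
      a.channel !== e.channel
    );
  });
  if (i !== -1) return i;
  return expected.length > actual.length ? actual.length : -1;
}
```

Feeding it the saved output of both `run.sh` invocations (for example `firstDivergence(parseDump(tsOutput), parseDump(csOutput))`, where `tsOutput` and `csOutput` are hypothetical strings holding the two dumps) gives the position of the first token to investigate.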

  • The lexer grammar contains actions that are effectively target-specific: https://github.com/antlr/grammars-v4/blob/60b76c9c0ddaad55ea1fcd377ab289b2826451fb/php/PhpLexer.g4#L51-L52. For the TypeScript target, _scriptTag has to be written as this._scriptTag, and likewise for the other fields. (These assignments should be wrapped in methods of the base class, because "true" is not valid syntax in some targets, but we'll ignore fixing that for now.)
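As a rough illustration of the last point, here is a minimal sketch of how the field access could be moved behind a method of the base class. It is an assumption-laden example, not the actual fix: `Lexer` is a local stand-in for the runtime class so the snippet runs on its own, `markScriptTag`, `isScriptTag`, `DemoLexer`, and `onScriptOpen` are hypothetical names, and the `export default` from the corrected declaration is omitted so the sketch works as a single file.

```typescript
// Stand-in for the ANTLR runtime's Lexer so this sketch is self-contained;
// a real port would import Lexer from the runtime instead.
abstract class Lexer {}

abstract class PhpLexerBase extends Lexer {
  // Field named in the issue. In TypeScript it is only reachable
  // through `this`, so a bare `_scriptTag = true;` does not compile.
  protected _scriptTag = false;

  // Hypothetical wrapper: a grammar action could call
  // `this.markScriptTag();` instead of assigning `true` directly,
  // which keeps target-specific literals out of the grammar.
  markScriptTag(): void {
    this._scriptTag = true;
  }

  isScriptTag(): boolean {
    return this._scriptTag;
  }
}

// Stand-in for the generated lexer, which extends the base class.
class DemoLexer extends PhpLexerBase {
  // Plays the role of the action attached to a lexer rule.
  onScriptOpen(): void {
    this.markScriptTag();
  }
}

const demo = new DemoLexer();
demo.onScriptOpen();
console.log(demo.isScriptTag());
```

With a wrapper like this, each target's base class decides how to spell the assignment, and the grammar only needs a method call, written in whatever form the target requires.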

kaby76 avatar Mar 01 '24 22:03 kaby76

@teverett hi, thanks for reviewing this. I'm not too familiar with this repository. I had to implement this and it was working for what I was doing so I thought of contributing it. I fixed some of the problems but I'm not sure how to fix the HEREDOC problem. Could you point me in the right direction? Thank you!

tomaspiaggio avatar Mar 08 '24 22:03 tomaspiaggio

Note: the above changes take care of a number of problems, but the code still fails a couple of tests in the examples/ directory. I'm working out why.

kaby76 avatar Mar 09 '24 14:03 kaby76