The syntax parsing time and performance did not meet my expectations

Open cmmjxh opened this issue 1 year ago • 2 comments

Hello, I have defined a custom grammar with ANTLR4 (4.13.2, Java target), but each parse takes about 50 ms. Our project can only tolerate about 5 ms per parse. I am not sure whether the problem is that our grammar definition is inaccurate or whether this is simply the best performance ANTLR4 can deliver. Can you help answer my question?

`grammar DataFusion;

@header{ package org.example.code; }

options { language = Java; }

// DataFusion grammar definition: jql

jql: elements end ;

elements: element (',' element)* ; // Simplified to allow for easier parsing of multiple elements

element: ID ':' (strings | constant | function | expr) // Simplified and removed unnecessary alternatives
    | all
    | ignore
    ;

expr: term (op=('+'|'-') term)* | factor (op=('*'|'/') factor)* | number | strings | function | '(' expr ')' ;

term: factor (op=('+'|'-') factor) *;

factor: number | strings | function | '(' expr ')' | ID;

function: AGGOPER '(' argument ')' | IFNULL '(' number ',' strings ',' argument ')' | CONCAT '(' strings (',' strings)* ')' ;

argument: strings | number | expr | constant ;

number: INTEGER | FLOAT ;

strings : ID ;

end : ';';

constant: '\'' ID '\'' | '\'' '\'' | '\'' (INTEGER | FLOAT) '\'';

all : '*' | ID '*' ;

ignore : '-$'ID | '-$'all;

AGGOPER: 'SUM' | 'sum' | 'AVG' | 'avg' | 'COUNT' | 'count' | 'MAX' | 'max' | 'MIN' | 'min' ;

IFNULL: 'IFNULL' | 'ifnull' ;

CONCAT: 'CONCAT' | 'concat';

FLOAT : '-'? DIGIT+ '.' DIGIT+ ;

fragment DIGIT : [0-9] ;

ID : [a-zA-Z0-9_$\u0080-\uffff.]+ ;

Grammar_EOF: ';' ;

WS: [ \t\r\n]+ -> skip;`

Parsing time: 57 ms
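
The 57 ms figure may come from a single cold parse. On the JVM the first parse usually pays for class loading and JIT warm-up, and ANTLR4 fills its prediction cache lazily as it sees input, so early parses tend to be slower than later ones. The following is a minimal sketch for separating the two, not the measurement used in this issue; `ParseTiming`, `meanMillis`, and `parseOnce` are hypothetical names, and `parseOnce` is assumed to build a fresh lexer and parser and call the start rule.

```java
import java.util.function.Consumer;

public final class ParseTiming {
    private ParseTiming() {}

    /**
     * Calls parseOnce on the same input `warmup` times without timing,
     * then `measured` times with timing, and returns the mean time per
     * measured call in milliseconds.
     */
    public static double meanMillis(Consumer<String> parseOnce, String input,
                                    int warmup, int measured) {
        if (measured <= 0) {
            throw new IllegalArgumentException("measured must be positive");
        }
        // Untimed runs: let the JIT and any parser-side caches warm up.
        for (int i = 0; i < warmup; i++) {
            parseOnce.accept(input);
        }
        long start = System.nanoTime();
        for (int i = 0; i < measured; i++) {
            parseOnce.accept(input);
        }
        long elapsedNanos = System.nanoTime() - start;
        return elapsedNanos / 1_000_000.0 / measured;
    }
}
```

Comparing `meanMillis(parse, text, 0, 1)` with `meanMillis(parse, text, 1000, 1000)` gives a rough idea of how much of the cost is one-time start-up rather than steady-state parsing.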

cmmjxh • Aug 26 '24 02:08

The grammar you provided is not valid; the ANTLR tool rejects it. Please use correct Markdown syntax for code blocks.

kaby76 • Aug 26 '24 12:08

The grammar looks highly ambiguous, and the precedence looks incorrect. Use the recognizer options to report ambiguities as you parse, then fix the grammar. After that you can use SLL mode and should reach your required parsing times easily. You have to work at a grammar and understand it to get good performance.
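
A rough sketch of what those two steps can look like with the ANTLR4 Java runtime. `ParseModes`, `reportAmbiguities`, and `parseSllFirst` are hypothetical helper names; the fallback from SLL to full LL is a commonly used pattern and an assumption here, not something spelled out in this thread. The code needs the ANTLR4 runtime on the classpath.

```java
import java.util.function.Function;

import org.antlr.v4.runtime.BailErrorStrategy;
import org.antlr.v4.runtime.DefaultErrorStrategy;
import org.antlr.v4.runtime.DiagnosticErrorListener;
import org.antlr.v4.runtime.Parser;
import org.antlr.v4.runtime.atn.PredictionMode;
import org.antlr.v4.runtime.misc.ParseCancellationException;

public final class ParseModes {
    private ParseModes() {}

    /** Development aid: report every exact ambiguity found while parsing. */
    public static void reportAmbiguities(Parser parser) {
        // Keep the default console listener; DiagnosticErrorListener
        // forwards its messages through the parser's error listeners.
        parser.addErrorListener(new DiagnosticErrorListener());
        parser.getInterpreter()
              .setPredictionMode(PredictionMode.LL_EXACT_AMBIG_DETECTION);
    }

    /** Try fast SLL prediction first; fall back to full LL only on failure. */
    public static <P extends Parser, T> T parseSllFirst(P parser,
                                                        Function<P, T> startRule) {
        parser.getInterpreter().setPredictionMode(PredictionMode.SLL);
        parser.setErrorHandler(new BailErrorStrategy()); // abort on first error
        try {
            return startRule.apply(parser);
        } catch (ParseCancellationException e) {
            // SLL failed: either a real syntax error or an SLL-only conflict.
            parser.reset(); // also rewinds the token stream to the start
            parser.setErrorHandler(new DefaultErrorStrategy());
            parser.getInterpreter().setPredictionMode(PredictionMode.LL);
            return startRule.apply(parser);
        }
    }
}
```

With the parser generated from the grammar in this issue, the call would look like `ParseModes.parseSllFirst(parser, DataFusionParser::jql)`. Run `reportAmbiguities` during development only, since exact ambiguity detection slows parsing down.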

jimidle • Aug 27 '24 18:08