
Fix the errorIfExists keyword not matching in DSLSQLLexer

Open • hellozepp opened this issue 1 year ago • 1 comment

Issue description

  1. In the Byzer project, the DSLSQL grammar defines both a subrule, errorIfExists: 'errorIfExists', and a token for the same keyword, ERRORIfExists:'errorIfExists'. Byzer's CaseChangingStream converts every input character to lowercase, so the mixed-case literal 'errorIfExists' can never equal the lowercased input, and the ERRORIfExists keyword is not recognized properly in some contexts.
  2. Furthermore, because both the subrule and the token exist, the longest-match principle and rule priority order mean that, in the statement rule save (overwrite | append | errorIfExists | ignore)*, the keyword may be matched through the errorIfExists subrule instead of the ERRORIfExists token in some contexts, which leads to incorrect parsing.
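
The first point can be illustrated with a small Python sketch. This is not Byzer's or ANTLR's code: `CaseChangingStream` below is a hypothetical stand-in that assumes the stream lowercases every character the lexer looks ahead at, and `matches_literal` is a hypothetical helper that mimics a character-by-character comparison against a token literal.

```python
class CaseChangingStream:
    """Hypothetical stand-in: lowercases lookahead characters, keeps the original text."""

    def __init__(self, text: str) -> None:
        self.text = text
        self.pos = 0

    def la(self, offset: int) -> str:
        """Return the lowercased character `offset` positions ahead (1-based), or '' at end of input."""
        index = self.pos + offset - 1
        if index >= len(self.text):
            return ""
        return self.text[index].lower()


def matches_literal(stream: CaseChangingStream, literal: str) -> bool:
    """Compare a token literal with the stream's lookahead, one character at a time."""
    return all(stream.la(i + 1) == ch for i, ch in enumerate(literal))


# The mixed-case literal from the grammar, tried against three spellings of the keyword.
inputs = ["errorIfExists", "ERRORIFEXISTS", "errorifexists"]
results = [matches_literal(CaseChangingStream(s), "errorIfExists") for s in inputs]
print(results)
```

Under that assumption the literal fails for every spelling, because the lookahead never yields the uppercase `I` or `E` that the literal contains.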

Steps to reproduce

  1. Define a SQL statement that uses the save rule, such as SAVE errorIfExists table_name as.

Proposed solution

To resolve this issue, we propose unifying the errorIfExists keyword: change the token definition to ERRORIFEXISTS:'errorifexists' and use the lowercase string literal consistently throughout the DSLSQL grammar.
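
A rough Python sketch of the intended effect, assuming the input is lowercased before matching as described in the issue. The table `SAVE_MODE_TOKENS` and the helper `save_mode_token` are hypothetical, and the token names other than ERRORIFEXISTS are illustrative rather than taken from the grammar.

```python
from typing import Optional

# Hypothetical token table: every literal is all-lowercase, as the proposal suggests.
SAVE_MODE_TOKENS = {
    "overwrite": "OVERWRITE",          # assumed token name
    "append": "APPEND",                # assumed token name
    "errorifexists": "ERRORIFEXISTS",  # token name from the proposal
    "ignore": "IGNORE",                # assumed token name
}


def save_mode_token(word: str) -> Optional[str]:
    """Lowercase the word first (as the case-changing stream would), then look it up."""
    return SAVE_MODE_TOKENS.get(word.lower())


tokens = [
    save_mode_token(w)
    for w in ("errorIfExists", "ERRORIFEXISTS", "errorifexists", "data")
]
print(tokens)
```

With lowercase literals, every spelling of the keyword resolves to the same token, while a non-keyword such as `data` resolves to nothing.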

hellozepp • Feb 28 '23 09:02

overwrite Test Case

set jsonStr1='''
{"features":[5.1,3.5,1.4,0.2],"label":0.0},
{"features":[5.1,3.5,1.4,0.2],"label":1.0}
{"features":[5.1,3.5,1.4,0.2],"label":0.0}
{"features":[4.4,2.9,1.4,0.2],"label":0.0}
{"features":[5.1,3.5,1.4,0.2],"label":1.0}
{"features":[5.1,3.5,1.4,0.2],"label":0.0}
{"features":[5.1,3.5,1.4,0.2],"label":0.0}
{"features":[4.7,3.2,1.3,0.2],"label":1.0}
{"features":[5.1,3.5,1.4,0.2],"label":0.0}
{"features":[5.1,3.5,1.4,0.2],"label":0.0}
''';

load jsonStr.`jsonStr1` as data;
save overwrite data as json.`/tmp/jack` where fileNum="10";


ignore Test Case

set jsonStr1='''
{"features":[5.1,3.5,1.4,0.2],"label":0.0},

''';

load jsonStr.`jsonStr1` as data;
save ignore data as json.`/tmp/jack` where fileNum="10";
load json.`/tmp/jack` as jackData;
select count(1) from jackData as output;


errorIfExists Test Case

set jsonStr1='''
{"features":[5.1,3.5,1.4,0.2],"label":0.0},
{"features":[5.1,3.5,1.4,0.2],"label":1.0}
{"features":[5.1,3.5,1.4,0.2],"label":0.0}
{"features":[4.4,2.9,1.4,0.2],"label":0.0}
{"features":[5.1,3.5,1.4,0.2],"label":1.0}
{"features":[5.1,3.5,1.4,0.2],"label":0.0}
{"features":[5.1,3.5,1.4,0.2],"label":0.0}
{"features":[4.7,3.2,1.3,0.2],"label":1.0}
{"features":[5.1,3.5,1.4,0.2],"label":0.0}
{"features":[5.1,3.5,1.4,0.2],"label":0.0}
''';

load jsonStr.`jsonStr1` as data;
save errorIfExists data as json.`/tmp/jack` where fileNum="10";


append Test Case

set jsonStr1='''
{"features":[5.1,3.5,1.4,0.2],"label":0.0},

''';

load jsonStr.`jsonStr1` as data;
save append data as json.`/tmp/jack`;
load json.`/tmp/jack` as jackData;
select count(1) from jackData as output;


hellozepp • Feb 28 '23 15:02