JSqlParser
JSqlParser copied to clipboard
ParseException due to ARRAY_AGG() and SAFE_CAST()
Hi,
We recently found this ParseException due to the use of ARRAY_AGG() and SAFE_CAST() from the built-in functions provided by Google BigQuery. However, when I tried to validate the SQL syntax, I got this error:
ParseException: Cannot parse statement: Encountered unexpected token: "(" "(" at line 1, column 17.
Was expecting one of:
"&"
"::"
";"
"<<"
">>"
"ACTION"
"ACTIVE"
"ALGORITHM"
"ARCHIVE"
"ARRAY"
"AS"
"AT"
"BYTE"
"CASCADE"
"CASE"
"CAST"
"CHANGE"
"CHAR"
"CHARACTER"
"CHECKPOINT"
"COLLATE"
"COLUMN"
"COLUMNS"
"COMMENT"
"COMMIT"
"CONNECT"
"COSTS"
"CYCLE"
"DBA_RECYCLEBIN"
"DEFAULT"
"DESC"
"DESCRIBE"
"DISABLE"
"DISCONNECT"
"DIV"
"DO"
"DUMP"
"DUPLICATE"
"EMIT"
"ENABLE"
"END"
"EXCLUDE"
"EXTRACT"
"FALSE"
"FILTER"
"FIRST"
"FLUSH"
"FN"
"FOLLOWING"
"FORMAT"
"FULLTEXT"
"GROUP"
"HAVING"
"HISTORY"
"INDEX"
"INSERT"
"INTERVAL"
"INTO"
"ISNULL"
"JSON"
"KEY"
"LAST"
"LEADING"
"LINK"
"LOCAL"
"LOG"
"MATERIALIZED"
"NO"
"NOLOCK"
"NULLS"
"OF"
"OPEN"
"OVER"
"PARALLEL"
"PARTITION"
"PATH"
"PERCENT"
"PRECISION"
"PRIMARY"
"PRIOR"
"QUERY"
"QUIESCE"
"RANGE"
"READ"
"RECYCLEBIN"
"REGISTER"
"REPLACE"
"RESTRICTED"
"RESUME"
"ROW"
"ROWS"
"SCHEMA"
"SEPARATOR"
"SEQUENCE"
"SESSION"
"SHUTDOWN"
"SIBLINGS"
"SIGNED"
"SIZE"
"SKIP"
"START"
"SUSPEND"
"SWITCH"
"SYNONYM"
"SYSTEM"
"TABLE"
"TABLESPACE"
"TEMP"
"TEMPORARY"
"TIMEOUT"
"TO"
"TOP"
"TRUE"
"TRUNCATE"
"TRY_CAST"
"TYPE"
"UNQIESCE"
"UNSIGNED"
"USER"
"VALIDATE"
"VALUE"
"VALUES"
"VIEW"
"WINDOW"
"XML"
"ZONE"
"["
"^"
"|"
<EOF>
<K_DATETIMELITERAL>
<K_DATE_LITERAL>
<K_NEXTVAL>
<K_STRING_FUNCTION_NAME>
<S_CHAR_LITERAL>
<S_IDENTIFIER>
<S_QUOTED_IDENTIFIER>
The SQL statement looks like this:
select array_agg(safe_cast(n as Float64)) as temp from UNNEST(SPLIT("0,0,0,0,0",",")) as n
We are wondering is there a way to allow ARRAY_AGG() and SAFE_CAST() without throwing exceptions?
Thank you very much in advance, Tianqi
Greetings,
The problem is about SAFE_CAST(x AS y), which is not a regular function but a special expression.
We will need to extend the CAST() expression accordingly.
Greetings,
Thanks a lot for quickly identifying the root cause of the problem. Just would like to follow up with you and see when approximately will this change be applied / released? Any details would be greatly appreciated.
Best regards, Tianqi
Hi @manticore-projects , not only SAFE_CAST, but also ARRAY_AGG can not be fully correctly parsed.
failed cases:
ARRAY_AGG(x LMIT 10)ARRAY_AGG( x IGNORE NULLS ORDER BY x)ARRAY_AGG( DISTINCT x)Google ARRAY_AGG references: https://cloud.google.com/bigquery/docs/reference/standard-sql/aggregate_functions#array_agg JsqlParser Version: 4.5
1. `ARRAY_AGG(x LMIT 10)` 2. `ARRAY_AGG( x IGNORE NULLS ORDER BY x)` 3. `ARRAY_AGG( DISTINCT x)` Google ARRAY_AGG references: https://cloud.google.com/bigquery/docs/reference/standard-sql/aggregate_functions#array_agg
Certainly, because all of those are mot standard compliant, but Google BigQuery specific. You will need to sponsor or provide an implementation for such special function parameters.
@manticore-projects Thanks, I add SAFE_CAST support and create a PR.
Do I get this right, that this issue is not yet complete by the PR?
@wumpz Yes, It only solves SAFE_CAST related problem, ARRAY_AGG is not included in this PR.
Continued as Feature request under #1856