spark ddl with struct data type error

Open MasonMa-sy opened this issue 6 months ago • 3 comments

Describe the bug: When I execute show create table a in Spark, the DDL it returns is:

CREATE TABLE t_a (
a bigint,
b struct <c: bigint>
)

The struct field has a ':' between the field name and the field type. This colon is optional in Spark.

Actual behavior: I pasted this SQL into the web demo and got an error:

An Unexpected Error Occurred
Parse error: Unexpected ": bigint> " at line 3 column 12. SQL dialect used: "spark".

Please report this at [Github issues page.](https://github.com/sql-formatter-org/sql-formatter/issues)

Stack Trace:

Error: Parse error: Unexpected ": bigint>
" at line 3 column 12.
SQL dialect used: "spark".
    at T.default.createParseError (https://unpkg.com/sql-formatter@latest/dist/sql-formatter.min.js:1:235390)
    at T.default.tokenize (https://unpkg.com/sql-formatter@latest/dist/sql-formatter.min.js:1:235156)
    at T.default.tokenize (https://unpkg.com/sql-formatter@latest/dist/sql-formatter.min.js:1:230135)
    at T.default.tokenize (https://unpkg.com/sql-formatter@latest/dist/sql-formatter.min.js:1:245337)
    at T.default.reset (https://unpkg.com/sql-formatter@latest/dist/sql-formatter.min.js:1:243864)
    at T.O.feed (https://unpkg.com/sql-formatter@latest/dist/sql-formatter.min.js:1:270080)
    at Object.parse (https://unpkg.com/sql-formatter@latest/dist/sql-formatter.min.js:1:245475)
    at T.default.parse (https://unpkg.com/sql-formatter@latest/dist/sql-formatter.min.js:1:16428)
    at T.default.format (https://unpkg.com/sql-formatter@latest/dist/sql-formatter.min.js:1:16326)
    at T.formatDialect (https://unpkg.com/sql-formatter@latest/dist/sql-formatter.min.js:1:264877)
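
For reference, the reported position (line 3, column 12) appears to be the ':' itself. A minimal sketch that checks this, assuming 1-based line and column numbers and the DDL exactly as pasted above with no leading indentation; `positionOf` is a hypothetical helper written for this check, not part of sql-formatter:

```typescript
// Hypothetical helper (not part of sql-formatter): returns the 1-based
// line and column of the first occurrence of `needle` in `text`,
// or null when `needle` does not occur.
function positionOf(
  text: string,
  needle: string
): { line: number; column: number } | null {
  const offset = text.indexOf(needle);
  if (offset === -1) return null;
  const before = text.slice(0, offset);
  // Number of lines up to and including the one containing the match.
  const line = before.split("\n").length;
  // Distance from the start of that line, converted to 1-based.
  const column = offset - (before.lastIndexOf("\n") + 1) + 1;
  return { line, column };
}

// The DDL from the report, assumed to have no leading whitespace.
const ddl = [
  "CREATE TABLE t_a (",
  "a bigint,",
  "b struct <c: bigint>",
  ")",
].join("\n");

const colon = positionOf(ddl, ":");
console.log(colon);
```

So the tokenizer seems to stop at the colon rather than at the struct keyword or the angle brackets.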

MasonMa-sy avatar Sep 09 '25 07:09 MasonMa-sy

Thanks for reporting.

Could you point me to Spark documentation for this syntax?

nene avatar Sep 09 '25 13:09 nene

> Thanks for reporting.
>
> Could you point me to Spark documentation for this syntax?

Thanks for the reply.

https://spark.apache.org/docs/latest/sql-ref-datatypes.html describes Spark's data types. The SQL tab there lists the SQL name of StructType:

| Data type | SQL name |
| --- | --- |
| StructType | STRUCT<field1_name: field1_type, field2_name: field2_type, …> Note: ‘:’ is optional. |

There is also an example CREATE TABLE statement at https://spark.apache.org/docs/latest/sql-ref-syntax-ddl-create-table-hiveformat.html

--Use complex datatype
CREATE EXTERNAL TABLE family(
        name STRING,
        friends ARRAY<STRING>,
        children MAP<STRING, INT>,
        address STRUCT<street: STRING, city: STRING>
    )
    ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' ESCAPED BY '\\'
    COLLECTION ITEMS TERMINATED BY '_'
    MAP KEYS TERMINATED BY ':'
    LINES TERMINATED BY '\n'
    NULL DEFINED AS 'foonull'
    STORED AS TEXTFILE
    LOCATION '/tmp/family/';

MasonMa-sy avatar Oct 10 '25 03:10 MasonMa-sy

Thanks for the info.

Looks like this is similar to the BigQuery STRUCT syntax, except that BigQuery doesn't support the colon.

ARRAY<T> is the same as in BigQuery, and then there's the MAP<K, V> which is not supported by BigQuery.
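
To make the comparison concrete, here is a minimal sketch of a parser for these three parameterized types, with the colon in STRUCT fields treated as optional. It is a standalone illustration, not sql-formatter's actual tokenizer or grammar: `TypeNode`, `tokenize` and `parseType` are hypothetical names, and it ignores quoted identifiers, comments and parameters such as DECIMAL(10, 2).

```typescript
// Standalone illustration; not sql-formatter's implementation.
type TypeNode =
  | { kind: "simple"; name: string }
  | { kind: "array"; element: TypeNode }
  | { kind: "map"; key: TypeNode; value: TypeNode }
  | { kind: "struct"; fields: { name: string; type: TypeNode }[] };

// Split the input into identifiers and the punctuation < > , :
// (anything else, including whitespace, is skipped in this sketch).
function tokenize(src: string): string[] {
  return src.match(/[A-Za-z_][A-Za-z0-9_]*|[<>,:]/g) ?? [];
}

function parseType(src: string): TypeNode {
  const tokens = tokenize(src);
  let pos = 0;

  const peek = (): string | undefined => tokens[pos];
  const next = (): string => {
    const tok = tokens[pos++];
    if (tok === undefined) throw new Error("Unexpected end of input");
    return tok;
  };
  const expect = (tok: string): void => {
    const got = next();
    if (got !== tok) throw new Error(`Expected "${tok}" but got "${got}"`);
  };

  function parseOne(): TypeNode {
    const raw = next();
    if (!/^[A-Za-z_]/.test(raw)) throw new Error(`Unexpected "${raw}"`);
    const name = raw.toUpperCase();
    if (name === "ARRAY" && peek() === "<") {
      expect("<");
      const element = parseOne();
      expect(">");
      return { kind: "array", element };
    }
    if (name === "MAP" && peek() === "<") {
      expect("<");
      const key = parseOne();
      expect(",");
      const value = parseOne();
      expect(">");
      return { kind: "map", key, value };
    }
    if (name === "STRUCT" && peek() === "<") {
      expect("<");
      const fields: { name: string; type: TypeNode }[] = [];
      do {
        const fieldName = next();
        if (peek() === ":") next(); // the colon is optional in Spark
        fields.push({ name: fieldName, type: parseOne() });
      } while (peek() === "," && next()); // consume "," and read another field
      expect(">");
      return { kind: "struct", fields };
    }
    return { kind: "simple", name };
  }

  const result = parseOne();
  if (pos !== tokens.length) throw new Error(`Unexpected "${tokens[pos]}"`);
  return result;
}

// The colon and no-colon spellings should parse to the same tree.
const withColon = parseType("struct <c: bigint>");
const withoutColon = parseType("STRUCT<c BIGINT>");
console.log(JSON.stringify(withColon) === JSON.stringify(withoutColon));
```

Treating the colon as an optional token inside STRUCT<...> keeps the BigQuery-style spelling working, while MAP<K, V> needs its own two-parameter rule.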

nene avatar Oct 10 '25 13:10 nene