datacontract-cli

Import: No support for AWS Athena (Trino) DDLs

Open roykoand opened this issue 1 year ago • 4 comments

This is not an issue in this project itself but in the underlying dependency, simple_ddl_parser (https://github.com/xnuinside/simple-ddl-parser).

It does not support DDLs generated by AWS Athena (SHOW CREATE TABLE).

Using this DDL as an example:

CREATE EXTERNAL TABLE `database`.`table` (
    column1 string,
    column2 string
)
PARTITIONED BY
(
    column3 integer
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
ESCAPED BY '\\'
LINES TERMINATED BY '\n'
STORED AS TEXTFILE
LOCATION 's3://somewhere-in-s3/prefix1'
TBLPROPERTIES (
  'parquet.compression'='GZIP'
)

$ datacontract import --format sql --source aws_athena_ddl.sql
...
DDLParserError: Unknown symbol "'"

If you delete everything except the column definitions, the import runs but the output is still invalid (the partition column column3 is missing and the model name keeps its backticks):

CREATE EXTERNAL TABLE `database`.`table` (
    column1 string,
    column2 string
)
PARTITIONED BY
(
    column3 integer
)

$ datacontract import --format sql --source aws_athena_ddl.sql
dataContractSpecification: 0.9.3
id: my-data-contract-id
info:
  title: My Data Contract
  version: 0.0.1
models:
  '`table`':
    type: table
    fields:
      column1:
        type: string
      column2:
        type: string
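
Until the parser handles this, one possible workaround is to normalize the DDL before importing it. The sketch below is only an illustration under stated assumptions: `normalize_athena_ddl` is a hypothetical helper (not part of datacontract-cli or simple_ddl_parser), it assumes column types without nested parentheses (so no `decimal(10,2)`), and it simply drops the storage clauses. Names that collide with reserved words may still need different handling once the backticks are gone.

```python
import re


def normalize_athena_ddl(ddl: str) -> str:
    """Hypothetical helper: rewrite an Athena DDL as a plain CREATE TABLE.

    Assumes column lists contain no nested parentheses; raises otherwise.
    """
    # Drop backticks so they are not carried into the model name.
    ddl = ddl.replace("`", "")
    # Treat the external table as a regular table.
    ddl = re.sub(r"CREATE\s+EXTERNAL\s+TABLE", "CREATE TABLE", ddl, flags=re.IGNORECASE)

    # Table name plus the main column list.
    head = re.search(r"(CREATE\s+TABLE\s+[\w.]+)\s*\(([^()]*)\)", ddl, flags=re.IGNORECASE)
    if head is None:
        raise ValueError("no simple CREATE TABLE column list found")
    columns = [c.strip() for c in head.group(2).split(",") if c.strip()]

    # Fold partition columns into the regular column list so they are not lost.
    part = re.search(r"PARTITIONED\s+BY\s*\(([^()]*)\)", ddl, flags=re.IGNORECASE)
    if part is not None:
        columns += [c.strip() for c in part.group(1).split(",") if c.strip()]

    # Everything after the column lists (ROW FORMAT, LOCATION, ...) is discarded.
    body = ",\n    ".join(columns)
    return f"{head.group(1)} (\n    {body}\n);"


if __name__ == "__main__":
    with open("aws_athena_ddl.sql") as f:
        print(normalize_athena_ddl(f.read()))
```

The result could be written to a new file and passed to `datacontract import --format sql --source ...` as before. Whether the parser then accepts it has not been verified here.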

roykoand • Jul 17 '24 11:07

Thanks for reporting. I think the best way is to open an issue (and maybe even a PR) at simple_ddl_parser.

@roykoand could you do so?

jochenchrist • Jul 18 '24 15:07

@jochenchrist Sure! Just created a feature request in their repo: https://github.com/xnuinside/simple-ddl-parser/issues/272

roykoand • Jul 26 '24 15:07

FYI: this was fixed in simple-ddl-parser version 1.6.0.
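
To check whether a local environment already has that release, something like the following could be used. This is a minimal sketch: `parse_release` and `meets_minimum` are hypothetical helper names, the version parsing is deliberately simplistic (it ignores suffixes such as `rc1`), and it assumes the package is installed under the distribution name `simple-ddl-parser`.

```python
from importlib.metadata import PackageNotFoundError, version

# Release named in the comment above as containing the fix.
MIN_VERSION = (1, 6, 0)


def parse_release(v: str) -> tuple:
    """Turn '1.6.0' into (1, 6, 0), keeping only leading digits of each part."""
    parts = []
    for piece in v.split(".")[:3]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        parts.append(int(digits) if digits else 0)
    # Pad short versions such as '1.6' to three components.
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def meets_minimum(installed: str, minimum: tuple = MIN_VERSION) -> bool:
    """True if the installed version string is at least the minimum release."""
    return parse_release(installed) >= minimum


if __name__ == "__main__":
    try:
        installed = version("simple-ddl-parser")
    except PackageNotFoundError:
        print("simple-ddl-parser is not installed")
    else:
        status = "at least 1.6.0" if meets_minimum(installed) else "older than 1.6.0"
        print(f"simple-ddl-parser {installed} is {status}")
```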

xnuinside • Aug 13 '24 10:08

Merged #372

@roykoand Could you test with the current main version to see whether this solves your issue?

jochenchrist • Aug 13 '24 13:08