feat: COPY formatTypeOptions support FIELD_OPTIONALLY_ENCLOSED_BY

Open BohuTANG opened this issue 3 years ago • 2 comments

Summary

formatTypeOptions ::=
  RECORD_DELIMITER = '<character>'
  FIELD_DELIMITER = '<character>'
  SKIP_HEADER = <integer>
  COMPRESSION = AUTO | GZIP | BZ2 | BROTLI | ZSTD | DEFLATE | RAW_DEFLATE | NONE
  FIELD_OPTIONALLY_ENCLOSED_BY = '<character>' | NONE

For example, if the value is A "B" C, escape the double quotes as follows: A ""B"" C
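The doubling rule above can be illustrated with Python's standard `csv` module. This is only a sketch of the quoting convention, not Databend's or Snowflake's parser; the helper names `to_csv_line` and `from_csv_line` are made up for this example.

```python
import csv
import io


def to_csv_line(fields, quotechar='"'):
    """Serialize one row, enclosing a field only when it needs it.

    With doublequote=True, a quote character inside a field is escaped
    by doubling it, which is the convention described above.
    """
    buf = io.StringIO()
    writer = csv.writer(
        buf,
        quotechar=quotechar,
        quoting=csv.QUOTE_MINIMAL,  # "optionally enclosed": quote only if required
        doublequote=True,
        lineterminator="\n",
    )
    writer.writerow(fields)
    return buf.getvalue()


def from_csv_line(line, quotechar='"'):
    """Parse one row, undoing the enclosing quotes and the doubling."""
    reader = csv.reader(io.StringIO(line), quotechar=quotechar, doublequote=True)
    return next(reader)


line = to_csv_line([1, 'A "B" C', "x,y"])
print(line, end="")         # 1,"A ""B"" C","x,y"
print(from_csv_line(line))  # ['1', 'A "B" C', 'x,y']
```

Only the fields that contain the quote character or the delimiter get enclosed; the plain field `1` is written as-is.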

Use case:

Does not work in Databend:

COPY INTO hits.hits FROM 's3://clickhouse-public-datasets/hits_compatible/hits.csv.gz' FILE_FORMAT = (TYPE = 'CSV' field_delimiter=',' COMPRESSION = GZIP skip_header=1);

Works in Snowflake:

COPY INTO hits.public.hits2 FROM 's3://clickhouse-public-datasets/hits_compatible/hits.csv.gz' FILE_FORMAT = (TYPE = CSV COMPRESSION = GZIP FIELD_OPTIONALLY_ENCLOSED_BY = '"');

BohuTANG, Nov 05 '22 02:11

cc @youngsofun

BohuTANG, Nov 05 '22 02:11

The error is unexpected: CSV should handle this by default, without requiring FIELD_OPTIONALLY_ENCLOSED_BY. It takes too long to download https://datasets.clickhouse.com/hits_compatible/hits.csv.gz.

Is there a file fragment that reproduces the problem?

youngsofun, Nov 05 '22 11:11
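One possible way to get such a fragment without downloading the whole archive is to decompress only the first lines of the gzip stream. The sketch below assumes the offending rows are near the top of the file, which the thread does not confirm; `gzip_head` is a hypothetical helper name, and the commented usage assumes the server allows streaming reads.

```python
import gzip
import io
import itertools


def gzip_head(stream, n_lines):
    """Return the first n_lines decompressed lines of a gzip byte stream.

    Only as much of the stream as is needed for those lines is read,
    so the rest of a large archive is never decompressed.
    """
    with gzip.GzipFile(fileobj=stream) as gz:
        return b"".join(itertools.islice(gz, n_lines))


# Self-contained demonstration on in-memory data.
sample = gzip.compress(b'id,title\n1,"A ""B"" C"\n2,plain\n')
print(gzip_head(io.BytesIO(sample), 2).decode())

# Possible usage against the real file (not run here):
#   import urllib.request
#   url = "https://datasets.clickhouse.com/hits_compatible/hits.csv.gz"
#   with urllib.request.urlopen(url) as resp, open("hits_head.csv", "wb") as out:
#       out.write(gzip_head(resp, 1000))
```

If the first lines load cleanly, a larger `n_lines` or a search for rows containing `""` may be needed to find a reproducing fragment.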