databend
bug: invalid unicode code point when loading files dumped in NDJSON format
Loading JSON that was dumped from the hits table (itself loaded from hits.csv.gz) fails with the error "invalid unicode code point at line 1 column 256". Both the old and the new NDJSON loaders are affected. Steps to reproduce:
COPY /*+set_var(enable_new_copy_for_text_formats=1) */ INTO yxf.hits2 FROM 's3://clickhouse-public-datasets/hits_compatible/hits.csv.gz' FILE_FORMAT = (TYPE = 'CSV', COMPRESSION = AUTO);
copy into @json_stage/v1/ from hits2;
copy /*+set_var(enable_new_copy_for_text_formats=0) */ into hits_json from @json_stage/v1/;
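To narrow down which dumped rows trigger the error, a small offline check of the staged files may help. The following is only a diagnostic sketch, not part of Databend: the function name `find_bad_ndjson_lines` and its `limit` parameter are hypothetical, and it assumes the dumped NDJSON file has been downloaded locally and decompressed. It flags lines that contain bytes which are not valid UTF-8 (one plausible cause of this error, not a confirmed one) and lines that fail to parse as JSON. Note that Python's `json` module accepts lone surrogate escapes such as `\ud800`, so that case is not covered here.

```python
import json


def find_bad_ndjson_lines(path, limit=10):
    """Return up to `limit` tuples (line_number, column, reason) for bad lines.

    `column` is a 1-based byte offset for UTF-8 errors and the 1-based
    character column reported by the json module for syntax errors.
    """
    problems = []
    with open(path, "rb") as f:
        for lineno, raw in enumerate(f, start=1):
            if not raw.strip():
                continue  # ignore blank lines
            try:
                # Strict decoding rejects byte sequences that are not valid UTF-8.
                text = raw.decode("utf-8")
            except UnicodeDecodeError as e:
                problems.append((lineno, e.start + 1, "invalid UTF-8 byte"))
            else:
                try:
                    json.loads(text)
                except json.JSONDecodeError as e:
                    problems.append((lineno, e.colno, e.msg))
            if len(problems) >= limit:
                break
    return problems
```

Running it against a file from the stage, for example `find_bad_ndjson_lines("part-0.ndjson")` (a placeholder file name), would list the first few offending lines so the corresponding source rows in hits2 can be inspected.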
Originally posted by @youngsofun in https://github.com/datafuselabs/databend/issues/14943#issuecomment-2025051348