starrocks
starrocks copied to clipboard
StarRocks export data to csv format error
Steps to reproduce the behavior (Required)
1. Objective
I want to import data from StarRocks into PostgreSQL, so I have taken the following steps:
-
Export the data from StarRocks in CSV format to S3, there are two approaches:
-
Use AWS PostgreSQL to import the CSV data from S3 into the database, the approach is as follows:
2. Basic Data
My data contains several VARCHAR fields, where the data stored includes common delimiters such as |, , and # within the strings.
Table structure:
CREATE TABLE test_table
(
id INT NULL,
lrn VARCHAR(255) NULL,
awb VARCHAR(255) NULL,
tracking_number VARCHAR(255) NULL,
recipient VARCHAR(65533) NULL,
sender VARCHAR(65533) NULL,
currency VARCHAR(255) NULL,
item VARCHAR(65533) NULL,
container VARCHAR(255) NULL,
created_at DATETIME NULL,
updated_at DATETIME NULL,
ioss VARCHAR(255) NULL,
parcel_no VARCHAR(255) NULL,
vat_number VARCHAR(255) NULL,
origin_ioss VARCHAR(255) NULL,
update_ioss_at DATETIME NULL,
archived TINYINT(1) NULL,
transport_cost DECIMAL(9, 2) NULL
)
Data as follows:
49321237,
FTLaaaaaaa256993,
235-111111934,
6G58638256993,
{"city": "| COI****THIER", "name": "an****HER", "address": "798 **** Mo", "country": "FR", "zipcode": "7***0", "address2": "", "cityCode": "73"},
{"city": "BEIJING", "name": "BEIJ *** LTD", "address": "R *** HENG", "country": "CN", "zipcode": "1***23", "address2": "SHI***R", "cityCode": "10"},
EUR,
{"count": 1, "value": 0.58, "hsCode": "6***9", "modrep": 3, "nbrart": 1, "nbrcol": 1, "weight": 0.187, "opedest": "O***ONNEL", "currency": "EUR", "describe": "CLOTHES", "nattrans": 11, "packagingType": ""},
FRFR1111111156,
2023-01-01 04:03:11.311006,
2023-01-01 04:03:11.311006,
\N,
\N,
\N,
\N,
\N,
\N,
false,
\N
4. EXPORT Plan
Export SQL
EXPORT TABLE test_table
PARTITION ('p202212')
TO "s3://test_bucket/**/s1/"
PROPERTIES
(
"column_separator"=","
)
WITH BROKER
(
"aws.s3.access_key" = "AAAAAAAAAAAA",
"aws.s3.secret_key" = "KKKKKKKKKKKKKKKKKKKKKKKK",
"aws.s3.region" = "eu-west-1"
);
CSV data
49318060,FTL611111151839,157-1111113,6G58111111119,{"city": "BaaaaST", "name": "ARNaaaaRY", "address": "5 SaaaaN", "country": "FR", "zipcode": "21110", "address2": "", "cityCode": "29"},{"city": "SHaaaaHAI", "name": "SHANaaaaLTD", "address": "Pudoaaaat", "country": "CN", "zipcode": "211115", "address2": "", "cityCode": "20"},EUR,{"count": 1, "value": 24.89, "hsCode": "3111110", "modrep": 3, "nbrart": 1, "nbrcol": 1, "weight": 0.624, "opedest": "OCaaaaaNEL", "currency": "EUR", "describe": "Diaaaaent", "nattrans": 11, "packagingType": ""},YWFTL11111111103001,2022-12-31 10:47:50.911922,2022-12-31 10:47:50.911922,R8j5uaiaaaaaaaaaHBFw==,\N,\N,\N,\N,true,0.00
The JSON data in the CSV contains ,, it should be enclosed with ":
"{""city"": ""BaaaaST"", ""name"": ""ARNaaaaRY""}"
5. INSERT INTO FILES Plan
sql
INSERT INTO
FILES(
"path" = "s3://test_bucket/**/insert",
"format" = "csv",
"csv.column_separator"=",",
"csv.enclose"='"',
"csv.escape"="\\",
"compression" = "uncompressed",
"target_max_file_size" = "1024", -- 1KB
"aws.s3.access_key" = "AKIAaaaaaaaaSMKS",
"aws.s3.secret_key" = "TpiojLaaaaaaaaaaaauDDaYZr",
"aws.s3.region" = "eu-west-1"
)
SELECT * FROM test_table LIMIT 1000;
csv data
49318068,FTL61111541685,157-1111113,6G511111111185,{"city": "ME11C", "name": "CaaETTE", "address": "10aa rige", "country": "FR", "zipcode": "11110", "address2": "", "cityCode": "19"},{"city": "SH11AI", "name": "SHAN11SS CO.LTD", "address": "P11 ort", "country": "CN", "zipcode": "2115", "address2": "", "cityCode": "20"},EUR,{"count": 1, "value": 14.09, "hsCode": "8111", "modrep": 3, "nbrart": 1, "nbrcol": 1, "weight": 0.34, "opedest": "OCC111EL", "currency": "EUR", "describe": "Camping light", "nattrans": 11, "packagingType": ""},YW11113058,2022-12-31 10:47:50.911922,2022-12-31 10:47:50.911922,GUub04OA1111E51kG+p2Sg==,\N,\N,\N,\N,true,0.00
The JSON data in the above CSV still appears incorrectly
Expected behavior (Required)
the csv export data should be like this:
49318048,FTL6G58753149446,157-87924513,6G58753149446,"{""city"": ""BaaaaST"", ""name"": ""ARNaaaaRY""}",\N,....
Real behavior (Required)
but the csv data is this:
49318060,FTL611111151839,157-1111113,6G58111111119,{"city": "BaaaaST", "name": "ARNaaaaRY"},\N,....
StarRocks version (Required)
3.3.4