Trailing `'` remains after loading data from a CSV into StarRocks
I'm trying to load data from a CSV into StarRocks; one of the rows is shown below. After loading, I find that the data in StarRocks keeps the trailing '.
/wp-admin/edit.php?post_type=post,'[{"support": 1.0, "item": 23802228611909434500675223637494144312}, {"support": 1.0, "item": 174472487042906779548867851357749048616}, {"support": 0.967741935483871, "item": 218473117571251542901396678363898265937}, {"support": 0.967741935483871, "item": 231554223331928306785505632432118524085}, {"support": 0.9354838709677419, "item": 251450111408553977209416691192262766043}, {"support": 0.8709677419354839, "item": 255844660388155386531595464064842665953}, {"support": 0.8709677419354839, "item": 43431313178276204057415236603873012077}, {"support": 0.8387096774193549, "item": 134183800354940827219830043105011992932}, {"support": 0.8064516129032258, "item": 124252827735266096217734490785960544895}, {"support": 0.7741935483870968, "item": 308786225706730603494773708575056993695}, {"support": 0.7741935483870968, "item": 107574640019033010051536354268230337001}, {"support": 0.7419354838709677, "item": 257105383019231995168090203574525649981}, {"support": 0.7419354838709677, "item": 55942444917999341049039493638467366617}, {"support": 0.7419354838709677, "item": 241284497872922364543607046042548588103}, {"support": 0.7096774193548387, "item": 224156408131500668504211608160659624317}, {"support": 0.6774193548387096, "item": 178058453208626738767032585709146263373}, {"support": 0.7419354838709677, "item": 332783421630035866156352528202461149697}]'
Steps to reproduce the behavior (Required)
- create table

create table test.frequent_itemset_str2 (
    id bigint not null AUTO_INCREMENT,
    url STRING not NULL,
    itemsets STRING not NULL
) ENGINE = olap
PRIMARY KEY (id);

- import data

curl --location-trusted -u root \
    -T ./test.csv \
    -H "column_separator:," \
    -H "skip_header:1" \
    -H "enclose:'" \
    -H "max_filter_ratio:1" \
    -H "columns: url, itemsets" \
    -XPUT http://127.0.0.1:8030/api/test/frequent_itemset_str/_stream_load

- query

SELECT id, url, parse_json(itemsets) FROM deepflow.frequent_itemset_str;

id |url                                               |parse_json(itemsets)|
---+--------------------------------------------------+--------------------+
609|/wp-admin/edit.php?post_type=post&trashed=1&ids=53|                    |
610|/wp-admin/edit.php?post_type=post&trashed=1&ids=65|                    |
608|/wp-admin/edit.php?post_type=post&author=1        |                    |

The contents of the itemsets column should be JSON, but they fail to transform because of the trailing '.
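The failure is reproducible outside StarRocks: with the stray enclose character left in place, the stored value is no longer valid JSON, which is why parse_json returns NULL. A minimal Python analogue (the sample string is shortened for illustration):

```python
import json

# A shortened stand-in for the itemsets value as it lands in the table,
# with the trailing enclose character the loader failed to strip.
bad = '[{"support": 1.0, "item": 23802228611909434500675223637494144312}]\''

try:
    json.loads(bad)
    parsed = True
except json.JSONDecodeError:
    parsed = False  # the stray quote makes the value invalid JSON

print(parsed)  # False: this mirrors parse_json() returning NULL

# Stripping the stray trailing quote restores valid JSON
fixed = json.loads(bad.rstrip("'"))
print(fixed[0]["support"])  # 1.0
```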
Expected behavior (Required)
Real behavior (Required)
StarRocks version (Required)
3.3.0-19a3f66
A simple example
The csv file & the load scripts
Query table
@jaogoy
We have marked this issue as stale because it has been inactive for 6 months. If this issue is still relevant, removing the stale label or adding a comment will keep it active. Otherwise, we'll close it in 10 days to keep the issue queue tidy. Thank you for your contribution to StarRocks!
@asdfsx If the data is correct and you are sure the enclose is not being processed correctly, then it should be a bug in the enclose handling.
cc @wyb
I've almost forgotten everything about this issue, since so much time has passed. I think the data is correct, but I don't have an environment to reproduce the problem right now. Reproducing it should be easy, though. If you find this isn't a bug, or it has already been fixed, I can close the issue. @jaogoy
@wyb can you check whether this is a bug in the enclose handling?
I ran into the same issue. After some investigation, I found that the problem only occurs when the CSV file uses Windows line endings (CR+LF).
If the file uses Linux line endings (LF), it works fine. It seems to be linked to the end-of-line character.
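A possible workaround, assuming the CR+LF line ending really is the root cause, is to normalize the file to LF before streaming it to StarRocks. A minimal sketch in Python (the sample row is illustrative, not from the original file):

```python
def to_unix_newlines(data: bytes) -> bytes:
    """Convert Windows (CR+LF) line endings to Unix (LF)."""
    return data.replace(b"\r\n", b"\n")

# Example: a single-quote-enclosed CSV row with a Windows-style ending.
# With CR+LF, the byte before the newline is '\r', not the closing quote,
# which may be why the enclose character is left behind.
row = b"/wp-admin/edit.php,'[{\"support\": 1.0}]'\r\n"
print(to_unix_newlines(row) == b"/wp-admin/edit.php,'[{\"support\": 1.0}]'\n")  # True
```

Equivalently, running the file through a tool such as `dos2unix` before the curl upload should have the same effect.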