Encountering issues while transferring data into StarRocks from a CSV file. The data transfer type is stream_load.
Issue Description
privateIp: FE_Url
Description of the Issue:
We created a Parquet/CSV file from SQL Server. When inserting the data into StarRocks, we encounter an error.
SQL table structure:
id | productid | productName
1 | 42 | Cookie
2 | 43 | Ice cream, Frozen Desert
CSV file structure:
1,42,Cookie
2,43,"Ice cream, Frozen Desert"
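The second CSV row contains a comma inside a quoted field. As a rough illustration of why that matters, the sketch below counts the fields of that row in two ways: splitting on every comma, and treating commas inside double quotes as data. The function names `naive_field_count` and `quoted_field_count` are hypothetical and exist only for this example; the quote handling is simplified and does not cover escaped quotes.

```shell
row='2,43,"Ice cream, Frozen Desert"'

# Naive split: every comma is a separator, quotes are ignored.
naive_field_count() {
  printf '%s\n' "$1" | awk -F',' '{ print NF }'
}

# Quote-aware count: commas inside double quotes do not separate fields.
quoted_field_count() {
  printf '%s\n' "$1" | awk '{
    n = 1; in_quotes = 0
    for (i = 1; i <= length($0); i++) {
      c = substr($0, i, 1)
      if (c == "\"") in_quotes = !in_quotes
      else if (c == "," && !in_quotes) n++
    }
    print n
  }'
}

echo "naive: $(naive_field_count "$row")"        # naive: 4
echo "quote-aware: $(quoted_field_count "$row")" # quote-aware: 3
```

A loader that splits on the separator without honoring quotes would therefore see four values for a three-column table.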
Sling version 1.2.11
Operating System linux
Replication Configuration:
Command I am using to insert data into StarRocks: `#!/bin/bash
# Define the log file path
LOGFILE="/opt/sling/slingDataTranfserlog_19_06_24.log"
# Iterate over each .csv file in the directory
for file in /opt/sling/csvnew/part.01.0460.csv; do
  echo "Uploading $file..." >> "$LOGFILE" 2>&1
  # Use curl to upload the file and capture the response
  response=$(curl --location-trusted -u 'user:password' \
    -H "Expect: 100-continue" \
    -H "column_separator: ," \
    -H "columns: id,product,productName" \
    -H "skip_header: 1" \
    -T "$file" \
    -X PUT \
    http://privateIP:8030/api/DatabaseName/TableName/_stream_load 2>&1)
  # Log the response
  echo "$response" >> "$LOGFILE" 2>&1
done`
streams: Stream
source: CSV file (created from SQL Server)
target: StarRocks
streams:
...
- Log Output (please run command with -d):
{
"TxnId": 1386825,
"Label": "37776833-c498-41e9-aa4e-2c81dec9eb33",
"Status": "Fail",
"Message": "too many filtered rows",
"NumberTotalRows": 100000,
"NumberLoadedRows": 99754,
"NumberFilteredRows": 246,
"NumberUnselectedRows": 0,
"LoadBytes": 5018672,
"LoadTimeMs": 243,
"BeginTxnTimeMs": 1,
"StreamLoadPlanTimeMs": 2,
"ReadDataTimeMs": 1,
"WriteDataTimeMs": 239,
"CommitAndPublishTimeMs": 0,
"ErrorURL": "http://privateIp:8040/api/_load_error_log?file=error_log_c244f8c539780e6f_8a3e7085f85dc593"
}
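The rejected rows are listed at the `ErrorURL` in the response. A minimal sketch of pulling that URL out of the logged response is shown here, assuming the response keeps one JSON field per line as above. `extract_error_url` is a hypothetical helper name, and the sample response is trimmed to a few fields.

```shell
# Hypothetical helper: print the ErrorURL field of a Stream Load JSON response.
extract_error_url() {
  printf '%s\n' "$1" | sed -n 's/.*"ErrorURL": *"\([^"]*\)".*/\1/p'
}

# Trimmed copy of the response logged above.
response='{
"Status": "Fail",
"Message": "too many filtered rows",
"ErrorURL": "http://privateIp:8040/api/_load_error_log?file=error_log_c244f8c539780e6f_8a3e7085f85dc593"
}'

error_url=$(extract_error_url "$response")
echo "$error_url"
# To view the rejected rows, fetch the URL, for example: curl "$error_url"
```

The helper prints nothing when the response has no `ErrorURL` field, so it can be called on every logged response.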
Error: Value count does not match column count: expected = 3, actual = 4. Column separator: ',', Row delimiter: '\n'. Row: 2,43,"Ice cream, Frozen Desert"
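The error suggests that the comma inside the quoted field is being treated as a column separator. One possible workaround, sketched here on the assumption that the StarRocks version in use supports the `enclose` Stream Load header (documented for CSV loads from v3.0), is to declare the double quote as the enclosing character. `stream_load_csv` and `DRY_RUN` are hypothetical names used for illustration. The credentials, host, database and table are the same placeholders as in the script above. This has not been verified against the failing file.

```shell
# Set DRY_RUN to print the curl command instead of sending it; unset it to upload.
DRY_RUN=1

# Hypothetical helper: Stream Load one CSV file, treating "..." as one field.
stream_load_csv() {
  # ${DRY_RUN:+echo} expands to "echo" when DRY_RUN is set, and to nothing otherwise.
  ${DRY_RUN:+echo} curl --location-trusted -u 'user:password' \
    -H 'Expect: 100-continue' \
    -H 'column_separator: ,' \
    -H 'enclose: "' \
    -H 'columns: id,product,productName' \
    -H 'skip_header: 1' \
    -T "$1" \
    -X PUT \
    'http://privateIP:8030/api/DatabaseName/TableName/_stream_load'
}

stream_load_csv /opt/sling/csvnew/part.01.0460.csv
```

If the load still reports filtered rows, the error log at the `ErrorURL` should show whether the remaining rows fail for the same reason.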
Hi, without a file, I cannot test. Can you produce a sample file that triggers the error and share it, so I can reproduce it? You can email it to [email protected] if you prefer.
We have shared a sample dataset with [[email protected]]. Please check it.