
Encountering issues while transferring data into StarRocks from a CSV file. The data transfer type is stream_load.

Open pawan-chauhan-9560 opened this issue 1 year ago • 2 comments

Issue Description

privateIp: the FE URL

Description of the Issue:

We created a Parquet/CSV file from SQL Server. While inserting the data into StarRocks, we encounter an error.

SQL table structure:

| id | productid | productName |
| --- | --- | --- |
| 1 | 42 | Cookie |
| 2 | 43 | Ice cream, Frozen Desert |

CSV file structure:

    1,42,Cookie
    2,43,"Ice cream, Frozen Desert"

Sling version: 1.2.11

Operating System: Linux

Replication Configuration:

Command I am using to insert data into StarRocks:

`#!/bin/bash

# Define the log file path
LOGFILE="/opt/sling/slingDataTranfserlog_19_06_24.log"

# Iterate over each .csv file in the directory
for file in /opt/sling/csvnew/part.01.0460.csv; do
  echo "Uploading $file..." >> "$LOGFILE" 2>&1

  # Use curl to upload the file and capture the response
  response=$(curl --location-trusted -u 'user:password' \
    -H "Expect: 100-continue" \
    -H "column_separator: ," \
    -H "columns: id,product,productName" \
    -H "skip_header: 1" \
    -T "$file" \
    -X PUT \
    http://privateIP:8030/api/DatabaseName/TableName/_stream_load 2>&1)

  # Log the response
  echo "$response" >> "$LOGFILE" 2>&1
done`
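One possible adjustment to the request above, offered as a sketch rather than a confirmed fix: recent StarRocks Stream Load documentation describes an `enclose` header that names the character wrapping field values, so that a separator inside quotes is not treated as a column boundary. Whether the StarRocks version in use here supports it is an assumption that should be checked. The function name `upload_csv` is hypothetical; credentials, host, database, and table are the same placeholders as in the original script.

```shell
#!/bin/bash
# Sketch: the same Stream Load request as above, with one added header.
# ASSUMPTION: the target StarRocks version accepts the "enclose" header.
# upload_csv is a hypothetical helper name.
upload_csv() {
  # $1 = path of the CSV file, $2 = Stream Load URL
  curl --location-trusted -u 'user:password' \
    -H "Expect: 100-continue" \
    -H "column_separator: ," \
    -H 'enclose: "' \
    -H "columns: id,product,productName" \
    -H "skip_header: 1" \
    -T "$1" \
    -X PUT \
    "$2"
}

# Example call, using the placeholders from the original script:
# upload_csv /opt/sling/csvnew/part.01.0460.csv \
#   http://privateIP:8030/api/DatabaseName/TableName/_stream_load
```

The only difference from the original request is the `enclose: "` header; everything else is unchanged so the effect of that one setting can be checked in isolation.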

streams: Stream

source: CSV file (created from SQL Server)
target: StarRocks
streams:
  ...

Log Output (please run command with -d):
{
    "TxnId": 1386825,
    "Label": "37776833-c498-41e9-aa4e-2c81dec9eb33",
    "Status": "Fail",
    "Message": "too many filtered rows",
    "NumberTotalRows": 100000,
    "NumberLoadedRows": 99754,
    "NumberFilteredRows": 246,
    "NumberUnselectedRows": 0,
    "LoadBytes": 5018672,
    "LoadTimeMs": 243,
    "BeginTxnTimeMs": 1,
    "StreamLoadPlanTimeMs": 2,
    "ReadDataTimeMs": 1,
    "WriteDataTimeMs": 239,
    "CommitAndPublishTimeMs": 0,
    "ErrorURL": "http://privateIp:8040/api/_load_error_log?file=error_log_c244f8c539780e6f_8a3e7085f85dc593"
}


Error: Value count does not match column count: expected = 3, actual = 4. Column separator: ',', Row delimiter: '\n'. Row: 2,43,"Ice cream, Frozen Desert"
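The error message can be reproduced locally without StarRocks. The sketch below splits each row on every comma and ignores quotes, which appears to be how the row was treated during this load. The helper names `count_naive_fields` and `find_mismatched_rows` are hypothetical, and the assumption that all filtered rows share this cause is not confirmed by the log.

```shell
#!/bin/bash
# Count fields per row by splitting on every comma, ignoring quotes.
count_naive_fields() {
  awk -F',' '{ print NF }'
}

# Print "line_number: row" for rows whose naive field count differs
# from the expected column count ($1). Reads CSV rows from stdin.
find_mismatched_rows() {
  awk -F',' -v expected="$1" 'NF != expected { print NR ": " $0 }'
}

# The two sample rows from this issue: the first yields 3 fields,
# the second yields 4 because of the comma inside the quoted value.
printf '%s\n' '1,42,Cookie' '2,43,"Ice cream, Frozen Desert"' | count_naive_fields
```

Running `find_mismatched_rows 3 < /opt/sling/csvnew/part.01.0460.csv` would list the rows that a quote-unaware split sees as having the wrong column count, which can be compared against the 246 filtered rows reported in the log.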

pawan-chauhan-9560 • Jun 20 '24 05:06

Hi, without a file, I cannot test. Can you produce a sample file that triggers the error and share it, so I can reproduce it? You can email it to [email protected] if you prefer.

flarco • Jun 20 '24 10:06

We have shared a sample dataset with [[email protected]]. Please check it.

pawan-chauhan-9560 • Jun 21 '24 08:06