DataflowTemplates

[Bug]: CSV parsing bug in the CSVToBigQuery template

Open OpensourceHU opened this issue 7 months ago • 0 comments

Related Template(s)

CSVToBigQuery

Template Version

2024-07-16-00_rc00

What happened?

CSV parsing fails with the error "Number of fields in the schema and number of Csv headers do not match." when a field in the CSV file contains a comma inside quoted text. For example, suppose the file has two fields and field2 is a JSON string:

    {field1},{field2}
    field1Text,"{""key1"":""value1"",""key2"":""value2""}"

The splitter splits this row into 3 columns, which matches neither the number of CSV headers nor the BigQuery schema, so the row fails to transform.

The problem is probably at line 199 of CSVToBigQuery.java:

    Splitter.on(delimiter.get()).splitToList(context.element()).toArray(new String[0]);

Please consider using a CSV utility package to handle this case.
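To illustrate the difference, here is a minimal sketch that compares a plain delimiter split with a quote-aware split on the example row. The class `QuoteAwareCsvSplit` and the helper `splitCsvLine` are hypothetical names written for this issue, not part of the template, and `String.split` stands in for the Guava `Splitter` call. The sketch assumes double-quoted fields with `""` as the escaped quote and does not handle fields that span multiple lines.

```java
import java.util.ArrayList;
import java.util.List;

class QuoteAwareCsvSplit {

    // Hypothetical helper: splits one CSV line on the delimiter,
    // ignoring delimiters that appear inside double-quoted fields.
    static List<String> splitCsvLine(String line, char delimiter) {
        List<String> fields = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (inQuotes) {
                if (c == '"') {
                    if (i + 1 < line.length() && line.charAt(i + 1) == '"') {
                        current.append('"'); // "" inside quotes is a literal quote
                        i++;
                    } else {
                        inQuotes = false; // closing quote
                    }
                } else {
                    current.append(c);
                }
            } else if (c == '"') {
                inQuotes = true; // opening quote
            } else if (c == delimiter) {
                fields.add(current.toString());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        fields.add(current.toString());
        return fields;
    }

    public static void main(String[] args) {
        String row = "field1Text,\"{\"\"key1\"\":\"\"value1\"\",\"\"key2\"\":\"\"value2\"\"}\"";
        // Plain split (stand-in for the Splitter call): the comma inside the JSON is treated as a delimiter.
        System.out.println(row.split(",").length);
        // Quote-aware split: the JSON string stays in one field.
        System.out.println(splitCsvLine(row, ','));
    }
}
```

In the template itself, a maintained parser such as Apache Commons CSV would likely be a more robust choice than a hand-written loop, since it also covers cases this sketch ignores.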

Relevant log output

Number of fields in the schema and number of Csv headers do not match.

OpensourceHU, Aug 01 '24 06:08