DataflowTemplates
[Bug]: CSV parsing in the CSVToBigQuery template fails when a quoted field contains the delimiter
Related Template(s)
CSVToBigQuery
Template Version
2024-07-16-00_rc00
What happened?
Parsing a CSV file fails with the error "Number of fields in the schema and number of Csv headers do not match."
This happens when a CSV field contains a comma inside quoted text. For example, suppose the file has two fields and field2 is a JSON string:
{field1},{field2}
field1Text,"{""key1"":""value1"",""key2"":""value2""}"
The splitter breaks this row into 3 columns because it also splits on the comma inside the quoted JSON. That count matches neither the CSV header nor the BigQuery schema, so the row fails to transform.
The problem is probably at line 199 of CSVToBigQuery.java:
Splitter.on(delimiter.get()).splitToList(context.element()).toArray(new String[0]);
Please consider using a CSV parsing library that understands quoted fields to handle this case.
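To make the difference concrete, here is a minimal, standard-library-only sketch. It is not the template's code: the class `QuotedCsvSplit` and the helper `splitQuoted` are hypothetical names, and the sample row assumes the JSON quotes are escaped by doubling them. It compares a split on every comma with a quote-aware split of the same row. In the real pipeline a maintained CSV library would likely be a better choice than a hand-written parser.

```java
import java.util.ArrayList;
import java.util.List;

public class QuotedCsvSplit {

  // Hypothetical helper, not part of the template: splits one CSV line on
  // `delimiter`, keeps delimiters that appear inside double quotes, and turns
  // a doubled quote ("") inside a quoted field into a single quote.
  public static List<String> splitQuoted(String line, char delimiter) {
    List<String> fields = new ArrayList<>();
    StringBuilder current = new StringBuilder();
    boolean inQuotes = false;
    for (int i = 0; i < line.length(); i++) {
      char c = line.charAt(i);
      if (inQuotes) {
        if (c == '"') {
          if (i + 1 < line.length() && line.charAt(i + 1) == '"') {
            current.append('"'); // escaped quote inside a quoted field
            i++;
          } else {
            inQuotes = false; // closing quote
          }
        } else {
          current.append(c); // includes delimiters inside quotes
        }
      } else if (c == '"') {
        inQuotes = true; // opening quote
      } else if (c == delimiter) {
        fields.add(current.toString());
        current.setLength(0);
      } else {
        current.append(c);
      }
    }
    fields.add(current.toString());
    return fields;
  }

  public static void main(String[] args) {
    String line = "field1Text,\"{\"\"key1\"\":\"\"value1\"\",\"\"key2\"\":\"\"value2\"\"}\"";

    // Splitting on every comma ignores the quotes and yields 3 columns.
    System.out.println(line.split(",", -1).length);

    // The quote-aware split yields 2 columns; the second is the JSON text.
    List<String> fields = splitQuoted(line, ',');
    System.out.println(fields.size());
    System.out.println(fields.get(1));
  }
}
```

With the sample row, the plain split reports 3 columns while the quote-aware split reports 2, which matches the two-column header.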
Relevant log output
Number of fields in the schema and number of Csv headers do not match.