SimpleFlatMapper
CSV parser does something strange on invalid data
Sample file:
-63.00,"EUR","NC ",""
-70.00,"EUR","NC ","Issue Receipt "Witho"
-45.00,"EUR","NC ",""
Result:
"-63.00", "EUR", "NC", ""
"-70.00", "EUR", "NC", "Issue Receipt Witho
-45.00,EUR,NC ,""
A lone, unescaped quote mark inside a quoted field is not allowed, so the parser should either throw or attempt to recover. Currently it silently appends the rest of the file to the corrupted column.
Because of the missing quote, the cell covers "Issue Receipt Witho -45.00,EUR,NC ,"" since "" is an escaped quote.
I'm guessing we could add a strict mode that checks the parser state at the end of the parse and makes sure it's not still inside open quotes.
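To illustrate the kind of end-of-parse check described here, a minimal standalone sketch follows. It is not SimpleFlatMapper code: the class `StrictQuoteCheck` and its methods are hypothetical names, and it assumes `"` is the only quote character. It relies on quote parity: an escaped quote `""` toggles the state twice, so only an unbalanced quote leaves the scan inside quotes at the end of the input.

```java
// Hypothetical helper, not part of SimpleFlatMapper: a sketch of a strict-mode check.
final class StrictQuoteCheck {

    private StrictQuoteCheck() { }

    /**
     * Returns true if the input ends while a quoted section is still open.
     * Every '"' flips the state, so an escaped quote ("") flips it twice
     * and leaves it unchanged.
     */
    static boolean endsInsideQuotes(CharSequence csv) {
        boolean inQuotes = false;
        for (int i = 0; i < csv.length(); i++) {
            if (csv.charAt(i) == '"') {
                inQuotes = !inQuotes;
            }
        }
        return inQuotes;
    }

    /** Strict variant: fails loudly instead of silently swallowing the rest of the input. */
    static void requireClosedQuotes(CharSequence csv) {
        if (endsInsideQuotes(csv)) {
            throw new IllegalStateException("CSV input ends inside an open quoted field");
        }
    }
}
```

Run on the sample file above, the scan sees an odd number of quote marks and so ends inside quotes, which a strict mode would report instead of returning the merged cell. A real implementation would more likely read this state from the parser itself than rescan the input.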
@arnaudroger, a strict mode would be fine. CSV is not a strict format itself, but we could probably add at least some useful restrictions to avoid situations like this.