csvlint
UTF-8 CSV files with BOM aren't parsed correctly if the first header field contains quotes
I have a CSV file encoded as UTF-8 with a BOM:
"first_column","second_column"
"Hello","how are you"
This is a valid CSV file, but csvlint reports:
Record #0 has error: bare " in non-quoted-field
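The issue occurs with a UTF-8 file that starts with a BOM when the first header field is enclosed in quotes. The three BOM bytes come before the opening quote, so the parser treats the field as unquoted and then rejects the quote character.
Suggestion: this could be solved by stripping the UTF-8 BOM from the header line before parsing: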
Pseudocode:
if (line_number == 1) { sub(/^\xef\xbb\xbf/, "", line) }
I have just noticed that someone else reported nearly the same issue (https://github.com/Clever/csvlint/issues/21), but this simple fix (stripping the UTF-8 BOM in the csvlint code) would let such files pass the csvlint check.