UTF-8 CSV files with BOM aren't parsed correctly if the first header field contains quotes

Open • datatraveller1 opened this issue 1 year ago • 1 comment

I have a CSV file encoded as UTF-8 with a BOM:

"first_column","second_column"
"Hello","how are you"

This is a valid CSV file, but csvlint reports:

Record #0 has error: bare " in non-quoted-field

The issue occurs when a file is encoded as UTF-8 with a BOM and the first header field is enclosed in quotes.
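To illustrate the likely cause, here is a minimal sketch that reproduces the error with Go's standard `encoding/csv` package. It assumes csvlint's parser behaves like `encoding/csv`, which I have not verified in the csvlint source; the error text matches `csv.ErrBareQuote`. The helper names `parseErr` and `isBareQuote` are made up for this example. With the BOM in front, the first field no longer starts with a quote, so the parser treats it as an unquoted field and rejects the quote that follows.

```go
package main

import (
	"encoding/csv"
	"errors"
	"fmt"
	"strings"
)

// utf8BOM is the three-byte UTF-8 encoding of U+FEFF.
const utf8BOM = "\xef\xbb\xbf"

// sample mirrors the two-line file from this report, without the BOM.
const sample = "\"first_column\",\"second_column\"\n\"Hello\",\"how are you\"\n"

// parseErr (hypothetical helper) reads all records with encoding/csv
// and returns the parse error, or nil if the input is accepted.
func parseErr(input string) error {
	_, err := csv.NewReader(strings.NewReader(input)).ReadAll()
	return err
}

// isBareQuote (hypothetical helper) reports whether err wraps csv.ErrBareQuote.
func isBareQuote(err error) bool {
	return errors.Is(err, csv.ErrBareQuote)
}

func main() {
	fmt.Println("without BOM:", parseErr(sample))
	fmt.Println("with BOM:", parseErr(utf8BOM+sample))
	fmt.Println("bare quote error:", isBareQuote(parseErr(utf8BOM+sample)))
}
```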

Suggestion: this could be solved by stripping the UTF-8 BOM from the header line before parsing. Pseudocode:

if (line_number == 1) { sub(/^\xef\xbb\xbf/, "", line) }
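In Go, the same idea might look like the sketch below. This is not csvlint's actual code: `skipBOM` and `readCSV` are hypothetical names, and the sketch assumes the input is handed to `encoding/csv` through an `io.Reader`. It peeks at the first three bytes and discards them only if they are the UTF-8 BOM, so files without a BOM pass through unchanged.

```go
package main

import (
	"bufio"
	"encoding/csv"
	"fmt"
	"io"
	"strings"
)

// skipBOM (hypothetical helper) returns a reader positioned after a
// leading UTF-8 BOM, if one is present.
func skipBOM(r io.Reader) io.Reader {
	br := bufio.NewReader(r)
	// Peek returns an error when fewer than 3 bytes are available;
	// such input cannot start with a BOM, so it is left untouched.
	if b, err := br.Peek(3); err == nil && string(b) == "\xef\xbb\xbf" {
		_, _ = br.Discard(3)
	}
	return br
}

// readCSV (hypothetical helper) parses input after removing any BOM.
func readCSV(input string) ([][]string, error) {
	return csv.NewReader(skipBOM(strings.NewReader(input))).ReadAll()
}

func main() {
	// The file from this report, including the BOM.
	input := "\xef\xbb\xbf\"first_column\",\"second_column\"\n\"Hello\",\"how are you\"\n"
	records, err := readCSV(input)
	fmt.Println(records, err)
}
```

Stripping the BOM at the reader level, rather than per line, keeps the parser itself unchanged and handles quoted and unquoted first fields the same way.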

datatraveller1 • Sep 15 '23 20:09

I have just noticed that someone else posted nearly the same issue (https://github.com/Clever/csvlint/issues/21). This simple fix (stripping the UTF-8 BOM in the csvlint code) would let such files pass the csvlint check.

datatraveller1 • Sep 15 '23 20:09