A lone double quote in a CSV field behaves differently from clojure.data.csv and Python
(csv/read-csv "a,3\"\nb,4\"\nc,5")
;; => (["a" "3\""] ["b" "4\""] ["c" "5"])
(charred/read-csv "a,3\"\nb,4\"\nc,5")
;; => (["a" "3\nb,4"] ["c" "5"])
I find it hard to say which is "right". I encountered this in a dataset that uses quotes as a unit for inches. In this case both data.csv and Python's csv library do the correct thing, while charred collapses two rows into one. So for compatibility I would prefer charred to do the same.
The example input used:
a,3"
b,4"
c,5
Without the last row, I get an exception with charred.
The way I think the other parsers mentioned work is that a double quote that is not at the beginning of a field is not treated as a quoting character.
See this Python example:
import csv

# newline="" lets the csv module handle line breaks inside quoted fields itself
with open("test.csv", newline="") as h:
    r = csv.reader(h)
    for i in r:
        print(i)
Input:
a,3"
b,4"
c,"Another
line"
Output:
['a', '3"']
['b', '4"']
['c', 'Another\nline']
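To make that rule concrete, here is a rough sketch of a parser that only treats a double quote as quoting when it is the first character of a field. `parse_csv` is a hypothetical helper written for illustration. It is not how data.csv, charred, or Python's csv module are actually implemented, and it skips things a real parser needs, such as `\r\n` line endings and blank lines.

```python
import csv
import io


def parse_csv(text, delimiter=",", quote='"'):
    """Toy parser (illustration only): a quote opens a quoted field
    only when it is the first character of that field."""
    rows, row, field = [], [], []
    at_start = True    # True until the first character of a field is consumed
    in_quotes = False  # True while inside a quoted field
    i = 0
    while i < len(text):
        ch = text[i]
        if in_quotes:
            if ch != quote:
                field.append(ch)       # delimiters and newlines are kept as data
            elif text[i + 1 : i + 2] == quote:
                field.append(quote)    # doubled quote means one literal quote
                i += 1
            else:
                in_quotes = False      # closing quote
        elif ch == quote and at_start:
            in_quotes = True           # quote at field start opens a quoted field
            at_start = False
        elif ch == delimiter:
            row.append("".join(field))
            field = []
            at_start = True
        elif ch == "\n":
            row.append("".join(field))
            rows.append(row)
            row, field = [], []
            at_start = True
        else:
            field.append(ch)           # includes a quote in the middle of a field
            at_start = False
        i += 1
    if field or row or not at_start:   # flush a last row without trailing newline
        row.append("".join(field))
        rows.append(row)
    return rows


sample = 'a,3"\nb,4"\nc,"Another\nline"\n'
mine = parse_csv(sample)
reference = list(csv.reader(io.StringIO(sample, newline="")))
print(mine)
print(mine == reference)
```

On this input the sketch agrees with Python's csv module: the trailing quotes in `3"` and `4"` stay literal, while the quote that starts the last field in row `c` opens a quoted field spanning two lines.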
It looks to me like the quotes are escaped, which should be totally fine. Agreed, charred does the wrong thing here.
Fixed in release 1.012.
Thanks for the issue, btw. Your analysis was spot on.
It was actually something I found when reading the data.csv source code. I was uncertain about it and wanted to check what other CSV parsers did before implementing it, as it is kind of a narrowing definition.