jackson-dataformat-csv

Unexpected character causes exception for hasNext (2.8.3)

Open: michaelkrog opened this issue 7 years ago • 1 comment

I have a 3rd-party CSV file I'm trying to import. Their quotes are not escaped, so I have data like this:

124785285,"PLACE","","Pindsvineplejerne, Dyreværnsforening Af 03 Januar 2016".","","","","22334455","OKMO","PHONE_NORMAL_MOBILE","0",1,0,"185","01305",12,"Diesen Alle","",,"","1234","Andeby","","","","","","","08-06-2017 00:00:00",1561

(Notice the 4th column which has the syntax "{text}".")
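To make the malformed pattern concrete, here is a minimal sketch of a check that flags such a line before it reaches the CSV parser. It uses only the JDK and is not Jackson's own logic; the names `QuoteCheck` and `hasStrayQuote` are hypothetical. It assumes one record per physical line (no quoted fields containing line breaks) and quotes escaped by doubling (`""`).

```java
// Hypothetical helper, not part of Jackson: detects a closing quote that is
// followed by something other than the separator or end of line.
final class QuoteCheck {
    private QuoteCheck() {}

    static boolean hasStrayQuote(String line, char sep) {
        int i = 0;
        int n = line.length();
        while (i < n) {
            if (line.charAt(i) == '"') {
                i++; // step past the opening quote
                boolean closed = false;
                while (i < n) {
                    if (line.charAt(i) == '"') {
                        // A doubled quote is an escaped quote inside the field.
                        if (i + 1 < n && line.charAt(i + 1) == '"') {
                            i += 2;
                            continue;
                        }
                        closed = true;
                        i++; // step past the closing quote
                        break;
                    }
                    i++;
                }
                // Unterminated field (assumption: records never span lines).
                if (!closed) {
                    return true;
                }
                // After a closing quote only the separator or end of line is valid.
                if (i < n && line.charAt(i) != sep) {
                    return true;
                }
                i++; // step past the separator
            } else {
                // Unquoted field: skip to the next separator, then past it.
                while (i < n && line.charAt(i) != sep) {
                    i++;
                }
                i++;
            }
        }
        return false;
    }
}
```

Calling `QuoteCheck.hasStrayQuote(line, ',')` on the sample line above would flag it at the 4th column, where the closing quote is followed by a `.` rather than a comma. Quotes appearing in the middle of an unquoted field are deliberately not flagged by this sketch.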

Parsing this particular line causes an exception that is hard to recover from, because it's thrown when asking whether the iterator has more elements.

Stacktrace:

Caused by: com.fasterxml.jackson.core.JsonParseException: Unexpected character ('7' (code 55)): Expected separator ('"' (code 34)) or end-of-line
 at [Source: com.fasterxml.jackson.dataformat.csv.impl.UTF8Reader@741833c; line: 2716427, column: 92]
	at com.fasterxml.jackson.core.JsonParser._constructError(JsonParser.java:1702)
	at com.fasterxml.jackson.core.base.ParserMinimalBase._reportError(ParserMinimalBase.java:558)
	at com.fasterxml.jackson.core.base.ParserMinimalBase._reportUnexpectedChar(ParserMinimalBase.java:456)
	at com.fasterxml.jackson.dataformat.csv.CsvParser._reportUnexpectedCsvChar(CsvParser.java:1089)
	at com.fasterxml.jackson.dataformat.csv.impl.CsvDecoder._nextQuotedString(CsvDecoder.java:838)
	at com.fasterxml.jackson.dataformat.csv.impl.CsvDecoder.nextString(CsvDecoder.java:601)
	at com.fasterxml.jackson.dataformat.csv.CsvParser._skipUntilEndOfLine(CsvParser.java:916)
	at com.fasterxml.jackson.dataformat.csv.CsvParser.nextToken(CsvParser.java:532)
	at com.fasterxml.jackson.databind.MappingIterator._resync(MappingIterator.java:391)
	at com.fasterxml.jackson.databind.MappingIterator.hasNextValue(MappingIterator.java:235)
	at com.fasterxml.jackson.databind.MappingIterator.hasNext(MappingIterator.java:180)
	... 56 common frames omitted

michaelkrog, Jun 22 '17 14:06

Would it be possible to wrap this in a unit test and see whether skipping still fails with 2.8.9? There have been some improvements in patch versions. It should be possible to recover from this problem, I think, ideally losing only the rest of the line (although in some cases sync may only occur with more data, losing the next line too).
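As a point of comparison, one way to guarantee that a bad record costs only its own line is to split the input into lines first and parse each line independently. The following is a rough sketch under that assumption, using only the JDK. `LineIsolatingReader`, `LineParser` and `Result` are hypothetical names; with Jackson, the `LineParser` could presumably delegate to an `ObjectReader` applied to each line, since `JsonParseException` is an `IOException`. It does not work for files where quoted fields contain line breaks.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical wrapper: parses each physical line on its own, so a parse
// failure is confined to that line and never affects the following ones.
final class LineIsolatingReader {
    private LineIsolatingReader() {}

    /** Hypothetical per-line parser; the real one would wrap the CSV library. */
    interface LineParser {
        String[] parse(String line) throws IOException;
    }

    static final class Result {
        final List<String[]> rows = new ArrayList<>();
        final List<Integer> badLines = new ArrayList<>(); // 1-based line numbers
    }

    static Result readAll(BufferedReader in, LineParser parser) throws IOException {
        Result result = new Result();
        String line;
        int lineNo = 0;
        while ((line = in.readLine()) != null) {
            lineNo++;
            try {
                result.rows.add(parser.parse(line));
            } catch (IOException e) {
                // Record the failing line and carry on with the next one.
                result.badLines.add(lineNo);
            }
        }
        return result;
    }
}
```

The trade-off is that the parser no longer sees the file as one stream, so this only fits data where every record sits on a single line.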

The exception message looks odd too; perhaps the sample line and the exception are not from the same run?

cowtowncoder, Jun 22 '17 21:06