
Broken parser error messages

Open · agrafix opened this issue 8 years ago • 1 comment

Currently, decode returns FromRecord a => Either String a. This is suboptimal because some error messages include the row that failed. If that row is UTF-8 encoded and contains, for example, Japanese text, the error message becomes an unreadable mess of characters. This is the offending code:

https://github.com/hvr/cassava/blob/545b86d60276c51ec29681f1917f1e5fb9b67c54/Data/Csv/Encoding.hs#L340-L343 (Specifically BL8.unpack)

Resolution options:

  • As we already assume that the input is UTF-8 (see the Text FromField instance), instead of calling BL8.unpack directly we could try to decode the input as UTF-8 first and fall back to BL8.unpack only if that fails
  • The Left side of the Either should be a ByteString
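
The first option could look roughly like the sketch below. This is only an illustration, not cassava's actual code: renderInput is a hypothetical helper name, and it assumes the text package's lazy decodeUtf8' is available alongside BL8.

```haskell
import qualified Data.ByteString.Lazy as BL
import qualified Data.ByteString.Lazy.Char8 as BL8
import qualified Data.Text.Lazy as TL
import qualified Data.Text.Lazy.Encoding as TLE

-- | Hypothetical helper (not part of cassava): turn raw input into a
-- String for use in an error message. Valid UTF-8 is decoded properly;
-- anything else falls back to the byte-per-character BL8.unpack.
renderInput :: BL.ByteString -> String
renderInput bs =
  case TLE.decodeUtf8' bs of
    Right t -> TL.unpack t   -- input was valid UTF-8
    Left _  -> BL8.unpack bs -- not UTF-8: keep the old behaviour
```

With this, a row containing Japanese text would show up readable in the message, while input that is not valid UTF-8 would be rendered exactly as it is today.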

What do you think?

agrafix, Feb 10 '17 15:02

I think we should instead have Either (String, ByteString) a: return both the error message and the corresponding raw Field/Row that caused the issue.
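
A minimal sketch of what that shape could look like, using a toy field parser rather than cassava's real internals. DecodeResult and parseIntField are made-up names for illustration only.

```haskell
import qualified Data.ByteString as B
import qualified Data.ByteString.Char8 as B8

-- | Hypothetical result type: on failure, carry the message together
-- with the raw bytes that caused it, so the caller decides how to show them.
type DecodeResult a = Either (String, B.ByteString) a

-- | Toy field parser (an assumption for this example, not cassava's API):
-- accepts a field only if it is an integer with no trailing bytes.
parseIntField :: B.ByteString -> DecodeResult Int
parseIntField raw =
  case B8.readInt raw of
    Just (n, rest) | B.null rest -> Right n
    _                            -> Left ("expected an integer", raw)
```

The caller can then decode the returned bytes as UTF-8, hex-dump them, or ignore them, instead of receiving a String that has already been mangled.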

ivan-m, Jul 05 '17 11:07