idris-tparsec
idris-tparsec copied to clipboard
Bug in the string literal parser?
While I was porting the JSON parser to Agda yesterday I think I have found a bug in the string literal parser:
https://github.com/gallais/idris-tparsec/blob/75b288719b9781273a595c294b7d3bed5ea1904d/src/TParsec/Combinators/Chars.idr#L154-L156
After having read a \
we will only accept a u
followed by 4 hexadecimal
digits aka. a unicode character. Shouldn't we also accept:
- another
\
- a
"
- a
r
,n
, ort
?
Ok this is pretty sneaky: andoptbind anyTok $ \ c => ...
means we'll consume
any token and succeed even if the continuation fails (in which case we return
(c, Nothing)
). This means that we accept any character following a \
, and
only have a special behaviour if it happens to be u
aka the start of a unicode
character.
Not sure how I feel about that.