idris-tparsec icon indicating copy to clipboard operation
idris-tparsec copied to clipboard

Bug in the string literal parser?

Open gallais opened this issue 3 years ago • 1 comments

While I was porting the JSON parser to Agda yesterday I think I have found a bug in the string literal parser:

https://github.com/gallais/idris-tparsec/blob/75b288719b9781273a595c294b7d3bed5ea1904d/src/TParsec/Combinators/Chars.idr#L154-L156

After having read a \ we will only accept a u followed by 4 hexadecimal digits aka. a unicode character. Shouldn't we also accept:

  • another \
  • a "
  • a r, n, or t ?

gallais avatar Jun 22 '21 09:06 gallais

Ok this is pretty sneaky: andoptbind anyTok $ \ c => ... means we'll consume any token and succeed even if the continuation fails (in which case we return (c, Nothing)). This means that we accept any character following a \, and only have a special behaviour if it happens to be u aka the start of a unicode character.

Not sure how I feel about that.

gallais avatar Jun 22 '21 21:06 gallais