spray-json
Crash due to Unicode U+FFFD replacement character
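When a JSON string has the U+FFFD replacement character in it, spray-json
throws a ParsingException and can't parse it.
The byte sequence is EF BF BF. (Strictly speaking, EF BF BF is the UTF-8 encoding of the noncharacter U+FFFF; U+FFFD itself encodes as EF BF BD.)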
https://en.wikipedia.org/wiki/Specials_%28Unicode_block%29#Replacement_character
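For anyone who wants to confirm which code point a given byte sequence decodes to, here is a minimal sketch using only the JDK's UTF-8 decoder. The names `DecodeCheck` and `firstCodePoint` are made up for this illustration and are not part of spray-json.

```scala
import java.nio.charset.StandardCharsets.UTF_8

object DecodeCheck {
  // Decode a UTF-8 byte sequence and format its first code point as U+XXXX.
  // Bytes are passed as Ints (e.g. 0xEF) and narrowed to Byte, as in the REPL session.
  def firstCodePoint(bytes: Int*): String = {
    val decoded = new String(bytes.map(_.toByte).toArray, UTF_8)
    f"U+${decoded.codePointAt(0)}%04X"
  }
}
```

Calling `DecodeCheck.firstCodePoint(0xEF, 0xBF, 0xBF)` gives `U+FFFF`, while `DecodeCheck.firstCodePoint(0xEF, 0xBF, 0xBD)` gives `U+FFFD`.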
Check it out:
scala> import spray.json._
import spray.json._
scala> val bytes = Array(123, 34, 104, 101, 108, 108, 111, 34, 58, 34, 116, 104, 105, 115, 32, 0xEF, 0xBF, 0xBF, 32, 119, 111, 114, 108, 100, 34, 125).map(_.toByte)
bytes: Array[Byte] = Array(123, 34, 104, 101, 108, 108, 111, 34, 58, 34, 116, 104, 105, 115, 32, -17, -65, -65, 32, 119, 111, 114, 108, 100, 34, 125)
scala> val s = new String(bytes, "UTF-8")
s: String = {"hello":"this world"}
scala> s.parseJson
spray.json.JsonParser$ParsingException: Unexpected end-of-input at input index 15 (line 1, position 16), expected '"':
{"hello":"this
               ^
  at spray.json.JsonParser.fail(JsonParser.scala:213)
  at spray.json.JsonParser.require(JsonParser.scala:196)
  at spray.json.JsonParser.string(JsonParser.scala:144)
  at spray.json.JsonParser.value(JsonParser.scala:63)
  at spray.json.JsonParser.members$1(JsonParser.scala:81)
  at spray.json.JsonParser.object(JsonParser.scala:86)
  at spray.json.JsonParser.value(JsonParser.scala:60)
  at spray.json.JsonParser.parseJsValue(JsonParser.scala:43)
  at spray.json.JsonParser$.apply(JsonParser.scala:28)
  at spray.json.PimpedString.parseJson(package.scala:45)
  ... 43 elided
We're working around this by catching this exception and replacing the replacement character, and then re-parsing. Hopefully you see some humor in replacing the replacement. :-)
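For reference, the workaround looks roughly like the sketch below. It is an illustration under assumptions, not our exact code: `ReplaceAndRetry`, `sanitize`, and `parseWithRetry` are hypothetical names, the offending character is assumed to be U+FFFF (what the bytes EF BF BF decode to), and the substitute is assumed to be U+FFFD. The parser is passed in as a function so the snippet runs without spray-json on the classpath, and it catches any non-fatal exception rather than only `ParsingException`.

```scala
import scala.util.control.NonFatal

object ReplaceAndRetry {
  // Assumption: the character that trips the parser is U+FFFF,
  // which is what the bytes EF BF BF decode to.
  val Offending: Char = '\uFFFF'

  // Assumption: substitute the genuine replacement character U+FFFD.
  val Substitute: Char = '\uFFFD'

  // Swap every occurrence of the offending character for the substitute.
  def sanitize(input: String): String = input.replace(Offending, Substitute)

  // Try to parse as-is; if that fails and the input contains the offending
  // character, sanitize and parse once more. Other failures propagate unchanged.
  // `parse` is a stand-in for whatever parser is in use.
  def parseWithRetry[A](input: String)(parse: String => A): A =
    try parse(input)
    catch {
      case NonFatal(_) if input.indexOf(Offending) >= 0 =>
        parse(sanitize(input))
    }
}
```

With `import spray.json._` in scope, this would be called as `ReplaceAndRetry.parseWithRetry(s)(_.parseJson)`. Sanitizing only after a failure keeps the common path free of an extra pass over the string, at the cost of parsing twice for affected inputs.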
I haven't dug into spray-json to see how troublesome it'd be to fix this, but please do. Thanks!
+1 Thanks for explaining how to work around it.