upickle icon indicating copy to clipboard operation
upickle copied to clipboard

NumberFormatException when deserializing a 128-bit integer

Open rwalbergpd opened this issue 7 years ago • 9 comments

Using "com.lihaoyi" %% "ujson" % "0.6.6"

val json = ujson.read("{'foo': 28291041994989161181883157038606177750}")

yields:

28291041994989161181883157038606177750
java.lang.NumberFormatException: 28291041994989161181883157038606177750
	at ujson.util.Util$.parseLong(Util.scala:57)
	at ujson.util.Util$.parseIntegralNum(Util.scala:23)
	at ujson.Js$.visitNum(Js.scala:181)
	at ujson.Js$.visitNum(Js.scala:77)
	at ujson.Parser.parseNum(Parser.scala:171)
	at ujson.Parser.rparse(Parser.scala:404)
	at ujson.Parser.parse(Parser.scala:330)
	at ujson.Parser.parse(Parser.scala:325)
	at ujson.SyncParser.parse(SyncParser.scala:23)
	at ujson.StringParser$.transform(StringParser.scala:28)
	at ujson.StringParser$.transform(StringParser.scala:27)
	at ujson.Transformable$fromTransformer.transform(Transformable.scala:13)
	at ujson.package$.transform(package.scala:2)
	at ujson.package$.read(package.scala:4)

rwalbergpd avatar Aug 17 '18 02:08 rwalbergpd

Looks like this just affects ujson's AST. Workaround: the http://www.lihaoyi.com/upickle/#OtherASTs probably handle it properly. I know play-json represents all numbers as BigDecimal.

https://tools.ietf.org/html/rfc8259#section-6 and https://tools.ietf.org/html/rfc8259#section-9 indicate that parsers may reject large numbers, although all other implementations on JSONTestSuite handle them. This limitation would be surprising.

Out of curiosity, I ran ujson through https://github.com/nst/JSONTestSuite/commit/5b47b778b9a2b87007fe6500b6c919737c46008d. Here are the results (excluding expected passes/rejections):

jsontestsuite

htmldoug avatar Aug 22 '18 06:08 htmldoug

Interesting! Thanks for the great explanation. I can close this issue since there's a reasonable workaround.

rwalbergpd avatar Aug 22 '18 15:08 rwalbergpd

Would be nice for @lihaoyi to weigh in. If this limitation is intentional, seems to me like it should be documented somewhere.

htmldoug avatar Aug 22 '18 15:08 htmldoug

It seems to me that the ujson AST is not correct. Would it be possible to fix it rather than suggesting to switch to an alternative AST?

julienrf avatar Nov 22 '19 09:11 julienrf

This seems like a parsing issue (although the logic does live on the AST companion object) nothing to do with the AST as a data structure: the datatype uses doubles so it should be able to hold these values (at some loss of precision)

lihaoyi avatar Nov 22 '19 13:11 lihaoyi

I might be wrong but it seems to me that the only way to represent JSON numbers in the AST is to use ujson.Num, which only supports Double values. Am I missing something?

Edit: I see now that my issue is different: it’s about the fact that the number representation is lossy. Should I open a different issue? Has this been already discussed somewhere?

julienrf avatar Nov 22 '19 14:11 julienrf

The uJson AST is designed to mimic the Javascript AST, which takes only numbers. If you want something else, use a different AST (or don’t use an AST, like uPickle)

lihaoyi avatar Nov 22 '19 14:11 lihaoyi

Note that unlike other JSON libraries, ujson.Value isn’t blessed in any way, so if you use a uJson with a different AST you need not worry about ujson.Value tradeoffs at all. In fact, internally uPickle defines its own IndexedValue AST for cases where we need the source indices and non-lossy temporary storage of the incoming JSON, and in most cases does not use an AST at all. So you don't need to feel bad about defining your own AST for your own use case

lihaoyi avatar Nov 22 '19 14:11 lihaoyi

Thanks for the clarification!

julienrf avatar Nov 22 '19 14:11 julienrf