scala-commons icon indicating copy to clipboard operation
scala-commons copied to clipboard

Possible DoS when parsing a JSON number to BigInt or BigDecimal

Open plokhotnyuk opened this issue 7 years ago • 5 comments

It happened that parsing of BigInt and BigDecimal in latest versions of JVM has O(n^2) complexity where n is the number of significant digits. It means that a JSON body with a length ~1Mb can do 100% load one CPU core for several seconds:

scala> def timed[A](f: => A): A = { val t = System.currentTimeMillis(); val r = f; println(s"Took ${System.currentTimeMillis() - t} millis"); r } 
timed: [A](f: => A)A

scala> List(1000, 10000, 100000, 1000000).foreach(x => timed(BigInt("9" * x)))
Took 0 millis
Took 2 millis
Took 135 millis
Took 13221 millis

scala> List(1000, 10000, 100000, 1000000).foreach(x => timed(BigDecimal("9" * x)))
Took 0 millis
Took 2 millis
Took 138 millis
Took 13440 millis

plokhotnyuk avatar Oct 08 '18 11:10 plokhotnyuk

Also, BigDecimal has another vulnerability, because by default an unlimited math context is used during parsing and subsequent user calculations (like '+', '%', 'longValue', etc.) reuse it that can lead to 100% CPU for several minutes with subsequent out of memory error for just one short value like "9e999999999".

Moreover, for Scala versions before 2.12.7 the math context for some operations is ignored that can lead for the same problem: https://github.com/scala/bug/issues/10882

plokhotnyuk avatar Oct 08 '18 11:10 plokhotnyuk

So, are you suggesting we should implement our own parsing of BigDecimal :confused: ? Aren't the JVM guys doing something about this?

JsonStringInput no longer uses unlimited math context for parsing, see JsonOptions.

ghik avatar Oct 08 '18 12:10 ghik

So, are you suggesting we should implement our own parsing of BigDecimal confused ?

Yeh, I'm already thinking about it...

Aren't the JVM guys doing something about this?

You are right that problems should be fixed in Java and Scala sources but I'm not sure that it can happen in the near future.

JsonStringInput no longer uses unlimited math context for parsing, see JsonOptions.

The provided math context isn't taken in account during parsing of the mantissa:

scala> List(1000, 10000, 100000, 1000000).foreach(x => timed(BigDecimal("9" * x, java.math.MathContext.DECIMAL128)))
Took 0 millis
Took 2 millis
Took 149 millis
Took 14106 millis

plokhotnyuk avatar Oct 08 '18 12:10 plokhotnyuk

BTW, the latest version of jsoniter-scala scans big numbers before parsing and checks if no any of configured limits is exceeded:

https://github.com/plokhotnyuk/jsoniter-scala/blob/master/jsoniter-scala-macros/src/main/scala/com/github/plokhotnyuk/jsoniter_scala/macros/JsonCodecMaker.scala#L66-L68

plokhotnyuk avatar Oct 08 '18 12:10 plokhotnyuk

Finally, I managed to get much more efficient implementation for parsing of strings to BigInt and BigDecimal values.

For small numbers it gives more than 2x speed up, while big numbers with 1M digits can be parsed in ~50x times faster than with standard constructors from JDK 8/11. But it still does not close possibility for DoS contemporary servers at 100Mbit/1Gbit rate of input.

plokhotnyuk avatar Mar 19 '19 12:03 plokhotnyuk