Slower than serde_json in --release, much slower than serde_json in --debug
I found that json5 was taking many minutes to parse a single 180 KB JSON file, so I wrote a benchmark-ish app that deserializes the following struct:
use serde::Deserialize;

#[derive(Deserialize)]
struct MyStruct {
    items: Vec<String>,
}
with the items being 100-char strings, varying in number from 10 to 1,000,000, using both serde_json and json5.
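A minimal sketch of that kind of harness, for anyone who wants to reproduce the shape of the measurement. This is not the actual benchmark code: `make_payload`, `mib_per_sec`, and `measure` are hypothetical helper names, and the parser in `main` is a dependency-free stand-in so the sketch runs on its own.

```rust
use std::time::{Duration, Instant};

/// Build a JSON document of the form {"items":["aaa...", ...]} containing
/// `count` strings of `len` characters each. (Hypothetical helper.)
fn make_payload(count: usize, len: usize) -> String {
    let item = format!("\"{}\"", "a".repeat(len));
    let mut out = String::with_capacity(count * (item.len() + 1) + 16);
    out.push_str("{\"items\":[");
    for i in 0..count {
        if i > 0 {
            out.push(',');
        }
        out.push_str(&item);
    }
    out.push_str("]}");
    out
}

/// Convert a byte count and an elapsed time into MiB/second.
fn mib_per_sec(bytes: usize, elapsed: Duration) -> f64 {
    (bytes as f64 / (1024.0 * 1024.0)) / elapsed.as_secs_f64()
}

/// Time a single call of `parse` on `input` and return the throughput.
fn measure<T>(input: &str, parse: impl Fn(&str) -> T) -> f64 {
    let start = Instant::now();
    let parsed = parse(input);
    let elapsed = start.elapsed();
    drop(parsed); // the result is dropped only after the clock has stopped
    mib_per_sec(input.len(), elapsed)
}

fn main() {
    for &count in &[10usize, 100, 1_000] {
        let payload = make_payload(count, 100);
        // Stand-in parser (assumption): counts quote characters so that the
        // sketch has no external dependencies.
        let rate = measure(&payload, |s| s.bytes().filter(|b| *b == b'"').count());
        println!("{:>10} => {:10.3} MiB/s", count, rate);
    }
}
```

To compare the two crates, the stand-in closure would be replaced with `|s| serde_json::from_str::<MyStruct>(s)` and `|s| json5::from_str::<MyStruct>(s)`, and the "difference" column is then simply the first rate divided by the second. A single timed call is noisy for small inputs, so averaging over several runs is advisable.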
With cargo run --release, I get the following results:
Final results (MiB/second):
size serde_json json5 difference
========== ========== ========== ==========
10 => 116.835 7.806 14.967x
100 => 499.105 13.894 35.923x
1000 => 520.992 12.013 43.367x
10000 => 472.506 11.575 40.822x
100000 => 490.463 12.185 40.251x
1000000 => 408.434 11.058 36.935x
But in a debug build (cargo run without --release), it's much worse:
Final results (MiB/second):
size serde_json json5 difference
========== ========== ========== ==========
10 => 20.340 0.355 57.347x
100 => 57.123 0.340 167.898x
1000 => 68.129 0.343 198.836x
10000 => 63.127 0.368 171.572x
100000 => 67.684 0.371 182.643x
500000 => 67.894 0.371 183.127x
And with more complicated structures like the following, the disparity in debug builds is much, much greater (such as an hour vs. seconds to parse the same file):
#[derive(Deserialize)]
struct MyStruct {
    items_a: Vec<String>,
    items_b: Vec<ItemB>,
    field_c: String,
    field_d: String,
    field_e: String,
}

#[derive(Deserialize)]
struct ItemB {
    field_1: String,
    field_2: u64,
    field_3: u64,
}
Times are from a 2015 MacBook Pro running rustc 1.52.1 (9bc8c42bb 2021-05-09).
I hate to be that guy bringing up a new/competing library in response to an issue elsewhere, but for whatever it's worth, I'm working on a new JSON5 library, json-five. I haven't focused specifically on performance, but its benchmarks look promising so far.
I ran your benchmark with my crate against this one and got the following results:
In release:
Final results (MiB/second):
size json-five json5 difference
========== ========== ========== ==========
10 => 22.409 5.250 4.269x
100 => 153.555 9.427 16.289x
1000 => 179.701 11.387 15.782x
10000 => 158.379 12.656 12.514x
100000 => 171.460 10.844 15.811x
1000000 => 171.457 11.442 14.985x
In debug:
Final results (MiB/second):
size json-five json5 difference
========== ========== ========== ==========
10 => 8.476 0.889 9.532x
100 => 17.542 1.122 15.628x
1000 => 17.736 1.181 15.017x
10000 => 17.878 1.181 15.133x
100000 => 18.093 1.166 15.512x
1000000 => 17.951 1.157 15.518x
Still slower than serde_json, but that's probably expected for a few reasons.
As a possible hint about the performance problems, I've also observed that json5 uses excessive amounts of memory (10 GB+ in your tests), while serde_json uses nowhere near that much.
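For anyone who wants to put a number on that, here is a rough sketch of one way to measure peak heap usage from inside the benchmark itself, using a counting wrapper around the system allocator. The names `PeakAlloc` and `peak_during` are made up for illustration, and this only sees allocations that go through Rust's global allocator, so it will not match an OS-level figure such as resident set size exactly.

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::sync::atomic::{AtomicUsize, Ordering};

// Bytes currently allocated and the highest value seen so far.
static CURRENT: AtomicUsize = AtomicUsize::new(0);
static PEAK: AtomicUsize = AtomicUsize::new(0);

/// Wrapper around the system allocator that tracks live and peak bytes.
/// (Hypothetical name, for illustration only.)
struct PeakAlloc;

unsafe impl GlobalAlloc for PeakAlloc {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        let ptr = unsafe { System.alloc(layout) };
        if !ptr.is_null() {
            let now = CURRENT.fetch_add(layout.size(), Ordering::Relaxed) + layout.size();
            PEAK.fetch_max(now, Ordering::Relaxed);
        }
        ptr
    }

    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        unsafe { System.dealloc(ptr, layout) };
        CURRENT.fetch_sub(layout.size(), Ordering::Relaxed);
    }
}

#[global_allocator]
static GLOBAL: PeakAlloc = PeakAlloc;

/// Run `work` and return its result together with the peak number of heap
/// bytes allocated above the level that was live when `work` started.
fn peak_during<T>(work: impl FnOnce() -> T) -> (T, usize) {
    let baseline = CURRENT.load(Ordering::Relaxed);
    PEAK.store(baseline, Ordering::Relaxed);
    let out = work();
    let peak = PEAK.load(Ordering::Relaxed);
    (out, peak.saturating_sub(baseline))
}

fn main() {
    // Stand-in workload (assumption): allocate a 1 MiB buffer.
    let (buf, peak) = peak_during(|| vec![0u8; 1 << 20]);
    println!("allocated {} bytes, peak heap growth {} bytes", buf.len(), peak);
}
```

Wrapping the parse call, for example `peak_during(|| json5::from_str::<MyStruct>(&input))`, and doing the same for serde_json would give a like-for-like comparison of heap growth per input size.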