Optimize JSON parsing a bit
I saw this thread, so...
Two things here (both sketched below):
- Read the string/IO byte by byte instead of char by char
- When parsing a string from an IO, try to use the peek buffer if there is one
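Here's a minimal sketch of both ideas, assuming a hypothetical helper (the name and structure are made up for illustration; the real lexer also has to deal with escapes and buffer boundaries, and falls back to the slow path when it sees a backslash):
require "io"

def read_string_body(io : IO) : String
  # Fast path: if the IO exposes a peek buffer, look for the closing quote
  # in the bytes that are already buffered and build the String in one shot.
  if (peek = io.peek) && (close = peek.index('"'.ord.to_u8))
    string = String.new(peek[0, close])
    io.skip(close + 1) # consume the string plus the closing quote
    return string
  end

  # Slow path: no peek buffer (or the closing quote isn't buffered yet),
  # so read byte by byte instead of char by char.
  String.build do |str|
    while byte = io.read_byte
      break if byte == '"'.ord
      str.write_byte byte
    end
  end
end
This assumes the opening quote has already been consumed; escape handling is intentionally omitted here.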
I used this Ruby/Crystal file to generate a big JSON:
json = %({"foo": 1, "bar": 2, "hello": "this will have an escape in it\\\"oh well", "banana": true, "array": [1, 2, 3, "a string that is a bit long"], "hash": {"a": 1, "b": 2, "c": 3}})
file = "[" + ([json] * 100000).join(",") + "]"
File.write("json.json", file)
Here's the benchmark:
require "json"
require "benchmark"
file_io = File.open("json.json")
file_string = File.read("json.json")
Benchmark.ips do |x|
  x.report("JSON.parse (string)") do
    JSON.parse(file_string)
  end

  x.report("JSON.parse (IO)") do
    file_io.pos = 0
    JSON.parse(file_io)
  end
end
Before:
JSON.parse (string) 8.41 (118.97ms) (± 0.86%) 117MB/op fastest
JSON.parse (IO) 3.64 (274.72ms) (± 1.99%) 117MB/op 2.31× slower
After:
JSON.parse (string) 11.98 ( 83.50ms) (± 7.70%) 117MB/op fastest
JSON.parse (IO) 9.06 (110.41ms) (± 6.26%) 117MB/op 1.32× slower
I think the difference between before and after will be bigger if there are more strings to parse.
A note for the forum thread: I tried parsing that same big file with Ruby 3.1, and Ruby was (slightly) slower: 16 seconds in Crystal vs. 19 seconds in Ruby. This is on a Mac, so I don't know why it was slower in Crystal for the OP (maybe on Linux it's different?)
Regarding memory: Crystal needs 117MB to load all of that data into memory, but Ruby uses about the same, so I'm not sure how memory can be further optimized...
Please review carefully! I think tests for when the peek buffer is incomplete or unavailable might not exist right now. Feel free to continue working on top of this PR (pushing commits to this branch or creating another PR from this code).
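For what it's worth, a spec along these lines could cover the "closing quote isn't in the peek buffer" case (a hypothetical sketch with assumed names, not an existing test in this PR):
require "json"
require "spec"

it "parses a string that doesn't fit in the peek buffer" do
  long = "a" * 100_000 # longer than a File's default read buffer
  File.write("long_string.json", %(["#{long}"]))
  File.open("long_string.json") do |file|
    # The closing quote can't be in the buffered bytes, so the slow path
    # must kick in and still produce the full string.
    JSON.parse(file)[0].as_s.size.should eq(100_000)
  end
end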
I found another optimization that improves both parsing speed and the memory it allocates.
An earlier PR made parsing slower by always allocating a string to parse integers and floats from, and it also made parsing consume more memory. I was concerned about that at the time for exactly this reason, but I agreed that correctness is more important than performance. However, we can have both! For "small" integers (fewer than 19 digits, and never for floats) we can compute the int value directly and always know that we are doing it right (a rough sketch follows).
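A sketch of that idea, assuming a hypothetical helper (not the actual stdlib code; sign handling omitted):
def parse_small_int?(digits : Bytes) : Int64?
  # Up to 18 decimal digits always fit in an Int64, so the result is exact
  # and no intermediate String needs to be allocated. Longer numbers and
  # floats keep going through the existing string-based path.
  return nil if digits.empty? || digits.size > 18

  value = 0_i64
  digits.each do |byte|
    return nil unless '0'.ord <= byte && byte <= '9'.ord
    value = value * 10 + (byte - '0'.ord)
  end
  value
end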
Before the last commit:
JSON.parse (string) 10.76 ( 92.92ms) (± 8.00%) 125MB/op fastest
JSON.parse (IO) 7.51 (133.21ms) (± 5.99%) 125MB/op 1.43× slower
After the last commit:
JSON.parse (string) 12.09 ( 82.69ms) (± 1.28%) 105MB/op fastest
JSON.parse (IO) 8.06 (124.08ms) (± 2.37%) 105MB/op 1.50× slower
So it's a bit faster and allocates 20MB less, which is about 16% less memory in this case.
But the memory allocated here depends on how many numbers there are and how long they are. For example, using this file:
json = "[1234234, 2982374, 3982734, 49827344, 592834, 65825, 723498, 82348, 9239847324, 1082348, 1123498, 122348, 132348, 142347, 152348, 16234283]"
file = "[" + ([json] * 10000).join(",") + "]"
File.write("json.json", file)
running the benchmark, before this PR:
JSON.parse (string) 95.86 ( 10.43ms) (± 1.44%) 13.9MB/op fastest
JSON.parse (IO) 42.03 ( 23.79ms) (± 3.17%) 13.9MB/op 2.28× slower
after this PR:
JSON.parse (string) 205.35 ( 4.87ms) (± 1.11%) 9.06MB/op fastest
JSON.parse (IO) 76.83 ( 13.02ms) (± 1.18%) 9.06MB/op 2.67× slower
So that's about 5MB less than before, out of roughly 14MB total: about 35% less memory!
This pull request has been mentioned on Crystal Forum. There might be relevant details there:
https://forum.crystal-lang.org/t/performance-issues-with-the-json-parser/6678/17
I think one of the "peek" branches is wrong. I'll see if I can fix it. I don't have a test that reproduces it yet, just one scenario I ran into.
A note for the forum thread: I tried parsing that same big file with Ruby 3.1 and Ruby was (slightly) slower: 16 seconds in Crystal vs. 19 seconds in Ruby. This is on a Mac. So I don't know why it was slower in Crystal for OP
Let's make the JSON data contain some long strings. They are Base64-encoded in my case, but that doesn't matter.
gen-json.rb:
json = %|{"a_base64":#{("a" * 5000).inspect},"b_base64":#{("a" * 10000).inspect}}|
json = "[" + ([json] * 2500).join(",") + "]"
File.write("json.json", json)
benchmark-pk.cr:
require "json"
require "benchmark"
file_io = File.open("json.json")
file_string = File.read("json.json")
Benchmark.bm do |x|
  x.report("JSON.parse (string)") do
    JSON.parse(file_string)
  end
end
benchmark-pk.rb:
require "json"
require "benchmark"
file_io = File.open("json.json")
str = File.read("json.json")
Benchmark.bm(19) do |x|
  x.report("JSON.parse(str)") do
    JSON.parse(str)
  end
end
$ crystal build --release benchmark-pk.cr
$ /usr/bin/time -l ./benchmark-pk
user system total real
JSON.parse (string) 0.174718 0.005743 0.180461 ( 0.180668)
0,20 real 0,18 user 0,02 sys
94814208 maximum resident set size
0 average shared memory size
0 average unshared data size
0 average unshared stack size
5877 page reclaims
0 page faults
0 swaps
0 block input operations
0 block output operations
0 messages sent
0 messages received
0 signals received
0 voluntary context switches
39 involuntary context switches
1868924524 instructions retired
591412605 cycles elapsed
93389632 peak memory footprint
That's before your changes, on Apple aarch64/arm64.
$ /usr/bin/time -l ruby ./benchmark-pk.rb
user system total real
JSON.parse(str) 0.063432 0.003553 0.066985 ( 0.067092)
0,12 real 0,10 user 0,01 sys
90554368 maximum resident set size
0 average shared memory size
0 average unshared data size
0 average unshared stack size
5642 page reclaims
0 page faults
0 swaps
0 block input operations
0 block output operations
0 messages sent
0 messages received
0 signals received
0 voluntary context switches
43 involuntary context switches
1390183537 instructions retired
367126927 cycles elapsed
86574272 peak memory footprint
That's using Ruby 3.3.0. The Ruby version is faster by a factor of 0.181/0.067 = 2.70.
This pull request has been mentioned on Crystal Forum. There might be relevant details there:
https://forum.crystal-lang.org/t/performance-issues-with-the-json-parser/6678/20
Starting with Ruby 3.3 you can enable YJIT at runtime. It would be nice to see the w/wo YJIT times for this.
I now put the following Ruby snippet at the top of Ruby files to automatically enable YJIT when using Ruby >=3.3.
# Enable YJIT if using CRuby >= 3.3
RubyVM::YJIT.enable if RUBY_ENGINE == 'ruby' and RUBY_VERSION.to_f >= 3.3
Starting with Ruby 3.3 you can enable YJIT at runtime. It would be nice to see the w/wo YJIT times for this.
I get the same times with or without YJIT.
And just for the record, here's my benchmark in Ruby with Oj:
require "benchmark"
require "json"
require "oj"
file_io = File.open("json.json")
str = File.read("json.json")
Benchmark.bm(19) do |x|
  x.report("JSON.parse(str)") do
    JSON.parse(str)
  end

  x.report("Oj.load(str)") do
    Oj.load(str)
  end
end
user system total real
JSON.parse(str) 0.063371 0.003690 0.067061 ( 0.067182)
Oj.load(str) 0.028192 0.003248 0.031440 ( 0.031523)
That is, Oj is faster than the Crystal version by a factor of 0.181/0.032 = 5.66.
There are more improvements to be made here. I'd like to send them one by one in small PRs; if I put them all together here, the chances of this being merged are very small.
That said, I don't think we'll reach Oj's level of optimization. Its C code is pretty hand-crafted. We could try to do the same, but that file is copyrighted...
This pull request has been mentioned on Crystal Forum. There might be relevant details there:
https://forum.crystal-lang.org/t/performance-issues-with-the-json-parser/6678/25
This pull request has been mentioned on Crystal Forum. There might be relevant details there:
https://forum.crystal-lang.org/t/stringpool-make-the-hash-key-part-of-the-public-api/6766/5