truffleruby
Data read with Zlib::GzipReader (but not copy of same data) slow to parse with JSON.parse
(first issue opened here; if I'm off the mark or in the wrong spot, my apologies and please let me know)
JSON-parsing data read from a gzipped file with Zlib::GzipReader is very slow. Parsing the same byte-for-byte data read from a non-gzipped file is not slow. Parsing a copy of the data read from a gzipped file is also not slow.
This gist gives a small program that creates a bunch of arrays-of-arrays and dumps JSON output into two files, one plain and one gzipped. It then reads those data back in via File.read or Zlib::GzipReader, additionally making a copy of the formerly-gzipped data.
Note that I'm not parsing every time -- I'm just getting a copy of the data (as seen in this snippet from the gist):
plain_data = File.read(PLAIN)
previously_gzipped_data = Zlib::GzipReader.open(GZIP) do |f|
f.read
end
forced_copy = previously_gzipped_data + " "
Oj doesn't show this same issue. I know benchmarking this stuff is treacherous, but the pattern seems pretty consistent.
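For reference, the shape of the reproduction can be sketched entirely in memory (a stdlib-only sketch; the gist linked above writes real files to disk, which is closer to the original scenario):

```ruby
require 'json'
require 'zlib'
require 'stringio'

# Build a payload like the gist's: an array of 20-element integer arrays.
payload = JSON.generate(Array.new(100) { Array.new(20) { rand(1000) } })

# Gzip it into an in-memory buffer; finish flushes the gzip trailer
# without closing the underlying StringIO.
gz_buffer = StringIO.new.binmode
gz = Zlib::GzipWriter.new(gz_buffer)
gz.write(payload)
gz.finish
compressed = gz_buffer.string

# Read it back through Zlib::GzipReader, and also force a copy of it.
previously_gzipped_data = Zlib::GzipReader.new(StringIO.new(compressed)).read
forced_copy = previously_gzipped_data + " "

# Same bytes, same parse result -- only the timing differs on TruffleRuby.
p JSON.parse(previously_gzipped_data) == JSON.parse(forced_copy)  # => true
```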
truffleruby 20.3.0, like ruby 2.6.6, GraalVM CE JVM [x86_64-darwin]
Testing with an array of 1000 20-element arrays
BEGIN STDLIB JSON
user system total real
Plain 6.761722 0.218166 6.979888 ( 1.597190)
Previously gzipped 15.290809 1.759818 17.050627 ( 5.812477)
Gzipped/forced copy 0.112019 0.052004 0.164023 ( 0.069019)
Plain 0.060666 0.040497 0.101163 ( 0.061761)
Previously gzipped 4.385712 1.001380 5.387092 ( 4.677956)
Gzipped/forced copy 0.135513 0.002502 0.138015 ( 0.062301)
Plain 0.126698 0.004074 0.130772 ( 0.067788)
Previously gzipped 4.375435 0.766456 5.141891 ( 4.427382)
Gzipped/forced copy 0.055923 0.000771 0.056694 ( 0.057266)
BEGIN Oj
user system total real
Plain 8.310956 0.346334 8.657290 ( 1.903706)
Previously gzipped 0.849619 0.016991 0.866610 ( 0.225177)
Gzipped/forced copy 0.753260 0.026104 0.779364 ( 0.228311)
Plain 0.540689 0.011607 0.552296 ( 0.206072)
Previously gzipped 0.417603 0.008346 0.425949 ( 0.195774)
Gzipped/forced copy 0.396726 0.013893 0.410619 ( 0.206867)
Plain 0.400436 0.010877 0.411313 ( 0.218068)
Previously gzipped 0.421178 0.009404 0.430582 ( 0.204174)
Gzipped/forced copy 0.34264
My original data was just a file on disk as gzipped with gzip, so I don't think this has anything to do with Zlib::GzipWriter -- that's just to make it a self-contained bug reproduction. I'm assuming that GzipReader is hanging on to the string in some way that makes things nasty?
-Bill-
Can you try putting Truffle::Ropes.flatten_rope(data) on your data after you get it from gzip and before you use it?
Adding previously_gzipped_data = Truffle::Ropes.flatten_rope(previously_gzipped_data) does the trick (although it doesn't help me for the moment since the calls are locked up in a gem). Hopefully it'll help point you in the right direction.
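When the parse calls are buried inside a gem, one possible stopgap is to flatten the string right after reading it, guarded so it is a no-op everywhere except TruffleRuby (a sketch; Truffle::Ropes is a private, TruffleRuby-only API and may change):

```ruby
# Returns a flattened copy of str on TruffleRuby, or str unchanged
# on other Ruby implementations where Truffle::Ropes doesn't exist.
def flatten_if_truffle(str)
  if defined?(Truffle::Ropes) && Truffle::Ropes.respond_to?(:flatten_rope)
    Truffle::Ropes.flatten_rope(str)
  else
    str
  end
end

data = flatten_if_truffle("some string read from Zlib::GzipReader")
```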
Thank you for the great report. It looks like pure-Ruby JSON parsing is slower with the String from Zlib::GzipReader.
That Truffle::Ropes.flatten_rope() helps seems a good hint that this is an issue with Ropes.
Of course, that's just helpful for debugging; it should work fine without any explicit call (Truffle::Ropes.flatten_rope is a private debug-like API, just like everything else under Truffle::).
Maybe the String is composed of many ConcatRopes, and the read accesses in JSON don't force materialization of the byte[]?
Oj likely uses RSTRING_PTR(), which forces the Rope to be flattened.
@norswap Could you investigate?
@billdueber To make sure the pure-Ruby version of json, from stdlib, is used, could you show gem list | grep json on TruffleRuby? (it should be json (default: 2.3.0))
If the json gem is installed, then it will be used as a C extension and preferred over the pure-Ruby version, which is unfortunately rather confusing, but that's the semantics of default gems.
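One quick way to check which implementation is active from inside Ruby (a sketch; the exact class names reported depend on the json version installed):

```ruby
require 'json'

# JSON.parser returns the parser class in use: with the C extension
# loaded it reports JSON::Ext::Parser; the pure-Ruby fallback (where
# present) reports a pure-Ruby parser class instead.
puts JSON.parser
```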
It's simpler than that, actually:
p Truffle::CExt.native_string?(previously_gzipped_data) # => true
forced_copy = previously_gzipped_data + " "
p Truffle::CExt.native_string?(forced_copy) # => false
json = JSON.parse(previously_gzipped_data)
p Truffle::CExt.native_string?(previously_gzipped_data) # => true
So the issue is that when we use previously_gzipped_data, we use a NativeRope instead of a ManagedRope.
I would expect the String to migrate from NativeRope to ManagedRope only once, but that doesn't seem to be the case, since it's still native even after JSON.parse(previously_gzipped_data).
Most likely involved is this force_encoding in JSON:
https://github.com/oracle/truffleruby/blob/e020c89a8368ef2e1b996904f7becf65b375a47e/lib/json/lib/json/pure/parser.rb#L140
previously_gzipped_data.encoding is UTF-8 (maybe a bit surprising? But MRI behaves the same).
And force_encoding actually returns a NativeRope when the source is a NativeRope.
The .encode above the force_encoding also returns a different String, so the original String is never modified and stays native.
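The encode/force_encoding distinction matters here, and can be seen on any Ruby (a small sketch):

```ruby
src = "[1,2,3]"

# String#encode returns a new String, even when no transcoding is
# actually needed, so the original object is left untouched.
encoded = src.encode(Encoding::UTF_8)
p encoded.equal?(src)   # => false

# String#force_encoding relabels the receiver in place and returns self,
# so the same object (here, the same Rope underneath) keeps flowing on.
forced = src.force_encoding(Encoding::UTF_8)
p forced.equal?(src)    # => true
```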
Then this String is passed to StringScanner, and StringScanner#scan is used which will run Regexps against it.
Regexps use a RopeNodes.BytesNode, which for a NativeRope will allocate a byte[] and copy from native memory every time; that seems very likely to be the performance issue.
Maybe BytesNode should take a String and not a Rope, so it could transition the String from native to managed just once and store the ManagedRope back in the String.
Also, BytesNode seems a good indication of a read-heavy workload coming, so it seems a good place to do that.
JSON gem (almost) just as you expected -- I apparently have 2.1.0
✦ ❯ gem list | grep json
json (default: 2.1.0)
Right, that's expected for TruffleRuby 20.3.0, thanks for confirming.
Just pinging on this to see if it's on anyone's radar. I've updated the benchmarking code to explicitly assert that calling Truffle::Debug.flatten_string (née Truffle::Ropes.flatten_rope) addresses the issue (it does, as expected).
I removed Oj because it makes the "x slower" comparison hard to read, but it's about 10x faster than JSON and doesn't suffer from the gzip slowdown.
tl;dr is that on my hardware gzdata is about 30x slower on the smaller (array-of-100-arrays) json string, and about 400x slower on the larger json string.
truffleruby 22.3.0, like ruby 3.0.3, GraalVM CE Native [x86_64-darwin]
Base unit is an array of 20 integers
JSON-decode a string encoding an array of 100/1000 of those base units.
Warming up --------------------------------------
JSON 100 plain 19.000 i/100ms
JSON 100 gzdata 1.000 i/100ms
JSON 1000 plain 2.000 i/100ms
JSON 1000 gzdata 1.000 i/100ms
Calculating -------------------------------------
JSON 100 plain 722.230 (±15.2%) i/s - 7.030k in 10.011312s
JSON 100 gzdata 26.395 (±34.1%) i/s - 202.000 in 10.023596s
JSON 1000 plain 73.287 (±13.6%) i/s - 716.000 in 10.003276s
JSON 1000 gzdata 0.166 (± 0.0%) i/s - 2.000 in 12.601048s
Comparison:
JSON 100 plain: 722.2 i/s
JSON 1000 plain: 73.3 i/s - 9.85x (± 0.00) slower
JSON 100 gzdata: 26.4 i/s - 27.36x (± 0.00) slower
JSON 1000 gzdata: 0.2 i/s - 4355.93x (± 0.00) slower
Thank you, we should indeed look at this again, especially since the migration to TruffleString which should help with this. Could you share your new benchmark file? In the gist it seems to be the older benchmark (e.g. using oj and not using benchmark/ips), but the newer benchmark seems much simpler and uses benchmark/ips so I'd gladly use that to investigate. The older benchmark file might still be needed to generate the files though.
I threw up another gist with a less-organically-written benchmark script that strictly compares one set of plain data vs one set of gzipped data. https://gist.github.com/billdueber/87b4f100c4a2d5d1c756470e09615857
Sample output:
✦ ❯ ruby slow_gzip_bench.rb 1000 10 10
truffleruby 22.3.0, like ruby 3.0.3, GraalVM CE Native [x86_64-darwin]
Base unit is an array of 20 integers
Rows: 1000; Warmup seconds: 10; run test for 10 seconds
Warming up --------------------------------------
plain 7.000 i/100ms
gzdata 1.000 i/100ms
Calculating -------------------------------------
plain 92.525 (±10.8%) i/s - 924.000 in 10.134733s
gzdata 0.176 (± 0.0%) i/s - 2.000 in 11.384485s
Comparison:
plain: 92.5 i/s
gzdata: 0.2 i/s - 526.66x (± 0.00) slower
Thank you, I can reproduce it with truffleruby-dev.
And adding:
if IS_TRUFFLE
FILEDATA["gzdata"] = Truffle::Debug.flatten_string(FILEDATA["gzdata"])
end
makes them both the same speed.