Make `JSON.generate` 1.75x as fast

Open mame opened this issue 1 year ago • 3 comments

This PR speeds up JSON.generate by approximately 1.75x (485k iterations per second -> 840k iterations per second) on Oj's benchmark. This makes JSON.generate nearly as fast as Oj.dump.

Before:

$ ruby --yjit -Ilib -I ext oj-bench.rb
** Oj version 3.16.3 **
ruby 3.4.0dev (2023-12-27T05:30:20Z master 862cfcaf75) +YJIT [x86_64-linux]
Warming up --------------------------------------
             Oj.dump    83.089k i/100ms
    Oj.dump [compat]    72.410k i/100ms
     Oj.dump [rails]    57.698k i/100ms
       JSON.generate    49.706k i/100ms
Calculating -------------------------------------
             Oj.dump    836.635k (± 0.4%) i/s -     12.630M in  15.095866s
    Oj.dump [compat]    718.031k (± 0.2%) i/s -     10.789M in  15.026031s
     Oj.dump [rails]    573.882k (± 0.3%) i/s -      8.655M in  15.081085s
       JSON.generate    484.650k (± 1.0%) i/s -      7.307M in  15.078021s

After:

$ ruby --yjit -Ilib -I ext oj-bench.rb
** Oj version 3.16.3 **
ruby 3.4.0dev (2023-12-27T05:30:20Z master 862cfcaf75) +YJIT [x86_64-linux]
Warming up --------------------------------------
             Oj.dump    83.349k i/100ms
    Oj.dump [compat]    71.605k i/100ms
     Oj.dump [rails]    57.058k i/100ms
       JSON.generate    84.498k i/100ms
Calculating -------------------------------------
             Oj.dump    837.304k (± 0.3%) i/s -     12.586M in  15.031372s
    Oj.dump [compat]    718.657k (± 0.5%) i/s -     10.812M in  15.045573s
     Oj.dump [rails]    565.824k (± 0.3%) i/s -      8.502M in  15.025348s
       JSON.generate    839.614k (± 0.3%) i/s -     12.675M in  15.096044s
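
For a rough local check of JSON.generate throughput without extra gems, something like the following could be used. This is only a sketch and not the oj-bench.rb script, which is not reproduced in this thread; the payload and the helper name `iterations_per_second` are hypothetical, and the numbers it prints are not comparable to the figures above.

```ruby
require "json"

# Hypothetical payload; the data used by the real oj-bench.rb is not shown here.
PAYLOAD = {
  "id" => 1,
  "name" => "example",
  "tags" => ["a", "b", "c"],
  "nested" => { "ok" => true, "values" => [1, 2.5, nil] }
}.freeze

# Calls the given block repeatedly for roughly `seconds` and returns the
# measured number of iterations per second (i/s).
def iterations_per_second(seconds: 1.0)
  count = 0
  start = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  deadline = start + seconds
  while Process.clock_gettime(Process::CLOCK_MONOTONIC) < deadline
    yield
    count += 1
  end
  elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - start
  count / elapsed
end

if __FILE__ == $PROGRAM_NAME
  ips = iterations_per_second { JSON.generate(PAYLOAD) }
  puts format("JSON.generate: %.1fk i/s", ips / 1000.0)
end
```

Running it once on the base branch and once on this branch, with the same flags as above (`ruby --yjit -Ilib -I ext`), gives a coarse before/after comparison. Unlike the benchmark output above, it has no warm-up phase and reports no error margin.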

This PR consists of the following improvements:

  • Drop prebuild of array_delim, etc.

    • array_delim is usually a single comma character. Using memcpy to copy a single character was inefficient.
    • It is much faster to output a comma and (optional) array_nl separately without prebuild.
    • This improved the speed by about 24%, from 480k i/s to 593k i/s.
  • Use faster Ruby API for encoding checks.

    • This improved the speed by 12%, from 593k i/s to 665k i/s.
  • Use a fast path when string escaping is not needed.

    • This improved the speed by about 16%, from 665k i/s to 770k i/s.
  • Use faster Ruby API for dispatching the class of objects.

    • This improved performance by 5%, from 770k i/s to 806k i/s.
  • Use generate_json_string for object keys.

    • Since object keys are already verified to be Strings, sending them through the general dispatch in generate_json was unnecessary overhead.
    • This improved the performance by 3%, from 806k i/s to 830k i/s.
  • Use faster Ruby API for reading array elements.

    • This improved the performance by about 4%, from 830k i/s to 854k i/s.
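
The string fast path from the list above can be illustrated in Ruby. The PR itself changes the C extension, so this is only a sketch of the control flow, assuming the input is already valid UTF-8. The names `json_string`, `escape_slow`, `NEEDS_ESCAPE`, and `ESCAPES` are illustrative and are not identifiers from the json gem.

```ruby
require "json"

# Characters that must be escaped inside a JSON string (RFC 8259):
# the quotation mark, the backslash, and control characters U+0000..U+001F.
NEEDS_ESCAPE = /["\\\x00-\x1f]/

# Short escape sequences; other control characters fall back to \uXXXX.
ESCAPES = {
  '"' => '\\"', "\\" => "\\\\", "\b" => "\\b", "\f" => "\\f",
  "\n" => "\\n", "\r" => "\\r", "\t" => "\\t"
}.freeze

# Slow path: rewrite every character that needs escaping.
def escape_slow(str)
  str.gsub(NEEDS_ESCAPE) { |c| ESCAPES.fetch(c) { format("\\u%04x", c.ord) } }
end

# Fast path: one scan decides whether any escaping is needed at all.
# If not, the string is emitted between quotes without further processing.
def json_string(str)
  return "\"#{str}\"" unless str.match?(NEEDS_ESCAPE)
  "\"#{escape_slow(str)}\""
end
```

Most object keys and many values in typical payloads contain nothing that needs escaping, so the cheap scan lets them skip the per-character work entirely. The same reasoning applies to the object-key change: once a value is known to be a String, it can go straight to the string routine instead of through the general type dispatch.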

I combined them into one PR because I expected that splitting them into multiple PRs would cause many conflicts between them. However, if you would prefer separate PRs, feel free to let me know.

mame avatar Dec 27 '23 09:12 mame

Note: I got oj-bench.rb from this article.

mame avatar Dec 27 '23 09:12 mame

I will re-run https://github.com/flori/json/actions/runs/7336797514/job/19976695584?pr=562 once ruby/setup-ruby supports Ruby 3.3.

hsbt avatar Dec 27 '23 09:12 hsbt

I'd love to see this merged. @hsbt, could you take another look now that Ruby 3.3 is properly released?

Earlopain avatar Feb 26 '24 13:02 Earlopain