graphql-ruby

Possible performance regression in 2.5.x

Open Nuzair46 opened this issue 6 months ago • 4 comments

I was upgrading the gem from 2.0.26 to the latest version. I fixed deprecations up through 2.4.16, but when I try 2.5.x, some of the parallel jobs in my CI take too long to finish and exceed the timeout. There are no failing tests. I'm wondering whether there is a performance regression.

In 2.5.5:

2152 examples, 0 failures

Took 1171 seconds (19:31)
Tests Failed

Exited with code exit status 1

Another re-run gave me this in 2.5.7:

Too long with no output (exceeded 10m0s): context deadline exceeded

This could very well be an issue with my setup, but I'm not able to identify a root cause, as everything works up to 2.4.16.

I set validate_timeout nil in my schemas so that validation never times out, which matches the app's previous behavior.

Any insight on what could be happening here?

Nuzair46, May 30 '25 05:05

I think I was able to pin it down to the schema dump. One particular schema has 68,000 lines in its JSON dump. While testing this, I think the dump is slower on 2.5.x than on 2.4.x, which causes the timeout.

Nuzair46, May 30 '25 06:05

Hey, I'd love to get this regression fixed in 2.5.x. Could you help me track it down?

The best thing would be some outputs from Ruby profilers. There are lots, and output from any of them would be helpful, so feel free to use your favorites. If you're looking for suggestions, my favorites are Stackprof (https://graphql-ruby.org/testing/profiling.html#stackprof) and MemoryProfiler (https://graphql-ruby.org/testing/profiling.html#memoryprofiler). The examples in those docs use MySchema.execute(...) inside the profiler's do ... end block, but in this case, you'd put your JSON-ification code in there (maybe MySchema.to_json, or something else -- whatever was slow in your test suite!).

Getting profiles of the JSON dump on 2.5.x would be necessary to hunt down this issue. If you're also able to share the result on 2.4.x, that would be really helpful too, but not essential.

If you're not able to get profile output, the next best thing would be to get an SDL printout of your schema. If the schema isn't public, you could anonymize it using the script here: https://gist.github.com/myronmarston/4feeeaa5472096d3c1a09eea8d983046

Let me know what you think!

rmosolgo, May 30 '25 15:05

Hey @rmosolgo thanks for getting back.

I won't be able to provide any profiling data because I'm working on a closed-source project. But I will attach the anonymised schema SDL:

anon_guest_schema.txt

The number of lines is lower than before, likely because the actual schema contains descriptions.

Nuzair46, Jun 04 '25 09:06

Thanks for sharing that, @Nuzair46. I wrote a small benchmark script and ran it on 2.4.16 and 2.5.7, but it didn't show any difference in speed:

Benchmark script and results

require "bundler/inline"

gemfile do
  gem "graphql", ENV["GRAPHQL_VERSION"]
  gem "benchmark-ips"
end

puts "GraphQL-Ruby v#{GraphQL::VERSION} (#{ENV["GRAPHQL_VERSION"]})"
schema = GraphQL::Schema.from_definition(File.read("non_guest_schema.txt"))


Benchmark.ips do |x|
  x.report("Dump") { schema.to_definition }
end

2.4.16:

~/code/graphql-ruby $ GRAPHQL_VERSION=2.4.16 ruby schema_dump_bench.rb
GraphQL-Ruby v2.4.16 (2.4.16)
ruby 3.4.1 (2024-12-25 revision 48d4efcb85) +PRISM [x86_64-darwin22]
Warming up --------------------------------------
                Dump     3.000 i/100ms
Calculating -------------------------------------
                Dump     35.074 (± 8.6%) i/s   (28.51 ms/i) -    174.000 in   5.016008s

2.5.7:

~/code/graphql-ruby $ GRAPHQL_VERSION=2.5.7 ruby schema_dump_bench.rb
GraphQL-Ruby v2.5.7 (2.5.7)
ruby 3.4.1 (2024-12-25 revision 48d4efcb85) +PRISM [x86_64-darwin22]
Warming up --------------------------------------
                Dump     3.000 i/100ms
Calculating -------------------------------------
                Dump     35.440 (± 8.5%) i/s   (28.22 ms/i) -    177.000 in   5.039998s

However, attempting to parse that file did raise some errors, like this one:

 Type64 (#<Class:0x0000000126789750>) cannot be implemented since it's not a GraphQL Interface. Use `include` for plain Ruby modules. (RuntimeError)

In that file, Type64 is used as an interface:

type Type89 implements Type64 & Type166 {

but defined as an object, not an interface:

type Type64 {

The same thing is true for Type27, Type115, Type126, Type139, Type376, Type166, Type87, Type322, Type200.
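One rough way to list such types in an SDL file is sketched below. It is a plain regex scan, not a real GraphQL parser, so it can be fooled by descriptions or unusual formatting. The helper name `object_types_used_as_interfaces` and the sample SDL (`Node`, `Timestamped`, `Post`) are made up for illustration and are not part of GraphQL-Ruby.

```ruby
# Hypothetical helper: return the names that appear in an `implements` clause
# but are declared with `type` instead of `interface`.
# This is a regex scan over the SDL text, not a parser.
def object_types_used_as_interfaces(sdl)
  # Names declared as object types, e.g. `type Timestamped {`
  object_names = sdl.scan(/^\s*type\s+(\w+)/).flatten

  # Names listed after `implements`, separated by `&` (or `,` in older SDL)
  implemented = sdl.scan(/\bimplements\s+([^{@]+)/).flat_map do |(clause)|
    clause.split(/[&,\s]+/).reject(&:empty?)
  end

  object_names & implemented
end

# Made-up sample: `Timestamped` is declared as an object but used as an interface.
sdl = <<~GRAPHQL
  interface Node {
    id: ID
  }

  type Timestamped {
    createdAt: String
  }

  type Post implements Timestamped & Node {
    id: ID
    createdAt: String
  }
GRAPHQL

p object_types_used_as_interfaces(sdl) # prints ["Timestamped"]
```

Against a real file, the call would be something like `object_types_used_as_interfaces(File.read("anon_guest_schema.txt"))`.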

(Actually, the error messages I got were not helpful enough to debug this problem, so I'm improving them in #5372.)

So, I can't track down this performance issue from the SDL alone... but:

  • You could try addressing those object types used as interfaces. That might be causing some unexpected scenarios in schema dump. (It's also invalid GraphQL ... sorry GraphQL-Ruby didn't catch it before!)
  • If you can produce profiling outputs and then filter them to include only lines in the GraphQL gem, that would still be really helpful. You could use MemoryProfiler's allow_files option and filter Stackprof output with stackprof my-profile.dump --method "GraphQL" to show only GraphQL-related calls. (If your app uses any constants named GraphQL, you might have to remove those by hand.)
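As a rough sketch of the second option, the two profilers could be wrapped around the dump as shown here. The method names `stackprof_dump` and `memory_profile_dump`, the `tmp/` output paths, and `MySchema.to_json` as the slow call are all placeholders. The `stackprof` and `memory_profiler` gems need to be installed, and the `tmp/` directory must already exist.

```ruby
# Sketch only: method names, output paths, and MySchema are placeholders.
# The gems are required lazily so this file loads even where they are absent.

# Wall-clock profile of the block, written to `out` for the stackprof CLI.
def stackprof_dump(out: "tmp/schema-dump-stackprof.dump", &block)
  require "stackprof"
  StackProf.run(mode: :wall, raw: true, out: out, &block)
end

# Allocation profile limited to files whose path matches "graphql".
# Note: this also matches app files under a path such as app/graphql.
def memory_profile_dump(out: "tmp/schema-dump-memory.txt", &block)
  require "memory_profiler"
  report = MemoryProfiler.report(allow_files: "graphql", &block)
  report.pretty_print(to_file: out)
end

# Usage inside the app, where MySchema is defined:
#   stackprof_dump { MySchema.to_json }
#   memory_profile_dump { MySchema.to_json }
```

The Stackprof file could then be filtered from the shell with stackprof tmp/schema-dump-stackprof.dump --method "GraphQL", as described above.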

If you're able to pursue either of those options, please let me know what you find!

rmosolgo, Jun 04 '25 12:06

👋 Just checking in to see whether you're able to provide either of those profiling outputs. I'd love to make this work right!

rmosolgo, Jun 17 '25 10:06

Hey, I'd love to keep hunting this down but I don't have enough information yet. If anyone else runs into this, please open a new issue with some of the debug information described above!

rmosolgo, Jul 09 '25 13:07