graphql-ruby
Slow Performance With 1K Objects/15 Fields
Describe the bug
I have a GraphQL query that fetches "Connection" objects and about 15 fields on each. It looks like field resolution takes 4-8s for ~1000 objects:
Here's the full flamegraph: graphql-1k-flamegraph.html.zip
Is this expected? If I only fetch one field instead of 15, it takes about 1s. We don't have any N+1 queries and most of the fields are being loaded directly from our database with no additional processing. Thanks for your help.
Versions
- graphql version: 1.13.8
- rails (or other framework): 6.1.4.3
- other applicable versions (`graphql-batch`, etc.):
GraphQL schema
```ruby
module Types
  class ConnectionType < Types::BaseObject
    field :id, GraphQL::Types::Int, null: false
    field :name, GraphQL::Types::String
    # about 40 more...
  end
end
```
GraphQL query
All the fields here are strings, booleans, IDs, or integers
```graphql
query connections($userId: ID) {
  connectionsV2(userId: $userId) {
    connectionId
    connectionSha
    createdAt
    database
    databaseSha
    description
    firebaseId
    folderId
    id
    lastOpenedAt
    lastOpenedByMe
    lastRanByMe
    lastRunAt
    name
    othersCanEdit
    othersCanView
    passwordProtected
    public
    ranByMeCount
    role
    runCount
    statement
    updatedAt
    userId
    warehouse
  }
}
```

```json
{
  "data": {
    "connectionsV2": […]
  }
}
```
Steps to reproduce
- Create ~1000 ActiveRecord objects with at least 15 properties
- Execute a GraphQL query for those objects and properties
- Observe that field resolution takes 4-8s
Expected behavior
I'm not sure if this is a fair comparison, but it takes less than a second to serialize the same 1k objects to JSON, so I would expect GraphQL to be able to resolve these fields within 1-2s.
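As a rough sketch of that baseline (with hypothetical stand-in data, since the real Connection records aren't shown here), plain JSON serialization of 1,000 hashes with 15 fields is indeed sub-second:

```ruby
require "json"
require "benchmark"

# Hypothetical stand-in for the Connection records: field names are
# borrowed from the query above, values are made up.
records = Array.new(1000) do |i|
  {
    "id" => i, "name" => "connection-#{i}", "database" => "analytics",
    "description" => "example", "role" => "viewer", "statement" => "SELECT 1",
    "runCount" => i, "ranByMeCount" => i, "public" => true,
    "passwordProtected" => false, "othersCanEdit" => true,
    "othersCanView" => false, "createdAt" => "2022-01-01T00:00:00Z",
    "updatedAt" => "2022-01-01T00:00:00Z", "warehouse" => "snowflake"
  }
end

json = nil
elapsed = Benchmark.realtime { json = JSON.generate(records) }
puts format("Serialized %d records in %.4fs", records.size, elapsed)
```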
Actual behavior
It takes > 2s
Additional context
Here's the full flamegraph: graphql-1k-flamegraph.html.zip
And another version with 100 objects since that one might be difficult to load or read: graphql-100-flamegraph.html.zip
Hi, thanks for the detailed report! What a doozy. I'll take a close look at the flamegraph soon -- thanks for sending it.
In the meantime, I see that AppSignalTracing is installed. Are you able to generate a flamegraph or profile without it (e.g., by commenting it out in development)? I'm just curious whether that's part of it, and someone just opened a related issue: #3945
Sure thing, here are a couple of flamegraphs without AppSignal:
Thanks for sharing that. Looking over it more closely, it's not obvious to me that GraphQL-Ruby is the main bottleneck here. In general, I see lots of samples that point at application code (shown by the non-green "peaks" on the flamegraph). For example:
- `Query` methods (from `app/models/query.rb`)
- `ActiveSupport::TimeWithZone#xmlschema`
- `RoleModel::Implementation` methods
- `ActiveRecord::AttributeMethods#respond_to?`
So, speeding up those methods (or batching database loads, simplifying implementations, etc) might yield good results.
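As one sketch of that kind of app-level fix (hypothetical code, not from your app): if `TimeWithZone#xmlschema` is hot because the same timestamps get formatted repeatedly, formatting each distinct time once and reusing the string can shave that cost. Here plain `Time#iso8601` (an alias of `#xmlschema`) stands in:

```ruby
require "time"
require "benchmark"

# ~1000 objects x 15 timestamp accesses, sharing a few distinct values
distinct = Array.new(10) { |i| Time.utc(2022, 1, 1 + i) }
accesses = Array.new(15_000) { |i| distinct[i % distinct.size] }

# Naive: re-format on every access (what repeated #xmlschema calls do)
naive = Benchmark.realtime { accesses.each { |t| t.iso8601 } }

# Cached: format each distinct Time once, then reuse the string
cache = Hash.new { |h, t| h[t] = t.iso8601 }
cached = Benchmark.realtime { accesses.each { |t| cache[t] } }

puts format("naive: %.4fs, cached: %.4fs", naive, cached)
```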
For comparison, I sketched out a minimal schema that would run the query above:
Minimal script
```ruby
require "bundler/inline"

gemfile do
  gem "graphql", "1.13.8"
end

require "ostruct"
require "benchmark"

class ExampleSchema < GraphQL::Schema
  class Connection < GraphQL::Schema::Object
    field :connection_id, ID
    field :connection_sha, String
    field :created_at, String
    field :database, String
    field :database_sha, String
    field :description, String
    field :firebase_id, ID
    field :folder_id, ID
    field :id, ID
    field :last_opened_at, String
    field :last_opened_by_me, String
    field :last_ran_by_me, String
    field :last_run_at, String
    field :name, String
    field :others_can_edit, Boolean
    field :others_can_view, Boolean
    field :password_protected, Boolean
    field :public, Boolean
    field :ran_by_me_count, Integer
    field :role, String
    field :run_count, Integer
    field :statement, String
    field :updated_at, String
    field :user_id, ID
    field :warehouse, String
  end

  class Query < GraphQL::Schema::Object
    field :connectionsV2, [Connection], connection: false do
      argument :user_id, ID, required: false
    end
  end

  query(Query)
end

query = <<-GRAPHQL
  query connections($userId: ID) {
    connectionsV2(userId: $userId) {
      connectionId
      connectionSha
      createdAt
      database
      databaseSha
      description
      firebaseId
      folderId
      id
      lastOpenedAt
      lastOpenedByMe
      lastRanByMe
      lastRunAt
      name
      othersCanEdit
      othersCanView
      passwordProtected
      public
      ranByMeCount
      role
      runCount
      statement
      updatedAt
      userId
      warehouse
    }
  }
GRAPHQL

data = 1000.times.map { {} }
result = nil
Benchmark.bm { |x|
  x.report {
    result = ExampleSchema.execute(query, root_value: { connectionsV2: data })
  }
}
# pp result
puts "Result size: #{result["data"]["connectionsV2"].size}"
```
GraphQL-Ruby 1.13.8 takes 0.25s to run that query on my machine:
```
$ ruby perf_test.rb
       user     system      total        real
   0.228234   0.001495   0.229729 (  0.230060)
Result size: 1000
```
That said, improvements to the gem might still help with what's in the stack trace you shared.
For example, GraphQL-Ruby calls `respond_to?(...)` every time it resolves a field, to check whether it should call a method on the object. In the flamegraph you posted above, that check turns out to be pretty expensive, judging by the number of nested calls that show up there. In theory, GraphQL-Ruby could "JIT" this and remember how it resolved a field the last time it ran. But that would be a pretty big change, and since schemas are shared between queries, it'd be hard to implement in a non-race-y way.
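To illustrate the per-field cost of that probe (a standalone sketch, not GraphQL-Ruby's actual resolution code):

```ruby
require "benchmark"

class Record
  def name
    "example"
  end
end

records = Array.new(100_000) { Record.new }

# Roughly what happens per field today: probe with respond_to?, then call
probe = Benchmark.realtime do
  records.each { |r| r.name if r.respond_to?(:name) }
end

# The hypothetical "remembered" strategy: decide once, then call directly
direct = Benchmark.realtime do
  records.each(&:name)
end

puts format("respond_to? + call: %.4fs, direct call: %.4fs", probe, direct)
```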
Unfortunately, I'm not able to glean any actionable items from that flamegraph :confounded: If you want to keep investigating, you could try gathering a `raw: true` dump from StackProf (https://github.com/tmm1/stackprof). One advantage of StackProf is that it quantifies how many samples have each method on the top of their trace -- it might clarify exactly what's doing work vs. what's waiting. (The large green part of the attached flamegraph is GraphQL -- but most of the methods in that backtrace are just sitting there, waiting for method calls to return.) Additionally, it would provide an easier-to-grok table of exactly which parts of GraphQL-Ruby overhead are taking the longest in this case. (Scanning the flamegraph, I don't see anything out of order -- just a lot of methods I've already tried to make as fast as I can! But maybe something new would show up from that perspective.)
A memory profile with `memory_profiler` might also yield useful feedback. Although memory use isn't a 1:1 proxy for runtime performance, object allocation counts sometimes point to poor performance at the code's hotspots.
I don't have plans for any more work specifically from this issue, so I'll close it. But I have been doing a lot of performance work lately, so I hope you'll find better performance on 2.0.18 and 2.0.19 (and future versions)!