graphql-ruby icon indicating copy to clipboard operation
graphql-ruby copied to clipboard

Slow Performance With 1K Objects/15 Fields

Open choxi opened this issue 2 years ago • 3 comments

Describe the bug

I have a GraphQL query that fetches "Connection" objects and about 15 fields on each. It looks like field resolution takes 4-8s for ~1000 objects:

image

Here's the full flamegraph: graphql-1k-flamegraph.html.zip

Is this expected? If I only fetch one field instead of 15, it takes about 1s. We don't have any N+1 queries and most of the fields are being loaded directly from our database with no additional processing. Thanks for your help.

Versions

graphql version: 1.13.8 rails (or other framework): 6.1.4.3 other applicable versions (graphql-batch, etc)

GraphQL schema

module Types
  class ConnectionType < Types::BaseObject
    field :id, GraphQL::Types::Int, null: false
    field :name, GraphQL::Types::String
    # about 40 more...
  end
end

GraphQL query

All the fields here are strings, booleans, IDs, or integers

query connections($userId: ID) {
  connectionsV2(userId: $userId) {
    connectionId
    connectionSha
    createdAt
    database
    databaseSha
    description
    firebaseId
    folderId
    id
    lastOpenedAt
    lastOpenedByMe
    lastRanByMe
    lastRunAt
    name
    othersCanEdit
    othersCanView
    passwordProtected
    public
    ranByMeCount
    role
    runCount
    statement
    updatedAt
    userId
    warehouse
  }
}
{
  "data": {
    "connectionsV2": […]
  }
}

Steps to reproduce

  1. Create ~1000 ActiveRecord objects with at least 15 properties
  2. Execute a GraphQL query for those objects and properties
  3. Performance should take 4-8s

Expected behavior

I'm not sure if this is a reasonable approximation, but it takes less than a second to serialize the same 1k objects in JSON so I would expect GraphQL to be able to resolve fields within 1-2s.

Actual behavior

What specifically went wrong?

It takes > 2s

Additional context

Here's the full flamegraph: graphql-1k-flamegraph.html.zip

And another version with 100 objects since that one might be difficult to load or read: graphql-100-flamegraph.html.zip

choxi avatar Feb 15 '22 21:02 choxi

Hi, thanks for the detailed report! What a doozy. I'll take a close look at the flamegraph soon -- thanks for sending it.

In the meantime, I see that AppSignalTracing is installed. Are you able to generate a flamegraph or profile without that (eg, commenting it out in development)? I'm just curious if that's part of it, and someone just opened a related issue: #3945

rmosolgo avatar Feb 17 '22 19:02 rmosolgo

Sure thing, here's a couple flamegraphs without AppSignal:

graphql-1k-no-appsignal-flamegraph.html.zip (14.77s)

graphql-100-no-appsignal-flamegraph.html.zip (2.54s)

choxi avatar Feb 24 '22 21:02 choxi

Thanks for sharing that. Looking over it more closely, it's not obvious to me that GraphQL-Ruby is the main bottleneck here. In general, I see lots of samples that point at application code (shown by the non-green "peaks" on the flamegraph). For example:

  • Query methods (from app/models/query.rb)
  • ActiveSupport::TimeWithZone#xmlschema
  • RoleModel::Implementation methods
  • ActiveRecord::AttributeMethods#respond_to?

So, speeding up those methods (or batching database loads, simplifying implementations, etc) might yield good results.

For comparison, I sketched out a minimal schema that would run the query above:

Minimal script

require "bundler/inline"

gemfile do
  gem "graphql", "1.13.8"
end
require "ostruct"
require "benchmark"

class ExampleSchema < GraphQL::Schema
  class Connection < GraphQL::Schema::Object
    field :connection_id, ID
    field :connection_sha, String
    field :created_at, String
    field :database, String
    field :database_sha, String
    field :description, String
    field :firebase_id, ID
    field :folder_id, ID
    field :id, ID
    field :last_opened_at, String
    field :last_opened_by_me, String
    field :last_ran_by_me, String
    field :last_run_at, String
    field :name, String
    field :others_can_edit, Boolean
    field :others_can_view, Boolean
    field :password_protected, Boolean
    field :public, Boolean
    field :ran_by_me_count, Integer
    field :role, String
    field :run_count, Integer
    field :statement, String
    field :updated_at, String
    field :user_id, ID
    field :warehouse, String
  end

  class Query < GraphQL::Schema::Object
    field :connectionsV2, [Connection], connection: false do
      argument :user_id, ID, required: false
    end
  end

  query(Query)
end

query = <<-GRAPHQL
query connections($userId: ID) {
  connectionsV2(userId: $userId) {
    connectionId
    connectionSha
    createdAt
    database
    databaseSha
    description
    firebaseId
    folderId
    id
    lastOpenedAt
    lastOpenedByMe
    lastRanByMe
    lastRunAt
    name
    othersCanEdit
    othersCanView
    passwordProtected
    public
    ranByMeCount
    role
    runCount
    statement
    updatedAt
    userId
    warehouse
  }
}
GRAPHQL

data = 1000.times.map { {} }
result = nil
Benchmark.bm { |x|
  x.report {
    result = ExampleSchema.execute(query, root_value: { connectionsV2: data })
  }
}
# pp result
puts "Result size: #{result["data"]["connectionsV2"].size}"

GraphQL-Ruby 1.13.8 takes 0.25s to run that query on my machine:

$ ruby perf_test.rb
       user     system      total        real
   0.228234   0.001495   0.229729 (  0.230060)
Result size: 1000

Anyhow, that doesn't mean improvements to the gem couldn't improve the stack trace you shared.

For example, GraphQL-Ruby calls respond_to?(...) every time to see if a field should call a method. In the flamegraph you posted above, it turns out to be pretty expensive, judging by the number of nested calls that show up there. In theory, graphql-ruby could "JIT" and remember how it resolved a field last time it ran. But that'd be a pretty big change, and since schemas are shared between queries, it'd be hard to implement in a non-race-y way.

Unfortunately, I'm not able to glean any actionable items from that flamegraph :confounded: If you want to keep investigating, you could try gathering a raw: true dump from StackProf (https://github.com/tmm1/stackprof). One advantage to stackprof is that it quantifies how many samples have each method on the top of their trace -- it might clarify exactly what's doing work vs. what's waiting. (The large green part of the attached flamegraph is GraphQL -- but most of the methods in that backtrace are just sitting there, waiting for method calls to return. ) Additionally, it would provide an easier-to-grok table of exactly which parts of GraphQL-Ruby overhead are taking the longest in this case. (Scanning the flamegraph, I don't see anything out of order -- just a lot of methods I've already tried to make as fast as I can! But maybe something new would show up from that perspective).

A memory profile with memory_profiler might also yield some useful feedback. Although it's not a 1:1 connection between memory and runtime performance, sometimes object allocations provide insight into poor performance at the code's hotspots.

rmosolgo avatar Mar 01 '22 20:03 rmosolgo

I don't have plans for any more work specifically from this issue, so I'll close it. But I have been doing a lot of performance work lately, so I hope you'll find better performance on 2.0.18 and 2.0.19 (and future versions)!

rmosolgo avatar Mar 14 '23 12:03 rmosolgo