graphql-modules icon indicating copy to clipboard operation
graphql-modules copied to clipboard

Degraded performance with Apollo Server and v1

Open dackland opened this issue 4 years ago • 11 comments
trafficstars

Describe the bug

Apollo Server with GraphQL Modules v1 appears to be twice as slow as it was with v0.

To Reproduce

I've created a simple repo with three basic servers to demonstrate the performance differences. A load testing script performs as many requests as possible within 10 seconds.

Server Median Request Duration Request Count
Apollo Standalone 236ms 42
Apollo + GQL Modules v0 229ms 43
Apollo + GQL Modules v1 460ms 22

Each server uses the same schema, resolver, and JSON dataset. See the repo for more details and instructions. The test uses a larger dataset (3MB), but we observed a similar 100% latency increase even with smaller payloads.

Expected behavior

Using Apollo Server with GraphQL Modules v1 is similar or better performance to v0.

Environment:

  • OS: macOS
  • graphql-modules: 1.4.2
  • NodeJS: 16.3.0

Additional context

Using createApolloExecutor (as implemented here) instead of createSchemaForApollo seems to address the performance problem. However, using the custom executor broke field usage and resolver tracing in Apollo Studio.

I noticed your benchmarks for Apollo Server are all using createApolloExecutor, so it isn't capturing the slowdown that seems to happen when using createSchemaForApollo.

The docs recommend createSchemaForApollo for Apollo Server, but the API Reference calls both createSchemaForApollo and createApolloExecutor experimental.

Is there a recommended way of using v1 with Apollo Server that is performant and works with Apollo Studio? Thanks!

dackland avatar Jun 09 '21 12:06 dackland

@dackland Since ApolloServer isn't able to take a custom execute function in favor of graphql-js's default one(while the most of the other server solutions support that), we decided to give the users two options; createSchemaForApollo: uses wrapSchema from graphql-tools that wraps the generated schema with a custom execute and subscribe function so still the execution flow will work with pure graphql-js but it causes two rounds of execution(that might be the case) createApolloExecutor: generates an executor like federation gateway but it doesn't support subscriptions.

You can give a try "Envelop" that might help you to take your tracing and other stuff out of your server thanks to its plugin system; https://github.com/dotansimha/envelop

ardatan avatar Jun 09 '21 13:06 ardatan

@dackland Since ApolloServer isn't able to take a custom execute function in favor of graphql-js's default one(while the most of the other server solutions support that), we decided to give the users two options; createSchemaForApollo: uses wrapSchema from graphql-tools that wraps the generated schema with a custom execute and subscribe function so still the execution flow will work with pure graphql-js but it causes two rounds of execution(that might be the case) createApolloExecutor: generates an executor like federation gateway but it doesn't support subscriptions.

You can give a try to envelop that might help you to take your tracing and other stuff out of your server thanks to its plugin system; https://github.com/dotansimha/envelop

Just to add to that, we created a PR for adding support for custom subscribe function in apollo-server (>1y ago..): https://github.com/apollographql/apollo-server/pull/4167

dotansimha avatar Jun 09 '21 18:06 dotansimha

Thanks for the responses, @ardatan and @dotansimha! The two rounds of execution caused by wrapSchema seems to be a likely culprit. I'm still confused how v0 managed to accomplish good performance and integrate well with Apollo Studio, since it faces the same challenge of not having access to a custom execute function in Apollo Server.

We're at a bit of a crossroads moving forward with the upgrade to v1 because it feels like we have the choice of:

  1. Using createSchemaForApollo and accepting degraded performance.
  2. Using createApolloExecutor and sacrificing field usage statistics in Apollo Studio, in addition to finding other options for tracing resolvers.
  3. Staying at v0 where we currently get decent performance and Apollo Studio features.

Apologies if I'm missing something, but is this correct? Or is there a way to get the Apollo Studio features working with createApolloExecutor?

These are the field usage statistics I'm referring to: image

dackland avatar Jun 09 '21 19:06 dackland

v0 was using WeakMap by mapping the incoming request/context/session object with the final context weakly and it was trying to destroy that context by using res.on('end') but that wasn't the best solution to handle garbage collection(it causes memory leaks in some edge cases) so we decided to replace it by providing custom execute and subscribe functions which are extended versions of the default graphql-js.

The solution I can suggest is something like the following (Not sure if this code will work but the idea is to get the context and destroy it)

const { typeDefs, resolvers } = createApplication(...);
const contextDestroyMap = new WeakMap();
const server = new ApolloServer({
   typeDefs,
   resolvers,
   context: inputContext => {
      const { context, destroy } = createOperationController({ inputContext });
      contextDestroyMap.set(context, destroy);
      return context;
   },
   formatResponse: (res, { context}) => {
     const destroy = contextDestroyMap.get(context);
     destroy();
     contextDestroyMap.delete(context);
     return res;
   }
})

ardatan avatar Jun 09 '21 21:06 ardatan

Thanks for the suggestion @ardatan. The above snippet almost works. The context object seems to be re-created somewhere along the way, so attempting to use it in formatResponse as a key for the WeakMap fails.

However, since it's just the destroy method from createOperationController we're trying to call, could we not use the autoDestroy option?

const server = new ApolloServer({
  typeDefs,
  resolvers,
  context: (inputContext) => {
    const { context } = createOperationController({
      inputContext,
      autoDestroy: true,
    });
    return context;
  },
});

dackland avatar Jun 10 '21 20:06 dackland

@ardatan or @dotansimha, can you comment on the above code snippet? It has addressed our performance problems, and we're seeing field usage and tracing in Apollo Studio again.

I'd appreciate some feedback if this implementation is safe, or if we'll be looking at memory leaks. If it is indeed safe, it would be great to see some official guidance in the docs 😃

Thanks!

dackland avatar Jun 15 '21 12:06 dackland

@ardatan or @dotansimha, can you comment on the above code snippet? It has addressed our performance problems, and we're seeing field usage and tracing in Apollo Studio again.

Since you are keeping the Apollo Executor as-is, you get the Apollo features, so it makes sense that it works now.

The createOperationController creates manual lifecycle management around the execution of GraphQL-Modules, so it means that it knows when the request has started (when you build the context) and when it ends. You can read more about it here: https://github.com/Urigo/graphql-modules/blob/675e6bb042510678674d8bec767de6668954223a/website/docs/advanced/lifecycles.md#manual-control-of-operation-cycle

I'd appreciate some feedback if this implementation is safe, or if we'll be looking at memory leaks. If it is indeed safe, it would be great to see some official guidance in the docs 😃

@kamilkisiela can you please elaborate on this? specifically on autoDestroy here, because I'm not sure how GraphQL-Modules knows that the execution has been done and can be cleared?

dotansimha avatar Jun 20 '21 15:06 dotansimha

@dackland do you have the latest specs on performance after v1 upgrade with this suggested solution? I'm thinking of using gq modules for an enterprise grade client project and just want to make sure it's safe to use and is future proof.

thank you.

sahanatroam avatar Jul 09 '21 17:07 sahanatroam

@sahanatroam We deployed the v1 upgrade two weeks ago using the above snippet. Performance is roughly the same as it was with v0, and average memory usage is down considerably.

dackland avatar Jul 09 '21 17:07 dackland

hey @dackland do you think you could kindly update your sample repo with the fix you applied? This would be super useful as I'm still getting degraded performance :(

sahanatroam avatar Aug 29 '21 00:08 sahanatroam

That's obvious, using context+resolvers will always be faster then having dependency injection on top of it.

The point of modules is to give you tools to create schema in teams.

kamilkisiela avatar Aug 29 '21 05:08 kamilkisiela