
StrawberryIntegration slows down query performance dramatically


How do you use Sentry?

Sentry Saas (sentry.io)

Version

2.17.0

Steps to Reproduce

  1. Set up a Django project with Strawberry async.
  2. Set up a query for an endpoint that returns many objects. I tested with 2,000 objects and only 2 fields for each - an ID and a string, e.g.
@strawberry_django.type(models.Item, fields=["id", "value"], pagination=True)
class Item:
    pass


@strawberry.type
class Query:
    items: list[types.Item] = strawberry_django.field()

and run a query such as:

query Items {
  items(pagination: { limit: 2000 }) {
    id
    value
  }
}
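
For reference, a minimal async view wiring for step 1 could look roughly like this (a sketch; the module names and the schema import path are assumptions, not taken from the reporter's project):

# urls.py (sketch): wire the schema above into Django's async GraphQL view
from django.urls import path
from strawberry.django.views import AsyncGraphQLView

from myapp.schema import schema  # hypothetical module containing Query/Item

urlpatterns = [
    path("graphql/", AsyncGraphQLView.as_view(schema=schema)),
]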

Expected Result

Not much performance difference with or without StrawberryIntegration

Actual Result

With StrawberryIntegration, this took 3-4x longer. In my case, about 800ms. Without it, I got the results back in 200-250ms.

pcraciunoiu avatar Oct 18 '24 04:10 pcraciunoiu

Hey @pcraciunoiu, thanks for reporting.

I'm thinking this might be due to the SDK in general having a lot of room for improvement in the async department. Might be the transport blocking. (There are plans for a fully async transport/SDK, but this will take some time.)

Couple things to check:

  • If you don't configure the DSN (i.e., make the SDK not actually send anything out), do you still observe this? If not, my guess is that it's the transport.
  • Do you by any chance have profiling on and can see where we spend the extra time?

sentrivana avatar Oct 18 '24 12:10 sentrivana


* If you don't configure the DSN (i.e., make the SDK not actually send anything out), do you still observe this? If not, my guess is that it's the transport.

It doesn't make a difference if the dsn value is passed in or not. Only the StrawberryIntegration makes a difference, and I have to set auto_enabling_integrations=False to ensure it doesn't get added automatically.

* Do you by any chance have profiling on and can see where we spend the extra time?

I did try with pyinstrument, but nothing stood out. I had to trial and error for a few hours until I figured out it's sentry, and then which integration.

I'll attach it in case it helps. If you have recommendations for other profiling tools that might do better, I'd like to learn about them.

1.15s - middleware.py_54 - pyinstrument.zip

pcraciunoiu avatar Oct 18 '24 15:10 pcraciunoiu

@pcraciunoiu Interesting, I also did not see anything in the profile you attached, although I am not really sure how pyinstrument works (sampling or deterministic).

I think @sentrivana was suggesting that you could try using Sentry's built-in profiler. You can enable the profiler by setting profiles_sample_rate to a value above zero when initializing the Sentry SDK. You also need to set traces_sample_rate to a value above zero. The traces_sample_rate controls the fraction of Sentry transactions which are sent to Sentry, and the profiles_sample_rate controls the fraction of transactions which are profiled. Here is an example of how to set these.
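
For reference, a minimal init along those lines might look like this (a sketch; the DSN is a placeholder and 1.0 rates are only sensible for local testing):

import sentry_sdk

sentry_sdk.init(
    dsn="https://examplePublicKey@o0.ingest.sentry.io/0",  # placeholder DSN
    traces_sample_rate=1.0,    # fraction of transactions sent to Sentry
    profiles_sample_rate=1.0,  # fraction of sampled transactions that get profiled
)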

Then, you can view the generated profiles directly in Sentry linked to the corresponding transaction.

szokeasaurusrex avatar Oct 21 '24 11:10 szokeasaurusrex

@szokeasaurusrex for the profiling I sent, I do not have Sentry set up yet, as it's a new project.

I do have another project that was running the Strawberry integration until late last week, when I found this issue and disabled it. I confirmed on that project too that it was adding 150ms or more to GraphQL requests when I tested locally. I have some profiling from last week, before I disabled this integration, but it's a private repository and I'm not comfortable including links or details here. If you want, I can share a Sentry link privately via email or otherwise, and/or you can let me know what you need to look into this further.

pcraciunoiu avatar Oct 21 '24 16:10 pcraciunoiu

@pcraciunoiu Happy to look at the Sentry link -- you can send it to ivana.kellyer (at) sentry.io.

I will try to repro this today. Also want to compare the performance with a sync Django/Strawberry app. (My money is still on this being a general problem with the SDK blocking.) Will report back.

sentrivana avatar Oct 22 '24 09:10 sentrivana

So I'm seeing an increase too, and only with the async extension. I assume you're setting StrawberryIntegration(async_execution=True) explicitly? If so, can you try with async_execution=False? This makes the extension much faster for me even when using strawberry-django's AsyncGraphQLView.
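
That is, something along these lines (a rough sketch; only the integration line is the point here):

import sentry_sdk
from sentry_sdk.integrations.strawberry import StrawberryIntegration

sentry_sdk.init(
    dsn="",  # placeholder
    integrations=[
        # Force the sync extension even though the view is AsyncGraphQLView
        StrawberryIntegration(async_execution=False),
    ],
)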

Also, could you post your sentry_sdk.init? Are you using performance/tracing?

sentrivana avatar Oct 22 '24 15:10 sentrivana

@pcraciunoiu Happy to look at the Sentry link -- you can send it to ivana.kellyer (at) sentry.io.

Just sent you an email

So I'm seeing an increase too, and only with the async extension. I assume you're setting StrawberryIntegration(async_execution=True) explicitly? If so, can you try with async_execution=False? This makes the extension much faster for me even when using strawberry-django's AsyncGraphQLView.

I can confirm it's much faster with async_execution=False, but it's still 200ms slower. My test case is 2,000 items, each with a related nested item loaded through a data loader; let's pretend it's this:

query Articles {
  articles(pagination: { limit: 2000 }) {
    id
    title
    status
    author {  # data loader here
      id
      firstName
    }
  }
}

This is what I'm seeing (roughly, averaged over 10 runs of each):

  • No integration - average 700ms
  • With integration but no async execution - average 900ms
  • With integration and async execution - average 1,500ms

I assumed that async execution is required when running in an async environment, but I guess that's not true? Still, there is a noticeable slowdown.

pcraciunoiu avatar Oct 24 '24 23:10 pcraciunoiu

Also, could you post your sentry_sdk.init? Are you using performance/tracing?

Here it is. I've tried both with/without sample rates for traces/profiles. Doesn't seem to matter, but that's probably because the DSN string is empty.

import logging

import sentry_sdk
from sentry_sdk.integrations.celery import CeleryIntegration
from sentry_sdk.integrations.django import DjangoIntegration
from sentry_sdk.integrations.logging import LoggingIntegration
from sentry_sdk.integrations.strawberry import StrawberryIntegration

sentry_sdk.init(
    dsn=SENTRY_DSN,
    integrations=[
        DjangoIntegration(),
        CeleryIntegration(),
        LoggingIntegration(level=logging.INFO),
        # StrawberryIntegration(async_execution=True),
    ],
    auto_enabling_integrations=False,
    release=GIT_SHA,
    environment=ENV,
    send_default_pii=True,
    # traces_sample_rate=0.5,
    # relative to traces_sample_rate
    # profiles_sample_rate=0.5,
)

pcraciunoiu avatar Oct 24 '24 23:10 pcraciunoiu

Does setting the SENTRY_DSN to your actual DSN make a difference?

szokeasaurusrex avatar Oct 28 '24 14:10 szokeasaurusrex

Does setting the SENTRY_DSN to your actual DSN make a difference?

It does not seem to make a difference from my quick tests

pcraciunoiu avatar Oct 31 '24 15:10 pcraciunoiu

The profile in Sentry looks thoroughly uneventful, but the pyinstrument profile you attached is interesting. Seems like there's a lot of time spent waiting on threading locks/conditions. Don't know if that's what's causing the slowness but we can verify that.

Would you be able to get us a pyinstrument profile like the one above for the query:

  1. without Sentry
  2. with Sentry and the Strawberry integration enabled in sync mode
  3. with Sentry and the Strawberry integration enabled in async mode
  4. optionally also with Sentry but with the Strawberry integration disabled (via disabled_integrations=[StrawberryIntegration()] (see here), so that you still get all the other integrations that would be enabled in options 2 and 3); a sketch of that init follows below
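
For option 4, the init would look roughly like this (a sketch; dsn and the other options as in your app):

import sentry_sdk
from sentry_sdk.integrations.strawberry import StrawberryIntegration

sentry_sdk.init(
    dsn="",  # placeholder
    # keep all auto-enabled integrations, but turn off just this one
    disabled_integrations=[StrawberryIntegration()],
)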

We could then see what stands out.

Feel free to attach here or send them to me via email.

sentrivana avatar Nov 04 '24 09:11 sentrivana

@sentrivana I finally got around to this and emailed you those files. It is definitely slowest with the Strawberry integration in async mode.

pcraciunoiu avatar Nov 26 '24 17:11 pcraciunoiu

Thanks @pcraciunoiu, got the files and will take a look in the next couple of days.

sentrivana avatar Nov 28 '24 08:11 sentrivana

@pcraciunoiu So I dug a bit deeper into this. Long story short, we're using a custom Strawberry extension in our integration. The only difference between the sync and async version of the extension is the resolve method. Thing is, the performance difference is still there even when you strip all Sentry code from both versions of the method and only leave the most basic behavior.

To verify, I made a sample Django app with strawberry-django without Sentry and with this in my schema.py:

import strawberry
import strawberry_django
from strawberry_django.optimizer import DjangoOptimizerExtension
from strawberry.extensions import SchemaExtension

from inspect import isawaitable
from .types import Fruit


class MyExtension(SchemaExtension):
    # # Uncomment this and comment out the async version to switch
    # def resolve(self, _next, root, info, *args, **kwargs):
    #     return _next(root, info, *args, **kwargs)

    async def resolve(self, _next, root, info, *args, **kwargs):
        result = _next(root, info, *args, **kwargs)
        if isawaitable(result):
            result = await result
        return result


@strawberry.type
class Query:
    fruits: list[Fruit] = strawberry_django.field()

schema = strawberry.Schema(
    query=Query,
    extensions=[
        DjangoOptimizerExtension,  # not required, but highly recommended
        MyExtension,
    ],
)

I get the same performance difference using the sync vs. async version of resolve. So it seems this just comes with how extensions work.
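
(For anyone who wants to reproduce the comparison: a crude timing loop like the one below is enough to see the difference between the two resolve variants. The endpoint URL and field names are assumptions based on the example app.)

# Rough benchmark sketch: average wall-clock time of the same query over N runs
import time

import requests

QUERY = "{ fruits { id name } }"  # field names assumed from the example app

def average_duration(n=10):
    durations = []
    for _ in range(n):
        start = time.perf_counter()
        requests.post("http://localhost:8000/graphql/", json={"query": QUERY})
        durations.append(time.perf_counter() - start)
    return sum(durations) / n

print(f"average over 10 runs: {average_duration():.3f}s")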

Now the question for me would be if this is expected or if we're doing something wrong. Maybe @patrick91 could weigh in on this. TL;DR: The async version of the extension in the example above considerably slows down the endpoint, is this expected? The sync version adds almost no overhead in comparison. The app is just the example app from https://github.com/strawberry-graphql/strawberry-django?tab=readme-ov-file#basic-usage, using AsyncGraphQLView.

sentrivana avatar Dec 02 '24 16:12 sentrivana

If I recall correctly, isawaitable is a slow check; also, we are calling the resolver for all fields, including ones that might not need it

@sentrivana maybe you can add this check for the basic fields? https://github.com/strawberry-graphql/strawberry/blob/main/strawberry/extensions/tracing/utils.py#L12-L20

patrick91 avatar Dec 09 '24 22:12 patrick91

@patrick91 Thanks for getting back to me. So do I understand this right: adding should_skip_tracing would allow us to skip resolving fields that don't need resolving? If should_skip_tracing is True, can we just skip either the _next call (here) or the isawaitable check (here) completely?

Or did you mean something different by adding it for the basic fields? I assume you mean in the app?

sentrivana avatar Dec 10 '24 16:12 sentrivana

@sentrivana you're correct that you can skip the tracing entirely:

https://github.com/strawberry-graphql/strawberry/blob/6553c9e3c67249f538cd3af934e78322a24963a7/strawberry/extensions/tracing/opentelemetry.py#L157-L171

I think on our side we could find ways to skip the isawaitable, maybe by adding a property on the resolver, but I don't know yet if that's easy to do 😊
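
To spell that out, a sketch of how the skip could look in the integration's async resolve, modeled on the linked OpenTelemetry extension (the actual Sentry span handling is elided; this is not the shipped integration code):

from inspect import isawaitable

from strawberry.extensions import SchemaExtension
from strawberry.extensions.tracing.utils import should_skip_tracing


class SketchedAsyncExtension(SchemaExtension):
    async def resolve(self, _next, root, info, *args, **kwargs):
        # Basic fields (default resolvers, introspection): resolve them
        # without any tracing overhead.
        if should_skip_tracing(_next, info):
            result = _next(root, info, *args, **kwargs)
            if isawaitable(result):
                result = await result
            return result

        # ...start a Sentry span here for the traced fields...
        result = _next(root, info, *args, **kwargs)
        if isawaitable(result):
            result = await result
        return result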

patrick91 avatar Dec 10 '24 18:12 patrick91

I'll keep this open for us to investigate whether we can get rid of the isawaitable in the extension.

sentrivana avatar Dec 11 '24 13:12 sentrivana

~~Just as an update, I'm trying to find a way to make the check faster, but my current attempt is surprisingly slower, see: https://github.com/strawberry-graphql/strawberry/pull/3724~~

I've also done some benchmarks on the isawaitable function and it doesn't seem to have that big of a performance hit on its own, so it might be a combination of isawaitable + await result, but I'm not sure yet

EDIT: the PR above may have had an error which I only just noticed
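
(For reference, a micro-benchmark along these lines is enough to sanity-check isawaitable in isolation; absolute numbers will vary per machine.)

import timeit
from inspect import isawaitable

async def coro():
    return 1

c = coro()

# Compare the check on a plain value vs. an actual coroutine object
print("plain value:", timeit.timeit(lambda: isawaitable(1), number=1_000_000))
print("coroutine:  ", timeit.timeit(lambda: isawaitable(c), number=1_000_000))

c.close()  # silence the "coroutine was never awaited" warning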

patrick91 avatar Dec 12 '24 19:12 patrick91