graphql-php icon indicating copy to clipboard operation
graphql-php copied to clipboard

Improve performance by disabling return/resolved values validation

Open Warxcell opened this issue 1 year ago • 19 comments

What do you think about this one? The use-case is to gain a little bit of performance, by disabling resolver's returned value validation - and just trusting it.

No idea how much will gain - just throwing it in the wild and see what people think. :)

Warxcell avatar Dec 07 '23 22:12 Warxcell

My instinct says that the additional check for whichever setting controls this behaviour might eat up some of the benefits afforded by it. Making this as fast as possible would require having a code path that does no checks at all, but also does not checks if that behaviour is enabled. This would require duplicating/rewriting substantial parts of the executor, as well as possibly swapping the default field resolver and other parts that have sanity/safety checks.

Feel free to explore this in a pull request, along with benchmarks to test the effectiveness. You can also use a custom executor in your own project, see https://github.com/webonyx/graphql-php/blob/3ae418579b555253eeb3afbf2ad69dc1a770867f/src/Executor/Executor.php#L78. I don't think this is going to be a worthwhile endeavours and am thus closing this issue, but I am also open to being convinced otherwise with working code and the benchmarks to prove it.

spawnia avatar Dec 11 '23 09:12 spawnia

FTR I would be interested in this.

Once payloads returned exceed a significant size (in the multi-mega-byte range), I've seen that when profiling (used xdebug & PhpStorm xdebug snapshot analyzer) the micro-overhead for a single field becomes measurable in the big picture to the point it starts to rival the slowest part in those requests, usually the database.

mfn avatar Dec 13 '23 10:12 mfn

FTR?

spawnia avatar Dec 13 '23 11:12 spawnia

For The Record

mfn avatar Dec 13 '23 11:12 mfn

btw https://github.com/zalando-incubator/graphql-jit have thing called

disableLeafSerialization {boolean, default: false} - disables leaf node serializers. The serializers validate the content of the field at runtime so this option should only be set to true if there are strong assurances that the values are valid.

I will try to make test in few days to see what's the gain.

@mfn can you point me where most execution time comes from?

Warxcell avatar Dec 15 '23 08:12 Warxcell

I can't share the xdebug cachegrind output but I can leave a screenshot 😅 Hope it's useful: image

Of course Laravel / Eloquent also appears a LOT there.

But if you look at own time vs. amount of calls, I think even small optimizations can pay off (ofc. also in non-GraphQL parts)

The json payload here generated (uncompressed) was about 3.8MB

Although it would be quite involved, I would be able to re-run code on similar complex results and compare them, if we've something to test.

mfn avatar Dec 15 '23 10:12 mfn

@spawnia another thing that could improve performance is skipping

            $validationErrors = DocumentValidator::validate($schema, $documentNode, $validationRules);

            if ($validationErrors !== []) {
                return $promiseAdapter->createFulfilled(
                    new ExecutionResult(null, $validationErrors)
                );
            }

in case if document is cached. If document is cached - it was already validated somewhere in the past.

Warxcell avatar Dec 15 '23 18:12 Warxcell

before image

after: image

basically 100-200ms improvement on one specific query on our side.

 $cacheItem = $this->queryCache->getItem(md5($op->query));

            if ($cacheItem->isHit()) {
                $documentNode = AST::fromArray($cacheItem->get());
            } else {
                $documentNode = Parser::parse($op->query);

                $queryComplexity = DocumentValidator::getRule(QueryComplexity::class);
                assert(
                    $queryComplexity instanceof QueryComplexity,
                    'should not register a different rule for QueryComplexity'
                );

                $queryComplexity->setRawVariableValues($op->variables);


                $validationErrors = DocumentValidator::validate($this->schema, $documentNode);

                if ($validationErrors !== []) {
                    return $promiseAdapter->createFulfilled(
                        new ExecutionResult(null, $validationErrors)
                    );
                } else {
                    $cacheItem->set(AST::toArray($documentNode));
                    $this->queryCache->save($cacheItem);
                }
            }

But i'm not sure how to incorporate this into here. I basically copied StandardServer, Helper and Grapql::promiseToExecute into my bundle in order to achieve that.

https://github.com/Warxcell/graphql-bundle/blob/main/src/Controller/GraphQL.php#L231-L257

Any advice how to proceed with this one?

Warxcell avatar Dec 15 '23 19:12 Warxcell

I can't share the xdebug cachegrind output but I can leave a screenshot 😅 Hope it's useful: image

Of course Laravel / Eloquent also appears a LOT there.

But if you look at own time vs. amount of calls, I think even small optimizations can pay off (ofc. also in non-GraphQL parts)

The json payload here generated (uncompressed) was about 3.8MB

Although it would be quite involved, I would be able to re-run code on similar complex results and compare them, if we've something to test.

I've tested with blackfire, and from 300ms total response time, we have 80ms lost in is_iterable checks (173 times called).

image

@spawnia what about using assert(is_iterable()) (it won't be executed on production, hence the optimization) ?

Warxcell avatar Dec 15 '23 21:12 Warxcell

The performance of an API is a key point. Any small improvement in the response time would be very interesting! (and these benchmarks look promising)

SebastienTainon avatar Mar 19 '24 17:03 SebastienTainon

The performance of an API is a key point. Any small improvement in the response time would be very interesting! (and these benchmarks look promising)

you can achieve same results by copy-paste-modify StandardServer.php

You can see what I've done here: https://github.com/Warxcell/graphql-bundle/blob/main/src/Controller/GraphQL.php ( think https://github.com/Warxcell/graphql-bundle/blob/main/src/Controller/GraphQL.php#L254-L280 is the most important part - it skips parsing && validation of query if its cached)

The question is - Can we incorporate that directly into this package?

Warxcell avatar Mar 19 '24 18:03 Warxcell

@spawnia what about using assert(is_iterable()) (it won't be executed on production, hence the optimization) ?

I don't want to change the default behaviour of this library in production settings. I very much appreciate how it prevents bad data from ever reaching clients, since we can be fearless and not use defensive programming there.

spawnia avatar Mar 20 '24 08:03 spawnia