graphql-platform icon indicating copy to clipboard operation
graphql-platform copied to clipboard

System.NullReferenceException thrown from the OperationComplexityMiddleware class

Open kcybulka opened this issue 3 years ago • 3 comments

Is there an existing issue for this?

  • [X] I have searched the existing issues

Describe the bug

We recently enabled complexity calculation for the queries using following configuration:

.ModifyRequestOptions(x => 
{   
    x.Complexity.Enable = true;  
    x.Complexity.ApplyDefaults = true;  
    x.Complexity.DefaultComplexity = 1;  
    x.Complexity.DefaultResolverComplexity = 5;  
    x.Complexity.MaximumAllowed = int.MaxValue;  
})  

Unfortunately after rolling it out to production NullReferenceException started to occur in the log and it seems to be coming from the inside of the library (HotChocolate.Execution.Pipeline.OperationComplexityMiddleware).

Steps to reproduce

It might be hard to track it down as I am not able to reproduce the issue. Following is the sample query along with variables that comes from the request that caused mentioned exception:

query MyQuery($typeNameVar: String!) {
  __type(name: $typeNameVar) {
    name
    fields {
      name
      isDeprecated
      type {
        name
        ofType {
          name
          kind
          ofType {
            name
          }
        }
      }
    }
  }
}

variables:

{"typeNameVar": "DeletedCurrency"}

It seems that issue is not related to the query itself because when I ran it on my dev machine or against the system where error originally occurred I am unable to reproduce it.

Relevant log output

System.NullReferenceException:
   at HotChocolate.Execution.Pipeline.OperationComplexityMiddleware+<InvokeAsync>d__7.MoveNext (HotChocolate.Execution, Version=12.7.0.0, Culture=neutral, PublicKeyToken=null)
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw (System.Private.CoreLib, Version=6.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess (System.Private.CoreLib, Version=6.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification (System.Private.CoreLib, Version=6.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
   at HotChocolate.Execution.Pipeline.OperationCacheMiddleware+<InvokeAsync>d__4.MoveNext (HotChocolate.Execution, Version=12.7.0.0, Culture=neutral, PublicKeyToken=null)
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw (System.Private.CoreLib, Version=6.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess (System.Private.CoreLib, Version=6.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification (System.Private.CoreLib, Version=6.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
   at HotChocolate.Execution.Pipeline.DocumentValidationMiddleware+<InvokeAsync>d__4.MoveNext (HotChocolate.Execution, Version=12.7.0.0, Culture=neutral, PublicKeyToken=null)
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw (System.Private.CoreLib, Version=6.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess (System.Private.CoreLib, Version=6.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification (System.Private.CoreLib, Version=6.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
   at HotChocolate.Execution.Pipeline.DocumentParserMiddleware+<InvokeAsync>d__4.MoveNext (HotChocolate.Execution, Version=12.7.0.0, Culture=neutral, PublicKeyToken=null)
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw (System.Private.CoreLib, Version=6.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess (System.Private.CoreLib, Version=6.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification (System.Private.CoreLib, Version=6.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
   at HotChocolate.Execution.Pipeline.DocumentCacheMiddleware+<InvokeAsync>d__5.MoveNext (HotChocolate.Execution, Version=12.7.0.0, Culture=neutral, PublicKeyToken=null)
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw (System.Private.CoreLib, Version=6.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess (System.Private.CoreLib, Version=6.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification (System.Private.CoreLib, Version=6.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
   at HotChocolate.Execution.Pipeline.TimeoutMiddleware+<InvokeAsync>d__3.MoveNext (HotChocolate.Execution, Version=12.7.0.0, Culture=neutral, PublicKeyToken=null)
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw (System.Private.CoreLib, Version=6.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess (System.Private.CoreLib, Version=6.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification (System.Private.CoreLib, Version=6.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
   at HotChocolate.Execution.Pipeline.ExceptionMiddleware+<InvokeAsync>d__3.MoveNext (HotChocolate.Execution, Version=12.7.0.0, Culture=neutral, PublicKeyToken=null)

Additional Context?

It seems that in most cases where this exception occurred requests was send manually from the BananaCakePop using port forwarding to the AKS cluster. At this moment I registered only 2 requests (out of around 630k) sent by other micro services that failed due to described issue.

Maybe there will be something obvious about the implementation from which the exception was thrown.

Thank you in advance for your support.

Product

Hot Chocolate

Version

12.7.0

kcybulka avatar Aug 05 '22 13:08 kcybulka

strange ... I will have a look at this... it could be related to the cache if the compiled complexity delegate. Not sure yet. I will review this together with @PascalSenn, there might be a logic error. Do you get these errors only after the system comes up or at random?

michaelstaib avatar Aug 06 '22 10:08 michaelstaib

On Monday I will try to verify it by restarting one of the pods and connecting to it right away. As for now I am not completely sure, there are thousands of requests processed daily by this API and I got only few dozens of this errors trough 48h after deployment. All except one was was manually triggered from the playground so it might be that first person that does that triggers the errors. I also remember that first exception I noticed was logged very soon after the deployment (and this was the only requests that was not manually sent) but on the other hand we have several pods/instances running and there was only one occurrence at the beginning.

kcybulka avatar Aug 06 '22 19:08 kcybulka

It seems to be happening on random. I have pod that is running already for 4 days without restart and I see new errors. Unfortunately I got one more coming from the standard request sent by other service (not manually triggered by BananaCakePop) , I will update the ticked as I was hoping it won`t be the case.

kcybulka avatar Aug 08 '22 10:08 kcybulka

Hi all, I'm facing this exception quite often when running K6 performance tests, like 10 users doing 100 simultaneous requests to GraphQL API. So, I have debugged it a bit and found 2 issues within OperationComplexityMiddleware logic.

  1. Cache looks like not working properly if query name specified - TryAdd method adds item with operationId+queryName key then Get method tries to resolve item with operationId only. https://github.com/ChilliCream/hotchocolate/blob/6cdb0dc100374eba3d66965650d94a3383ac4f77/src/HotChocolate/Core/src/Execution/Pipeline/OperationComplexityMiddleware.cs#L123 vs https://github.com/ChilliCream/hotchocolate/blob/6cdb0dc100374eba3d66965650d94a3383ac4f77/src/HotChocolate/Core/src/Execution/Pipeline/OperationComplexityMiddleware.cs#L62

In case the query name is specified it can't resolve item from the cache. Possible solution: to use the same key build approach for both methods. e.g. use operationId only.

  1. Is here, when it checks ReferenceEquals for 2 different instances of OperationDefinition class, not sure why they are different but that's why it returns null and null reference exception happens: https://github.com/ChilliCream/hotchocolate/blob/6cdb0dc100374eba3d66965650d94a3383ac4f77/src/HotChocolate/Core/src/Execution/Pipeline/OperationComplexityMiddleware.cs#L118

Possible solution: to use Equal method or another comparer or ReferenceEquals is correctly used there, but the root cause of the issue in 2 different instances. But the question then why this analyzer has been added to the cache in the next line anyway.

dkogithub avatar Oct 17 '22 15:10 dkogithub

@DkozhevnikovConfigit thank you for your analysis and sorry for letting this lie.

michaelstaib avatar Mar 17 '23 13:03 michaelstaib

We've submitted the https://github.com/ChilliCream/graphql-platform/pull/6099 to fix the issue.

marcioscpinheiro avatar Apr 28 '23 19:04 marcioscpinheiro

This is now fixed.

michaelstaib avatar May 01 '23 09:05 michaelstaib

I will create a preview today for this.

michaelstaib avatar May 01 '23 09:05 michaelstaib

The Fix is here: https://github.com/ChilliCream/graphql-platform/pull/6100

michaelstaib avatar May 01 '23 09:05 michaelstaib

@marcioscpinheiro thank you for your help, without you we would not have spotted this one.

michaelstaib avatar May 01 '23 09:05 michaelstaib