Fix NRE and interface duplication in multithreading scenario
fixes #2639
@Shane32 @diouxis I was able to reproduce bug and write a test.
@Shane32 That's exactly how you said. I agree that in general the bug may occur anywhere, because GraphQL.NET (at least for GraphType initialization) is not intended for multi-threaded scenarios. Nevertheless, fix is quite simple and test confirms that it works. Without fix I consistently got an error on each test run - from 0 to 300 iterations. I think it is worth merging this change, the code will not be worse from it and the loss of performance is ridiculous.
I believe this will only bury the problem deeper and make it more difficult to identify later. The problem still exists even in the user's code -- a multithreaded initialization of graph types -- and could easily occur in many situations that we didn't patch here. I personally would much rather it caused a problem so that users could identify and fix the cause. In another way of saying it, I'd rather the problem occurred 1 in 10 times than 1 in 1000 times. Because it will be the 1 in 1000 that's almost impossible to identify. At least when it occurs 1 in 10 times, you can probably fix it.
So really I think this is critical that we do not merge it. Of course it will not make any difference to anyone who writes their code correctly.
If we do want to issue a patch of some type, we should add an Initialized property to IGraphType and throw an exception if a graph type has already been initialized while SchemaTypes is initializing the schema (or some such thing). This is not fool-proof, but should reliably cause descriptive exceptions to be thrown in user code so they can fix the root cause. It also would not really affect schema initialization time. (I believe you already suggested this a while ago but I could be wrong.)
We also should fix the design flaw in GraphQL.NET - the InterfaceGraphType.ResolveType only works with instances, not types, and so doesn't support transient-registered graph types, which is how the AddGraphTypes() builder extension registers graph types by default. As far as I know there's no other area where a type cannot be provided rather than an instance.
Your thoughts?
I understand every your word about 1/10 and 1/1000.
If we do want to issue a patch of some type, we should add an Initialized property to IGraphType and throw an exception if a graph type has already been initialized while SchemaTypes is initializing the schema (or some such thing)
See https://github.com/graphql-dotnet/graphql-dotnet/issues/2639#issuecomment-956097505 about "shared" Money type. I do not see an obvious way to achive this with such Initialized property.
I believe this will only bury the problem deeper
I would look at it on the other side - the problem in practice can be eliminated at all.
If/when we revise the design (in vNext), then I'm fine to remove this fix, but now my intuition says it is better to keep it (60/40).
Rereading your comments...
I consider the case with singleton graphtype and scoped schema/schemas like "rather normal".
But... this is impossible. During initialization, middleware modifies all graph types attached to the schema. If there ever was a case of multiple schema instances accessing the same graph type, middleware would be reapplied multiple times for those shared instances. It's absolutely critical that a schema instance use its own graph type instances.
Middleware is actually a perfect example -- if this PR was merged, it would be more difficult to tell why middleware was being reapplied. (Which is also a memory leak)
Another situation is if a singleton graph type accesses a scoped graph type. It becomes 'random' which scoped instance of a graph type is used by a singleton graph's field definition. So then code in one thread can access a scoped service from another thread. Users would have no idea why they might get, for example, an ObjectDisposedException for a service. There's nothing that would point to GraphQL.NET or service registration as being the fault.
I do not see an obvious way to achive this with such Initialized property.
I'm not sure what you mean. During initialization, the schema checks the property, throws if it is true, otherwise sets it to true. Are you saying you cannot have a shared graph type instance between schemas? That's not possible anyway.
I would look at it on the other side - the problem in practice can be eliminated at all.
The way I see it, for this particular user, it appears to help the issue. But he's the exception. If anything, most people would have more issues, not less. (Not more because of this PR, but their other existing issues would be harder to identify.)
If/when we revise the design
Is it really so hard to make such a fix? We can't add a property to an interface, but we can probably do most of the fix pretty easily. (That is, throwing an exception if a graph type is initialized more than once.)
Are you saying you cannot have a shared graph type instance between schemas?
Yes.
That's not possible anyway.
😮 Why so?
Forget about middleware for now. The re-applying of middleware can be critical or not. It depends. Forget even about scoped schemas. Imagine 2/3/4/5 singleton schemas of different type inside one app (one DI container). Each of them uses some own types as well as some "shared" singleton types, Money or Account for example. With certain assumptions, it is possible to consider even the introspection types as an example because they are used by all the schemas.
I can not say that such a scenario should be surprising. And from my point of view, it must be supported as much as possible.
Regardless of whether we would like it to be supported, it completely breaks GraphQL.NET right now. Middleware is just one example. Multithreading issues, scoped graph types, and memory leaks are other prime examples. Even the name converter won’t work correctly if the schemas use different name converters (granted this may be unlikely.)
We were running into issues a year ago because the instances of the introspection fields were static!
It is completely impossible with the current design to support multiple schema instances accessing the same instances of graph types.
I do think this is a worthy goal but this PR does not help to achieve it. We need to rewrite the infrastructure from the ground up.
I was working towards this goal a couple years ago, starting with trying to make all the interfaces read only so that once created the graph types could not change. I had a lot of code done towards this effort. You nixed the idea so I dropped it.
All this will do is make the *** countless *** issues arising from sharing graph types harder to identify the cause of.
Considering a singleton schema and singleton graph types: it is still subject to problems with middleware, the name converter and multithreaded initialization. Even if everything worked the question is: why? Why should we deal with these problems at all? We already have the defaults set up so that graph types are registered as transients by default, which fully support scoped schemas, singleton schemas, or multiple singleton schemas.
So why? It can’t be to save time - if they are singleton schemas then they only need to be initialized once. And regardless they will all initialize all graph types so there’s no time savings. It can’t be to run faster. That only leaves what - to save memory? What are talking - a couple MB? Probably less? My server has 192GB of RAM. My smallest virtual servers have 512MB of RAM. This isn’t even worth mentioning. It certainly doesn’t save runtime allocations.
If the infrastructure were set up to support shared graph types such that it would not need reinitialize a graph type more than once, then it could perhaps save a lot of initialization time for a scoped schema. But right now all graph types are reinitialized when created.
Keep in mind the only purpose of a scoped schema is to have some scoped graph types. And if a singleton graph type is holding a reference to a scoped graph type it’s guaranteed to have multithreading issues where one thread accesses a disposed scope of the other thread. We can put locks on all the constructors and make everything multithreaded-safe during initialization but there’s simply no way around the fact that a field resolver for one scope can execute within another scope (which could already be disposed) without redoing the infrastructure. And that has nothing to do with middleware.
I was working towards this goal a couple years ago, starting with trying to make all the interfaces read only so that once created the graph types could not change
I remember this and still think that this does not solve the problem. By the way, I just checked that middleware works with the introspection types too. I think we may postpone this conversation - little PR raised big redesign questions.
I remember this and still think that this does not solve the problem.
Agree. It was intended to be a start. But I do not mind if we have read-write interfaces now. There's another PR also #2016 that I think could be part of this conversation, when we want to pick it up again. Conceptually, if we have graph types become "definitions" only, and then the schema builds it's own copies to use during execution, potentially with optimized execution paths, we could solve many of these issues and realize faster execution. Who knows; too much to think about now though. I do not see it as a goal for v5. The thing is that the current design is incredibly useful for writing small tests which may not be possible if we make changes to achieve some of the prior goals I just noted. Anyway......
Codecov Report
Merging #2830 (3d51340) into master (42f8a3d) will increase coverage by
0.01%. The diff coverage is100.00%.
@@ Coverage Diff @@
## master #2830 +/- ##
==========================================
+ Coverage 83.77% 83.78% +0.01%
==========================================
Files 366 366
Lines 15017 15029 +12
Branches 2375 2377 +2
==========================================
+ Hits 12580 12592 +12
Misses 1802 1802
Partials 635 635
| Impacted Files | Coverage Δ | |
|---|---|---|
| src/GraphQL/Types/Collections/PossibleTypes.cs | 76.47% <100.00%> (+12.83%) |
:arrow_up: |
| ...rc/GraphQL/Types/Collections/ResolvedInterfaces.cs | 70.58% <100.00%> (+16.04%) |
:arrow_up: |
Continue to review full report at Codecov.
Legend - Click here to learn more
Δ = absolute <relative> (impact),ø = not affected,? = missing dataPowered by Codecov. Last update 42f8a3d...3d51340. Read the comment docs.
I suggest closing this now that #3248 has been merged.