router icon indicating copy to clipboard operation
router copied to clipboard

`__typename` of the root type is not returned if a fragment is deferred

Open martinbonnin opened this issue 2 years ago • 8 comments

Given the below query:

{
  __typename
  fast
  ...deferedFragment @defer
}

fragment deferedFragment on Query {
  slow
}

The __typename is not included in the initial payload:

$ curl --request POST \
    --header 'Accept: multipart/mixed; deferSpec=20220824;' \
    --header 'content-type: application/json' \
    --url 'http://0.0.0.0:4040/' \
    --data '{"query":"{\n  __typename\n  fast\n  ...deferedFragment @defer\n}\n\nfragment deferedFragment on Query {\n  slow\n}","variables":{}}'

--graphql
content-type: application/json

{"data":{"fast":0},"hasNext":true}

more info

config.yaml

server:
  experimental_defer_support: true
  listen: 0.0.0.0:4040
cors:
  origins:
    - https://studio.apollographql.com
plugins:
  experimental.include_subgraph_errors:
    all: true

supergraph.graphqls

type Query
  @join__type(graph: PANDAS)
{
  slow: Int @join__field(graph: PANDAS)
  fast: Int @join__field(graph: PANDAS)
}

router: 6b83a74b3c3df08a7e1360e03462164fa2d71532

Still happengin at 055183276dffc6eaa9477b2abd3fa612482ad8b6

martinbonnin avatar Sep 01 '22 13:09 martinbonnin

Similar reproduction just using supergraph-demo-fed2:

Query

query DeferVariation {
  __typename
  allProducts {
    delivery {
      ...MyFragment @defer
    }
    sku
  }
}
fragment MyFragment on DeliveryEstimates {
    estimatedDelivery
    fastestDelivery
}

Query Plan

QueryPlan {
  Defer {
    Primary {
      {
        allProducts {
          sku
        }
      }:
      Sequence {
        Fetch(service: "products") {
          {
            allProducts {
              __typename
              ... on Product {
                __typename
                id
                dimensions {
                  size
                  weight
                }
              }
              sku
            }
          }
        },
        Flatten(path: "allProducts.@") {
          Fetch(service: "inventory") {
            {
              ... on Product {
                __typename
                id
                dimensions {
                  size
                  weight
                }
              }
            } =>
            {
              ... on Product {
                delivery {
                  estimatedDelivery
                  fastestDelivery
                }
              }
            }
          },
        },
      }
    }, [
      Deferred(depends: [], path: "[email protected]") {
        {
          ... on DeliveryEstimates {
            estimatedDelivery
            fastestDelivery
          }
        }:
      },
    ]
  },
}

abernix avatar Sep 06 '22 21:09 abernix

Fwiw, the fact that the query planner does not include the top-level __typename is not specific to @defer: this is always true and is an historical behaviour. I'm pretty sure the fed1 query planner did that too (https://github.com/apollographql/federation/blob/version-0.x/query-planner-js/src/buildQueryPlan.ts#L784). I wasn't there when that choice was done initially, and so I'm not sure of why that was done initially, but a possible rational is that query plans are about querying subgraphs, but there is no need to ask a subgraph to know the __typename of root types, so it can be a bit more efficient to let the gateway/router handle it. Typically, if someone sends the trivial query:

{
  __typename
}

then it's wasteful to request a subgraph to get the answer, but it is what would happen if the query planner was handling it.

Not suggesting that this is a super strong rational since optimising this kind of trivial request is probably not of the utmost importance. But it's not en entirely crazy rational either and is the only one I can think of for that historical choice.

pcmanus avatar Sep 07 '22 09:09 pcmanus

Thanks, @pcmanus! The rational and historical aspect of it makes enough sense to me that I think we can get a fix in for this on our end.

abernix avatar Sep 07 '22 10:09 abernix

Now that I think about this a bit more, I realise that handling top level __typename is not necessary trivial to do, and so maybe relying on the execution side to deal with it is not ideal either.

That is, historically, we've rely for this on the fact the gateway was triggering graphQL-js execution before returning the final response, and we know this handle every possible corner case here. But we also know that we'd like to make execution more lightweight and that relying on full graphQL execution is more of a work-around we've kept for too long than something desirable.

What I'm getting at here is that the QP actually ignores top-level __typename after fragment expansion and regardless of any directive applied to it. So for instance, the router would have to handle cases like:

{
  __typename @include(if: false)
}

or

{
  ...Foo
}

fragment Foo on Query {
  __typename
}

and I don't know how hard dealing with that is for the router at the moment, but difficulty aside, if we want to make execution as simple as possible, then may requiring fragment expansion and handling of directives is not ideal.

Add to that the fact that the code that skip top-level __typename in the QP is not that consistent either. For instance, while __typename is skipped in the 2 example above, it is not skipped if you do:

{
    ... on Query {
       __typename
    }
}

Anyway, starting to wonder if the benefit of being a tad more optimal on the somewhat useless query asking only for __typename really outweigh the cost of adding quite a bit of complexity to the execution side ...

pcmanus avatar Sep 07 '22 10:09 pcmanus

Anyway, starting to wonder if the benefit of being a tad more optimal on the somewhat useless query asking only for __typename really outweigh the cost of adding quite a bit of complexity to the execution side ...

Continuing to thing about this issue, I'm realising that there is actually another more important reason why this has been historically pushed onto the gateway to handle.

And that is due to how federation handle "custom" root types (by which I mean, root types with non-default names, say MyQuery instead of Query). Namely, federation accepts that subgraphs use custom root types, but "renames" them to their default type before composition. Which means that subgraph may use custom root types (and even different custom root types in different subgraphs) but the supergraph will always use the default root type names.

In practice, this mean that subgraph actually cannot be trusted to return the proper value for top-level __typename (because they may actually not have the name we want in the subgraph).

So really, this cannot be pushed to the query planner easily and has to be handled by the router.

But thinking about the difficulty I mentioned above, I'm realising that those aren't new problems. At the end of the day, all that the router needs to decide is if the top-level __typename is truly requested by the top-level query. And well, that's something the router has to know how to do anyway since there is plenty of cases where query planning execution fill fetch fields that should not be returned in the final response (typically, @key fields or @requires fields that needs fetching for the sake of being send to other subgraphs but aren't queried otherwise). I believe that's what the router calls "response shaping".

In any case, that mean that hopefully this actually isn't hard to implement in the router. That is, if would seem that all that the router needs to do is that after the query plan has been executed, and before running the "response shaping" code, then the router could unconditionally "inject" the top-level __typename field to the in-memory response (for which, as explain above, you don't even need to check the schema, since the typename is guaranteed to be a default name, so either "Query" or "Mutation"). Then, when the "response shaping" code runs, it should ensure that this __typename is only included in the response if it should.

Side-note: it could be that what I'm suggesting above actually doesn't quite work for @defer, because I assume that for @defer, the "response shaping" code for each chunk uses the subselection field of DeferNode/DeferredNode, and that current doesn't have top-level __typename. Assuming that's true, that part would have to change in the query planner (but that is relatively simple).

pcmanus avatar Sep 07 '22 11:09 pcmanus

And to continue on my monologue here: there is actually one more subtlety I initially didn't realise. As explained above, subgraphs cannot be trusted with the __typename of root types. That's an issue for a top-level __typename, but not only unfortunately. Because root type name can well be returned from fields too. So you can have a subgraph with:

schema {
  query: MyQuery
}

type MyQuery {
  a: MyQuery
   b: Int
}

but if you federate that and send:

{
  a {
    __typename
    b
  }
}

then that __typename will be send to the subgraph, and it will return MyQuery, but the router should rewrite that into Query (turns out the gateway, through the graphQL-js execution, does do that somehow, which surprised me but ...). But doing so does get rather more subtle in general.

So tl;dr, I'm not sure how to simply solve all the corner cases around the issue of the __typename of root types for the router. Not to mention that there is related existing issues even with the gateway currently: https://github.com/apollographql/federation/issues/958. I think an idea could be to have the query plan include some simple (but somewhat) "rewriting" instructions that the router could apply either to the data to send to a subgraph or to the data received back, but this is a more involved solution that needs discussion.

pcmanus avatar Sep 07 '22 15:09 pcmanus

@pcmanus Just want to say I very much appreciate all of your monologue and careful thought here. 😸

It sounds like we could find some smaller steps forward which incrementally improve some aspects, but — as you say — that we'll need to find some time to define a fully developed solution.

abernix avatar Sep 07 '22 16:09 abernix

Just curious: @martinbonnin / @alessbell - how would this impact a client developer? Is there any specific thing we should be documenting about this case?

jpvajda avatar Sep 07 '22 19:09 jpvajda

Adding the root level __typename manually then letting response shaping do its job as you proposed would address most of the needs here. And what would be needed for field level is to check that the type name is the expected one and replace it? Are there cases where we actually want to return an error on that?

Geal avatar Oct 25 '22 08:10 Geal

And what would be needed for field level is to check that the type name is the expected one and replace it?

I believe doing so (check the type name and replace it if necessary) would work, yes, if it's not too much work during response shaping.

Though as mentioned above, there is some more involved problems that are due to this "query type renaming" we do in subgraphs, so we'll probably have to do more on this later. Fwiw, this is the kind of problems that I'm hoping to eventually solve with the "rewrites" additions to query plans I'm suggesting in that doc. That is, I hope that once we have some semi-generic rewriting capability in the query plans that router/gateway support, then this kind of 'need-to-rename-something-at-runtime' can be handled on the side of the query planner without need change in the router each time. But anyway, that's for later, just rumbling a bit, and for now "fixing" the query type name during response shaping would solve the most pressing concerns just fine.

Are there cases where we actually want to return an error on that?

Did you have something specific in mind? I'd say in general we'll want this to work properly rather than return an error if we can, but you may have cases in mind I'm not thinking of.

pcmanus avatar Oct 26 '22 09:10 pcmanus

fixed by #2253

Geal avatar Dec 13 '22 09:12 Geal