federation Bug (in 0.x versions) when handling result set with the same field requested multiple time with different @skip/@include

I have a quick question when executing this query:

query {
  me {
    id @skip(if: true)
    id
  }
}

I have this response from the gateway:

{
  "data": {
    "me": null
  }
}

but if I execute this query now:

query {
  me {
    id @skip(if: true)
  }
}

I receive:

{
  "data": {
    "me": {}
  }
}

So my main question is what is the rule here ? When should I receive an empty object ? When should I receive a null ? I think it misses some consistency, I'm also wondering what is the spec about this kind of behavior ? When do we transform an empty object into null ?

Apr 27 '22 16:04 bnjjj

It might help to explain this to know the schema and details regarding the subgraphs (are they functioning properly and what are their resolvers in particular).

Because I'm not entirely certain how the gateway could give you those result in the first place. More precisely, I can almost explain it through some guesses, but I'd expect the first response to include some "errors" array too. Unless you're hitting a case where an error happens during execution, but the gateway somehow doesn't record it in the response, but it'd feel like a bug. Otherwise, my guesses are wrong and I could use additional detail.

But in case that helps, I'll note that @skip and @include are handled pretty early in the graphQL execution (in the spec, it's in collectFields which is the first step of executing a given selection set). So really, the 2 queries above are equivalent to doing:

query {
  me {
    id
  }
}

in the first case, and:

query {
  me {
  }
}

in the second case. I will note that this latter query is not technically syntactically valid, it doesn't parse, but well, it's still essentially how one should answer a query where everything is "skipped" in terms of execution.

Now, taking the latter one, assuming the me resolver returns a non-null object, any object, then execution will stop there, and an empty object is returned, since nothing is being asked. Not in particular that the resolver for id is not called. I'll note that if me either errors out or return null, then you will get me: null in the result, not an empty object (it does "on my machine" at least :)).

In the first case, execution also calls the me resolver, and assuming it get a non-null object in that case too, then it will call the resolver for id (passing the object returned from me). Now, if id is nullable, then I don't see how you could get me: null there, you should get me: { id: null }, so I'm guessing that id has a non-nullable type and that resolving that field somehow either returns null or errors out. If that's the case, then because id is non-nullable, then null will "bubble up" (https://spec.graphql.org/June2018/#sec-Errors-and-Non-Nullability) and you will get me: null as in your first response. However, and that's the part that don't fully match, you should also get an error recorded in the response in theory.

Anyway, let me know if any of this help, or if my guesses are completely off.

Apr 27 '22 19:04 pcmanus

Ok after some investigations, the behavior of this query is different if we are in fed1 and fed2. Keep in mind that there is a notion of precedence in the fed1 implementation of the gateway. For example the first duplicated field will always win, if it starts with a @skip(if: true) then it will always be skipped.

Apr 28 '22 15:04 bnjjj

Interesting action could be to mention or document this kind of uncompatibility here https://www.apollographql.com/docs/federation/federation-2/backward-compatibility what do you think ?

Apr 29 '22 08:04 bnjjj

Right. So to provide context to other readers, the issue is that federation 1 has a bug when it goes to handling queries like:

me {
  id @skip(if: true)
  id
}

Namely, as I mentioned in my previous comment, the spec essentially says that this should be equivalent to:

me {
  id
}

because fields with a @skip whose condition is true are ignored by collectFields.

But unfortunately, for that query (with the @skip), the federation 1 query planner sends to the subgraph:

me {
  id @skip(if: true)
}

which is clearly wrong. From a super quick look, this appears due to this, which is wrong in that case, but I'm not quite sure how easy it is to fix (I've never be that comfortable with the fed1 planner code and any familiarity tends to be fading).

That bug is what lead to the confusion on this issue. And indeed, this bug makes behaviour confusing and inconsistent. For instead, it means that:

me {
  id @skip(if: true)
  id
}

and

me {
  id
  id @skip(if: true)
}

behaves completely differently, which feels ... unexpected.

Anyway, the good news is that this is fixed in federation 2. But it does mean that effectively fed1 and fed2 differs on that behaviour.

I'm not sure how much should try to fix this bug in federation 1 at this point. At the end of the day, this case is a bit artificial in the sense that a user is unlikely to do such queries in practice. Of course, fragments usage could more easily lead to this, but you also need @skip/@include involved. Which is probably why nobody reported this before btw.

Interesting action could be to mention or document this kind of uncompatibility here https://www.apollographql.com/docs/federation/federation-2/backward-compatibility what do you think ?

Maybe. Though, as said above, it's really a bug (and not a super applicable one probably), and we haven't so far listed bug fixed by fed 2 on that page (granted, bug fix can often technically be considered as backward incompatibility but ...).

I definitively think we should keep this issue open so it's more discoverable (and in case we want to consider fixing it) but I'd personally leave it at that. Open to other opinions though.

Apr 29 '22 16:04 pcmanus

federation federation copied to clipboard

Bug (in 0.x versions) when handling result set with the same field requested multiple time with different @skip/@include

federation
federation copied to clipboard