graphql-spec icon indicating copy to clipboard operation
graphql-spec copied to clipboard

[RFC] Tagged type

Open benjie opened this issue 4 years ago • 39 comments

THIS RFC HAS BEEN SUPERSEDED by @oneof, for now at least... See: https://github.com/graphql/graphql-spec/pull/825


This is an RFC for a new "Tagged type" to be added to GraphQL. It replaces the "@oneField directive" proposal following feedback from the Input Unions Working Group. Please note that "Tagged type" is the working name, and may change if we come up with a better name for it.

A Tagged type defines a list of named members each with an associated type (like the fields in Object types and Input Object types), but differs from Object types and Input Object types in that exactly one of those members must be present.

The aim of the Tagged type is to introduce a form of polymorphism in GraphQL that can be symmetric between input and output. In output, it can generally be used as an alternative to Union (the differences will be outlined below). It goes beyond interfaces and unions in that it allows the same type to be specified more than once, which is particularly useful to represent filters such as this pseudocode {greaterThan: Int} | {lessThan: Int}.

If merged, Tagged would be the first non-leaf type kind (i.e. not a Scalar, not an Enum) that could be valid in both input and output. It is also the first kind of type where types of that kind may have different input/output suitability.

In SDL, a tagged type could look like one of these:

# suitable for input and output:
tagged StringFilter {
  contains: String!
  lengthAtLeast: Int!
  lengthAtMost: Int!
}

# output only:
tagged Pet {
  cat: Cat!
  dog: Dog!
  colony: ColonyType!
}

# input only:
tagged PetInput {
  cat: CatInput!
  dog: DogInput!
  colony: ColonyType!
}

(Note a number of alternative syntaxes were mooted by the Input Unions working group; the one above was chosen to be the preferred syntax.)

If we queried a StringFilter with the following selection set:

{
  contains
  lengthAtLeast
  lengthAtMost
}

then this could yield one of the following objects:

  • { "contains": "Awesome" }
  • { "lengthAtLeast": 3 }
  • { "lengthAtMost": 42 }

Note that each of these objects specify exactly one key.

Similarly the above JSON objects would be valid input values for the StringFilter where it was used as an input.

Tagged vs Union for output

Tagged does not replace Union; there are things that Union can do that tagged cannot:

{
  myUnionField {
    ... on Node {
      id # If the concrete type returned by `myUnionField` implements
         # the `Node` interface, we can query `id`.
    }
  }
}

And things that Tagged can do that Union cannot:

tagged Filter {
  equalTo: Int!
  lessThan: Int!
  greaterThan: Int!
  isNull: Boolean!
}

Tagged allows for exploring the various polymorphic outputs without requiring fragments:

{
  pets {
    cat { name numberOfLives }
    dog { name breed }
    parrot { name favouritePhrase }
  }
}

When carefully designed and queried, the data output by a tagged output could also be usable as input to another (or the same, if it's suitable for both input and output) tagged input, giving polymorphic symmetry to your schema.

Nullability

Tagged is designed in the way that it is so that it may leverage the existing field logic relating to nullability and errors. In particular, if you had a schema such as:

type Query {
  pets: [Pet]
}

tagged Pet {
  cat: Cat
  dog: Dog
}

type Cat {
  id: ID!
  name: String!
  numberOfLives: Int
}

type Dog {
  id: ID!
  name: String!
  breed: String
}

and you issued the following query:

{
  pets {
    cat { id name numberOfLives }
    dog { id name breed }
  }
}

and for some reason the name field on Cat were to throw, the the result might come out as:

{
  "data": {
    "pets": [
      { "cat": null },
      { "dog": { "id": "BUSTER", "name": "Buster" } }
    ]
  },
  "errors": [{ ... }]
}

where we can tell an error occurred and the result would have been a Cat but something went wrong. This may potentially be useful, particularly for debugging, compared to returning "pets": null or "pets": [null, {"dog": {...}}]. It also makes implementation easier because it's the same algorithm as for object field return types.

FAQ

Can a tagged type be part of a union?

Not as currently specified.

Can a tagged type implement an interface?

No.

What does __typename return?

It returns the name of the tagged type. (This is a new behaviour, previously __typename would always return the name of an object type, but now we have two concrete composite output types.)

What happens if I don't request the relevant tagged member?

You'll receive an empty object. For example if you issue the selection set { cat } against the tagged type below, but the result is a dog, you'll receive {}.

tagged Animal {
  cat: Cat
  dog: Dog
}

How can I determine which field would have been returned without specifying all fields?

There is currently no way of finding out what the field should have been other than querying every field; however there's room to solve this later with an introspection field like __typename (e.g. __membername) should this show sufficient utility.

Open questions

  • Should we add isInputType / isOutputType to __Type for introspection? [Author opinion: separate RFC.]
  • Should we use TAGGED_INPUT and TAGGED_OUTPUT types separately, rather than sharing just one type? [Author opinion: no.]
  • Should we prevent field aliases? [Author opinion: no.]
  • What exactly should the input coercion rules be, particularly around variables being omitted, e.g. {a: $a, b: $b} [Author opinion: as currently specified.]

benjie avatar Jun 12 '20 16:06 benjie

Great job @benjie, this looks really solid overall.

spawnia avatar Jul 20 '20 07:07 spawnia

All but one piece of feedback addressed. Will ponder the final point. Thanks @spawnia :raised_hands:

Changing from WIP to ready for review, and adding it to the next GraphQL Spec WG :+1:

benjie avatar Jul 21 '20 16:07 benjie

Great work @benjie and everybody else involved. The two fundamental questions that come to my mind:

  • What is the impact of this new type on the underlying GraphQL structure? What I mean with that is: does is make static analysis harder (or easier)? See https://www.graphql.de/blog/static-query-analysis/ for some background. Does it make certain advanced topics like query rewriting harder or maybe even impossible? What is the impact on validation in terms of complexity?

  • What is the impact of schema design? Are we making it harder or easier to design schemas? What can we do create Pits of success?

These questions are relevant to all type systems changes I believe.

andimarek avatar Aug 06 '20 19:08 andimarek

@andimarek Excellent questions, I agree they're very relevant.

  • What is the impact of this new type on the underlying GraphQL structure? What I mean with that is: does is make static analysis harder (or easier)? See https://www.graphql.de/blog/static-query-analysis/ for some background. Does it make certain advanced topics like query rewriting harder or maybe even impossible? What is the impact on validation in terms of complexity?

I think from a fundamentals point of view, this shouldn't make static analysis or query rewriting any different. In most ways a Tagged type operates equivalently to an Object type and provides the same level of static knowledge of types.

Some cases we should want to stress test are the edge cases and corner cases of using fragment spreads along with the additional constraints Tagged types offer. That will inform how strict we need to be about these constraints.

For example, we have an implicit rule that a fragment spread should always be able to be inlined without changing meaning or validity. eg { ...Frag } fragment Frag on T { name } is equivalent to { name } (provided the type condition always applies). The area that deserves some careful consideration is aliased fields. For example, on a tagged type, is { alias1: name, alias2: name } legal? If not, then are fragment Frag1 on Tagged { alias1: name } and fragment Frag2 on Tagged { alias2: name } separately legal? And if so is { ...Frag1, ...Frag2 } legal on a tagged type Tagged since both fragment types apply?

Can you think of other scenarios which could pose problems for static analysis or query rewriting?

  • What is the impact of schema design? Are we making it harder or easier to design schemas? What can we do create Pits of success?

This is an excellent question because this moves counter to the good design principle that there should be one way to model something. That's not a principle explicitly listed but it's still a good one. Now when you consider a case that you have a field which can return one of many options, we're expanding your choices from 2 (interface, object) to 3 (+ tagged). There was already pushback during original design years ago from @schrockn that perhaps there should be only one (interface) where unions are modeled as interfaces which have no fields. There are other modeling tradeoffs there, but the benefit was ensuring only one way to do abstract types - potentially creating a pit of success.

Something critical we'll need to do is well document the tradeoffs for when you should consider using a Union vs Tagged type (or Interface), and in fact if we can be terse and clear about it that would be quite valuable to include within the spec itself so there's a canonical explanation of why multiple options exist.

Can you see other schema design pitfalls?

leebyron avatar Aug 06 '20 19:08 leebyron

Thanks for the review @leebyron!

I've :+1:'d the input I agree with without hesitation. Some others need discussion, or I need to think about more. I'm unlikely to have time in August to implement the suggestions, but hopefully will have more time in September - I welcome others' feedback in the mean time :+1:

For example, on a tagged type, is { alias1: name, alias2: name } legal

I currently propose that this is definitely legal, it's allowed in the RFC as is, I see no reason to forbid it, and it simplifes the concerns you raise :+1:

The main reason for this is that { name __typename } is valid, so already you may have more than one key returned.

If we chose not to allow { alias1: name, alias2: name } then it would be done, IMO, by forbidding aliases on tagged types, which also solves the {...Frag1, ...Frag2} issue.

benjie avatar Aug 06 '20 19:08 benjie

Thanks for putting this together!

My one concern here is in that we're allowing tagged to exist in the output type system as well; I think that might lead to some interesting and not entirely desirable downstream effects. There was discussion above about how to decide between tagged and union in designing a schema, but my worry that by allowing tagged to exist in both, we might create some tricky cases where it's not suitable for either.

For example, I can write an "input union", but then discover that my type got accidentally restricted by someone putting it in the output type set.

i.e. if I write an input union

tagged Filter {
  startsWith: String
  endsWith: String
}

and then come back two days later to add an optional input object parameter:

tagged Filter {
  startsWith: String
  endsWith: String
  range: Range
}

input Range {
  start: Int
  end: Int
}

I might not be able to, if someone decided to include Filter as a field in an object type in the meantime -- because that would mean Filter is now in both input and output, and hence can only use things in both schemas, and input objects can't be in the output schema. Similarly, if I write a tagged intending it for use in the output type system, but then it gets used somewhere else in the input type system, I can no longer add an object/interface/union field to that tagged.

That is to say: right now, there's no way for any type to "change" its input/output availability. Scalars and Enums are always "both"; input types are always "input only"; objects types, interfaces, and unions are always "output only". But tagged as proposed today can do so -- it can start out as "both", but then when someone adds an input to it, it becomes "input-only", or when someone adds an interface to it, it becomes "output-only". And that mutability can cascade through the schema, because if a type changes from "both" to "input-only", then every type that recursively references that type via fields also must become "input-only"... which might not be possible.

As I understand it, the impetus for this proposal was to enable input unions. If the solution for input unions naturally extended to the output types, then it might make sense to include that extension for cleanliness, and it seems like that was the intention here. In this case, though, I think adding output type system support might add complexity, and I don't immediately see the benefit that it yields since we already have unions in the output type system.

So I'd consider restricting tagged to just the input type system (i.e. both tagged and input can only contain fields of tagged, input, enum and scalar, and just as input cannot appear in the output type system, neither can tagged). That seems to solve the problem of input unions, and it also creates a nice parallelism between the relationship between type and union with input and tagged` -- the former in the output type system, the latter in the input type system.

dschafer avatar Aug 07 '20 01:08 dschafer

If merged, Tagged would be the first non-leaf type kind (i.e. not a Scalar, not an Enum) that could be valid in both input and output. It is also the first kind of type where types of that kind may have different input/output suitability.

I do think this is an interesting problem in general -- the desire to have a type that can be shared between input and output makes a lot of sense to me. My concern is more around trying to solve that problem at the same time that we're solving the "input unions" problem, when those feel like separate issues to me.

dschafer avatar Aug 07 '20 01:08 dschafer

@dschafer This is my biggest concern too, as discussed on the WG. However tagged brings “unions” of objects, unions, interfaces, scalars, enums and lists to the output schema, whereas GraphQLUnion only allows for objects, so I think it definitely has value there (see e.g. #215).

As mentioned on the WG, I think an alternative solution to your specific issue would be a schema validation rule stating that a specific tagged type may only be used in an input context or in an output context but not both. This validation rule could be dropped in future if we find value in using the same tagged in both contexts. That said, I trust users to make this decision for themselves and implement the restriction in their own tooling should they need it.

benjie avatar Aug 07 '20 06:08 benjie

The fact that this solves the "output union contains scalar" problem is indeed a nice feature, and makes the case for including these in the output type system. I sorta see three separate discussions here: should tagged be allowed in the input schema, should it be allowed in the output schema, and should it be allowed in both at the same time?

I'm broadly supportive of adding this to the input type system; it allows for input unions, and has very little downside that I see.

Adding this to the output type system I think has a different set of tradeoffs. It solves the scalar-in-union problem, but at the cost of introducing duplication between union and tagged; the Venn diagrams of "when to use both" doesn't overlap completely, but it overlaps a lot. I don't have a sense for how bad the "scalar-in-union" problem is (how bad is the "just make a wrapper object" workaround, for example?), and I'd have a bias towards simplicity which would have me default to "leave it out"... but I can certainly see the case here.

The ability to add this to "both type systems at once" is where I see significant trade offs. It feels to me like the upside is limited, mostly because there still isn't a base object type that can be shared between both systems. So the only time a tagged can actively be shared between input and output right now is if it contains only enums and scalars; if it contains an type then it's output only, and if it contains an input then it's input only. So adding the ability to use it in both systems feels like a pretty limited win.... but with the added complexity of having "transmutable" types now.

To be more concrete about my concerns about transmutability:

tagged T {
  a: Int
-  b: Int!
+  b: Int
}

Is that a breaking change? As far as I can tell, the answer is maybe. If T is used anywhere in the output schema, then it is a breaking change -- you can't change a non-nullable field to be nullable, because old clients don't handle null. If T is used only in the input schema, then it's fine; you can take an input field and make it optional without risk. This is certainly a tooling-complete problem... but it does add a substantial amount of complexity, and the capability that we're making that tradeoff to get -- the ability to share a tagged that only has scalar and enum fields between input and output -- feels limited.

dschafer avatar Aug 07 '20 15:08 dschafer

I don't normally weigh in but @leebyron invoked my name!

This is a really interesting proposal. First of all, I think it is a very elegant way to support input unions. Awesome work.

Like @dschafer I'm concerned about the input/output type system commingling, as well as the "there are now two ways to do this" issue in the output type system.

I think the core issue is what dan terms "transmutability". Most obviously the output system allows for parameters whereas the input system does not. More subtly, the output system has covariant types, and the input system has contravariant types. An output move on the server from Int! to Int is safe w/r/t to the client. The converse is safe in the input side on server: from Int to Int!. If covariance and contravariance were ever more deeply expressed in interfaces etc, the interaction with tagged could become pernicious and limit the future design space.

schrockn avatar Aug 07 '20 16:08 schrockn

great points @dschafer @schrockn.

@benjie I would recommend to just focus on the initial goal to provide an input union equivalent.

Even if we overcome (or just deicide on) the issues regarding more complicated validation or the fragments question raised by @leebyron I still fail to see the convincing value of having another output type.

From a practical schema design POV I can attest that there is already a struggle to decide between union or interfaces. Having a third option will make this even harder.

andimarek avatar Aug 07 '20 23:08 andimarek

I want to highlight that even if we decide to scope this down to the input case only, the exercise of detailing out how such a type would support both environments is still incredibly valuable. It's really helped us be more thoughtful about the various parts of the spec that it comes in contact with. It also ensures we retain option value for future where more types can be shared between the input and output domain.

Dan and Nick make great points, let's make sure that feedback is incorporated.

One wild idea to explore to address these "transmutability" concerns is to be explicit about whether a type is intended to be used in the input, output, or both domains, rather than this be implicit based on crawling the member types. I don't have any decent grammar or terminology suggestions for you, but I can imagine something like an "input tagged", "output tagged", or "shared tagged" declared at the type definition making this sort of thing clear and avoiding the type designers hands being tied on evolution based on a type consumer using the type in a way it was not intended for.

leebyron avatar Aug 10 '20 21:08 leebyron

@benjie @leebyron Looking at the input union RFC this solution is based on on the oneOf directive idea if I am not mistaken.

Can you explain why you decided to go a step further and introduce a whole new type for input and output instead of the more simpler oneOf directive?

I see the one failed criteria for the oneOf directive is Input polymorphism matches output polymorphism. Was that the reason?

thanks

andimarek avatar Aug 10 '20 23:08 andimarek

I've edited the spec to address a lot of this feedback, but there's still a lot left to address. Leave it with me, I'll be back to it in a few weeks :+1:

benjie avatar Sep 02 '20 10:09 benjie

Hi.

Just my 2 cents on the "should this be for input only, output only, mixed" debate, based purely on my experience with generating and using GraphQL schemas:

We don't really care about the mixed (shared) case :). Having taggedInput (unionInput) and taggedType would be enough. Technically speaking, taggedType we don't even need necessarily, since we already have union :)

Due to the fact nobody in their right mind will actually be generating the schema manually (there's just so many tools for so many languages that will do this for your, when all you need is some kind of plain old Java (whatever) object or dataclass or ... -- I don't think that that there is actually a real big need to try to cover these scenarios.

We already are used to having a separate input and output types. (We give them different names, because that allows us to change them individually -- one example would be something like a server generated default -- in the input you don't have to provide it at all, but it will always be part of the output.)

The thing that even lead me here is really the missing union (or union-like) type for input. We can very nicely utilize multiple different types for different kinds of objects and even states (some states allow different mutations than others) when it comes to output, but we cannot easily do this for the input, while keeping it somewhat type safe.

The "best" (questionably) seems to be these builder-like approaches:

mutation {
  animal {
    create {
      aDog(...) {
        __typename # will be Dog (nothing else)
      }
      aCat(...) {
        __typename # will be Cat (nothing else)
      }
    }
  }
}

(If we have a special kind of dog or cat, we add one more level to the tree), but it's always the leaves that do stuff. We initially had the "shared" information about an animal in the first call, but turns out, it's actually easier on the client to just feed it all at once to the leaf. And honestly, your domain model grows, and sometimes even the shared stuff gets slightly different meaning over time, so acting as if nothing is really shared makes it easier long term.)

A similar approach that we've used for when you have a collection in which you want to create something and the collection dictates the something(s) you can create:

mutation {
  file {
    create {
      onServer(...) { # you provide a server identifier, the thing fetches what file system paths the server supports
        ... on CreateFileOnPosixServerMutationsType {
          atPath(path: PosixPathInput) {__typename # gives you back FileOnPosixServerType}
        }
        ... on CreateFileOnWindowsServerMutationsType {
          atPath(path: WindowsPathInput) {__typename # gives you back FileOnWindowsServerType}
        }
      }
    }
  }
}

TLDR: If corners need to be cut, please consider cutting all but taggedInput.

BTW: Does anyone know if there's any actual work done out there on deprecating input fields? :)

Cheers

petr-motejlek avatar Sep 20 '20 11:09 petr-motejlek

Thanks for the feedback @eapache :raised_hands:

benjie avatar Sep 25 '20 08:09 benjie

Grab yourself a cup of tea; this is a long one.

I've thanked a few of you for your input already, but I wanted to especially thank @eapache, @leebyron and @dschafer for their extensive and well-reasoned input. I've done my best to factor this in to the feedback below.

Taggeds valid for both input and output

my worry that by allowing tagged to exist in both, we might create some tricky cases where it's not suitable for either. -- @dschafer https://github.com/graphql/graphql-spec/pull/733#issuecomment-670268694

Dan notes that if you create an input-output tagged and use it in output, and then someone else uses it in input, now it's forced to be input-output forevermore and you can no longer add output-only fields to it.

the ability to use [tagged types] in both [input and output] feels like a pretty limited win.... but with the added complexity of having "transmutable" types now. [Is the following] a breaking change? As far as I can tell, the answer is maybe.

tagged T {
  a: Int
-  b: Int!
+  b: Int
}

-- @dschafer, edited by @benjie, original comment: https://github.com/graphql/graphql-spec/pull/733#issuecomment-670585811

Here Dan notes that having an in/out tagged makes it hard to spot what changes are breaking, changes to the type are no longer "local" and require knowledge of the entire type system to determine if they are breaking or not.

Others have also made similar points.

The general consensus seems to be that we can solve this by marking tagged as input/output at design time so we can never get into this situation. Most everyone further seems to agree that there's little need for an input-output tagged that can only contain scalars, enums, and other input-output taggeds; and I agree. I'm happy to edit the spec to reflect this (though I'd like to discuss the specific SDL syntax for this) and it ties in nicely with my isOutputType / isInputType introspection additions proposal.

One wild idea to explore to address these "transmutability" concerns is to be explicit about whether a type is intended to be used in the input, output, or both domains, rather than this be implicit based on crawling the member types. -- @leebyron https://github.com/graphql/graphql-spec/pull/733#issuecomment-671597237

Solution:

tagged input Foo {
  str: String!
  int: Int!
}
tagged output Foo {
  str: String!
  int: Int!
}
# No "shared" tagged, at least not currently; we could always add one later.

(I'm quite tempted to use tagged type rather than tagged output to align with the current input/type keywords.)

Why include tagged output?

It seems (not unexpectedly) that the most controversial part of this RFC is whether tagged types should be allowed in the output schema. To help address this, I thought it would be worthwhile outlining a few reasons why we might want to include tagged in the output schema.

Scalar "unions"

@stubailo proposed allowing unions of Int/String in #215 and it has over a hundred :+1:s and a decent number of replies. Tagged outputs can be used to solve the issue of having a field return one of many different scalar types, and in fact can solve most of the other issues raised in that thread.

This "union of scalars" power is really just a subset of:

"Unions" of all output types (including wrapper types)

tagged brings “unions” of objects, unions, interfaces, scalars, enums and lists to the output schema, whereas GraphQLUnion only allows for objects -- @benjie https://github.com/graphql/graphql-spec/pull/733#issuecomment-670357999

The fact that this solves the "output union contains scalar" problem is indeed a nice feature, and makes the case for including these in the output type system. @dschafer https://github.com/graphql/graphql-spec/pull/733#issuecomment-670585811

Whereas the union type can only include object types as members, tagged outputs support:

  • object types
  • union types
  • interface types
  • tagged output types
  • scalars
  • enums
  • lists (or other wrapper types) of any of the above

The ability of tagged outputs to bring this flexibility to the output schema is particularly relevant to polymorphic data sources such as MongoDB, ElasticSearch, Apache Avro, anything based on JSON, and more where the same field could return different data types for different records. This kind of polymorphic interface is even seen in popular REST APIs such as Stripe's.

Here's a few of the issues related to this desire that I found:

  • https://github.com/graphql/graphql-spec/issues/215
  • https://github.com/graphql/graphql-spec/issues/489
  • https://github.com/graphql/graphql-spec/issues/202 (relates to input conditions, but in query-builder (report generation) style applications, input conditions are often also expressed via outputs to allow display/modification)

Union could now contain scalars/enums via Tagged

NOTE: this feature does not exist in the RFC as currently written, but we'd get this if we allowed a tagged output to be a member of a Union (I don't see any barriers to this).

Union types currently only allow unions of object types, but we could change this to allow them to contain tagged types too; this would allow for scalars to be part of unions.

The below schema comes from my comment here:

type City {
  id: ID!
  name: String!
  coordinates: Coordinates!
}

type Town {
  id: ID!
  name: String!
  coordinates: Coordinates!
}

type POI {
  id: ID!
  name: String!
  coordinates: Coordinates!
  nearestCity: City!
}

type Coordinates {
  latitude: Float!
  longitude: Float!
}

tagged LocationSpecifier {
  locationName: String!
  locationCoordinates: Coordinates!
}

union Place = City | Town | POI | LocationSpecifier;

type Query {
  myFavoritePlace: Place
}

In this schema, your favourite place could be a known city, town, point of interest, a custom "named place", or a custom position on the globe.

Symmetry

the desire to have a type that can be shared between input and output makes a lot of sense to me -- @dschafer https://github.com/graphql/graphql-spec/pull/733#issuecomment-670272174

It's important to note that I am not proposing that the same polymorphic type can be used for both input and output, but that you can write a pair of input/output taggeds that can be symmetric in a similar way that you can make symmetric type/input types. For example:

type User {
  id: ID!
  name: String!
  email: String
  friends: [User!]
}

input UserInput {
  id: ID!
  name: String!
  email: String
}

# Imagine there's similar to User/UserInput for Organization/OrganizationInput

tagged input EntityInput {
  user: UserInput!
  organization: OrganizationInput!
}

tagged output EntityOutput {
  user: User!
  organization: Organization!
}

With this schema, you could write a query against an EntityOutput, the output of which you could feed straight back into an EntityInput with no manipulation required; e.g.

fragment FetchEntity on Entity {
  user {
    id
    name
    email
  }
  organization {
    # ...
  }
}

This is not a perfect solution though; there are caveats:

  • EITHER the query must not fetch fields that are not valid on the input, OR the client must filter these out. For example User.friends has no equivalent in UserInput, and __typename is not valid on input.
  • When evolving the schema by adding additional member fields to the two taggeds, it's possible the fetched object for an existing operation may result in {} which is no longer valid input to EntityInput

When would you use a Union vs Tagged Type vs Interface?

if we can be terse and clear about it that would be quite valuable to include within the spec itself so there's a canonical explanation of why multiple options exist. -- @leebyron https://github.com/graphql/graphql-spec/pull/733#issuecomment-670153936

at the cost of introducing duplication between union and tagged; the Venn diagrams of "when to use both" doesn't overlap completely, but it overlaps a lot -- @dschafer https://github.com/graphql/graphql-spec/pull/733#issuecomment-670585811

I'm concerned about the input/output type system commingling, as well as the "there are now two ways to do this" issue in the output type system. [...] the core issue is what dan terms "transmutability" [...] the output system has covariant types, and the input system has contravariant types. -- @schrockn https://github.com/graphql/graphql-spec/pull/733#issuecomment-670598888

there is already a struggle to decide between union or interfaces. Having a third option will make this even harder. -- @andimarek https://github.com/graphql/graphql-spec/pull/733#issuecomment-670779388

The rule of thumb for using interface remain unchanged. I'm somewhat sleep deprived right now, but I'd say the rule is something like: if a field could resolve to one of a set of object types, and it would be helpful for your clients to have access to certain common fields on these types independent of what underlying type they were, use an interface. (Example: searching products in a supermarket, common useful fields would be name, description and price.)

So ignoring interfaces, now comes the time to choose between unions and tagged types.

~~Just use tagged types, they're more powerful and we can deprecate unions.~~ 😜

Unions have three main advantages:

  1. They don't involve wrapper objects, and are thus "flatter"
  2. You can tell what the type would have been (via __typename) even if you didn't have a fragment on that type
  3. You can spread an interface fragment over them, e.g. ... on Node { id }, if any of the members implement that interface

Tagged outputs have advantages too:

  1. They can represent all output types, not just object types
  2. They can be symmetric with tagged inputs
  3. Client types are arguably simpler - rather than being one of many types, differentiated by a __typename entry, they are just properties that either exist or don't
  4. Rather than implementing ResolveAbstractType (aka resolveType/isTypeOf in GraphQL.js), the field indicates concretely the returned type via wrapper object key
  5. They're easier for GraphQL newbies to query: no need for fragments, just normal selection sets and auto-complete:
{
  pets {    # a tagged output
    cat {
      name
      numberOfLives
    }
    dog {
      name
      breed
    }
    parrot {
      name
      favouritePhrase
    }
  }
}

I'd follow a decision tree like this (assuming polymorphism was required):

  • Will your field resolve to one of a set of object types, and would it be helpful for your clients to have access to certain common fields on these types independent of what underlying type they were? If yes, use interface.
  • Along with object types, might your field ever want to return a scalar, enum, or list type? Use tagged output.
  • Would you like your polymorphic output type to match a polymorphic input type? Use tagged output.
  • Are you exposing your GraphQL schema directly to less-technical users who might find the fragment syntax confusing? Consider using tagged output.
  • Otherwise, maybe use union?

Why a whole new type?

this solution is based on on the oneOf directive idea if I am not mistaken. Can you explain why you decided to go a step further and introduce a whole new type for input and output instead of the more simpler oneOf directive? I see the one failed criteria for the oneOf directive is Input polymorphism matches output polymorphism. Was that the reason? @andimarek https://github.com/graphql/graphql-spec/pull/733#issuecomment-671632036

I'm still not 100% sold on this myself, the implementation of a @oneOf directive would be much simpler both in terms of spec edits and in terms of adding support to GraphQL implementations. That said, the behaviour of tagged is quite different to that of the existing types so using a new type for them allows us to keep the option value - we don't need to factor in @oneOf when adding features to type/input and vice-versa.

Differences between type Foo @oneOf vs type Foo:

  • Fields may not have arguments
  • The shape of the result differs without the client stating such (e.g. without using @skip/@include)
  • Foo may not be a member of an interface
  • It's likely that a different set of directives would make sense on @oneOf members vs regular types.

Differences between input FooInput @oneOf vs input FooInput:

  • Even if the fields are non-nullable, exactly one must be specified
  • Even if the fields are all nullable, exactly one must be specified
  • It's likely that a different set of directives would make sense on @oneOf members vs regular inputs.

It was also felt that a dedicated type makes the functionality seem more attractive, rather than a "hack". Admittedly this could be addressed by using an alternative syntax, e.g. oneOf Foo/oneOfInput FooInput instead of type Foo @oneOf/input FooInput @oneOf.

Another concern was that of legacy clients: if their introspection queries were not requesting the requiresExactlyOneField property then they would not know about the new functionality affecting these types.

I feel like I've bungled the answer on this one; @leebyron was much more convincing, perhaps he could lay out the reasoning he used on me!

benjie avatar Sep 30 '20 19:09 benjie

I took the time to read this really long and interesting thread and I would like to comment that i feel that one of the main advantages of graphql is its declarative query language. I feel that with tagged types that can return only on the n requested fields we might lose some of the predictability of the response. One could argue that there are already querys and mutations with this behavior when we use fragments on interfaces, or when we ask for more details about a union member. But i feel like those cases are explicit with the fragment syntax. With tagged types you could get only one field from the three requested, with no hint to that behavior on the query itself.

Just my 2 cents

Ambro17 avatar Nov 01 '20 16:11 Ambro17

@Ambro17 What tools/clients do you feel would suffer from these missing keys on the result? Would you agree that this only impacts tools/clients that don't perform introspection (since introspection explicitly states which fields return tagged types) and legacy clients that don't support tagged types?

benjie avatar Nov 28 '20 10:11 benjie

Hello @benjie, I'm not really familiar with how fast spec changes reach popular graphql tools. The one that I use the most is graphiql so if these changes can suggest a way to quickly identify when we are dealing with a tagged output type perhaps is enough. My fear is that many popular tools with little maintenance will fall behind and not implement any hint towards why not all keys are present in the response. On the other hand it isn't hard to figure out why once you detect the unusual behavior. I wonder if we could include any hint of the tagged output within the query itself without messing too much with the grammar. Do you think that would be possible?

Ambro17 avatar Nov 28 '20 17:11 Ambro17

I don’t really feel it’s necessary; for a client such as GraphiQL to present those fields as options for autocomplete it would need to know they exist, which implies it knows about tagged types. Further GraphiQL just (I think) outputs the result JSON and doesn’t care about its shape too much. Older tooling would not understand the new features but presumably it also wouldn’t use the new features so shouldn’t be affected; this is the additive nature of GraphQL expansion. I still struggle to see a concrete example of how expanding GraphQL in this way would be problematic.

benjie avatar Nov 30 '20 09:11 benjie

oh no... seems the progress stalled here -- no discussions / merges / etc...

@Ambro17 I think @benjie says we need an example schema/query/etc -- something might be obvious to you, but others might need more stuff to understand.

btw, do we have minimum-graphql-version required inside graphql schema?
something like this graphql schema requires tools that understand graphql version >= 1.11...
( it'll probably be 'feature-set based' -- eg. using unions? then minimum-graphql-version-required == 1.11 / etc)

With this set-up, if a tool detects it cannot 'understand' a graphql schema due to newer graphql version required, it can probably point to a 'update-required' page (or 'sorry we're working on it' page)

devdoomari3 avatar Jan 14 '21 09:01 devdoomari3

Hey, @benjie I am wondering if this PR puts the other union aproaches off the table or not?

Tagged does not replace Union

I am trying to model FHIR ressources in GraphQL. See e.g. https://www.hl7.org/fhir/bundle.html. Bundle.entry.resource can be any type that extends Resource.

For my use case, the second enum approach mentioned in https://github.com/graphql/graphql-spec/blob/master/rfcs/InputUnion.md#-2-explicit-configurable-discriminator-field would work quite well as all FHIR ressources (in JSON representation) have a mandatory resourceType field (https://www.hl7.org/fhir/json.html). An interface approach would work similar well.

I think with tagged types, however, I think I would be required to alter the structure of my types. (I want the GraphQL Schema to exactly mirror the FHIR definitions.)

lschmierer avatar Jan 14 '21 11:01 lschmierer

@devdoomari3

oh no... seems the progress stalled here -- no discussions / merges / etc...

No, we've got an upcoming Input Unions WG to discuss it (and other things) on 21st. A comment on etiquette - comments like this come off as rather rude and unappreciative of open source developer's time and effort, I advise that you restrain yourself from making such comments in open source in future.

btw, do we have minimum-graphql-version required inside graphql schema?

No, GraphQL tends to avoid versioning. Clients that don't support particular features should ignore them.

benjie avatar Jan 15 '21 10:01 benjie

@lschmierer

I've worked a little with FHIR and I understand that desire. I think it's likely we'll only merge one of these strategies, see the guiding principles. Currently Tagged type is "in the lead" but evidence that one of the other strategies is superior is definitely welcome. Preferably ASAP, we've a meeting where we'll be discussing the solutions on 21st Jan. If you could write a convincing argument about why solution 2 is the best (not just for your use case, but in general) before then that'd be great.

benjie avatar Jan 15 '21 10:01 benjie

@benjie sorry if that came up as rude.
it't just that I can't wait for this to be merged... and thought we're all waiting for some reply from @Ambro17
(I mean, it's somewhat rude to add to the discussion if it's someone else's turn to reply, and @Ambro17 might have some examples/reasons that are obvious only to themself, but forgot to answer for some reason)

No, we've got an upcoming Input Unions WG to discuss it (and other things) on 21st

I just couldn't know about that by just reading discussions in this pull-request... it might be due to github hiding discussions (just noted there's "expand" section)

No, GraphQL tends to avoid versioning. Clients that don't support particular features should ignore them.

ok... I guess that's something I didn't know. If we embed 'list of feature-set used' in graphql-schemas, we can have tools that detect whether they support those feature-set or not. (and probably encourage tools to implement plugins for stage 1~2 proposals to gain real-world examples/lessons)

devdoomari3 avatar Jan 15 '21 17:01 devdoomari3

The idea is that introspection indicates what features are available, but we do indeed have some complexity on that front that we’re solving little by little as we get to it. Part of the solution may be two-phase introspection: step 1 to ask enough questions that are compatible with the original GraphQL spec to determine what things the schema supports, then step 2 builds an introspection query based on these detected features. I think work on this is still ongoing. We’re generally not keen to add version numbers, but indicating features is important. For the tagged type, the fact a schema has tagged types can be determined from an introspection query that’s valid against the first official GraphQL spec so it’s not an issue for this PR.

A lot of activity takes place in GraphQL working groups, so it’s worth checking the agendas and the notes in the graphql-wg repo (or, if you want the in depth run down, you can watch the recordings on YouTube; linked from the notes).

benjie avatar Jan 16 '21 23:01 benjie

Proposing a simpler solution: https://github.com/graphql/graphql-spec/pull/825

benjie avatar Feb 19 '21 17:02 benjie

Considering how ubiquitous and well-understood { ..., _tag: "MyTag" } is, I don't see why there is currently a proposal using directives. Tagged unions should complement the existing __typename mechanism. I realise that there has been a lengthy discussion and iteration through several issues and proposals, but this still feels like the simplest and most general approach.

jamiehodge avatar Sep 14 '21 08:09 jamiehodge

I don't see why there is currently a proposal using directives

There's plenty of alternative syntaxes than directives that we can use the represent the exact same thing; which of those would you prefer and would it overcome your "directives" concern?

https://gist.github.com/benjie/5e7324c64f42dd818b9c3ac2a91b6b12

Tagged unions should complement the existing __typename mechanism.

Please note that both this PR #733 and the oneof PR #825 would involve you sending the same data to the server: {"MyTag": ...}.

It sounds like your comment is in favour of solution 1 in the InputUnions proposal; one of the main issues that we found with this was that it would cause complexity around when to/not to include __typename. E.g. adding __typename to your existing input objects would throw errors in existing GraphQL implementations, so you'd have to only send it when the field is an input union; but doing so means that you need to have two representations for your type - e.g. User when used in a regular input, and UserWithTypename when used in a union-expecting interface. This would be generally unpleasant to work with.

benjie avatar Sep 14 '21 08:09 benjie