graphql-spec
Proposal: Support union scalar types
It makes sense often to have a field that could return, for example, an Int or a String. It would be convenient to be able to use a resolveType-like mechanism to determine how the value is serialized. This would be even more useful for custom scalars.
The specification currently says: "GraphQL Unions represent an object that could be one of a list of GraphQL Object types", which basically means that unions are the same as interfaces, but without any shared fields. Supporting unions of scalars would further differentiate unions from interfaces.
Could you share a couple of concrete use cases? I ask because, personally, I'm usually reading values right out of MySQL (or running those values through transformations with predictably-typed output), so I haven't needed this yet!
@rmosolgo,
Here is a concrete use case that we've come across at my team. I have a couple more if you need them.
We've got an algorithm that produces results based on location. We have an enum AreaOfTown that contains all possible areas of town to filter the results. Here's the catch, the algorithm can also filter locations by "near me". Because we can't simply make a union that is AreaOfTown | "NEAR_ME" for the input type, then we have to create a completely new enum that contains all of the values in AreaOfTown and "NEAR_ME". This is undesirable because if we ever change the AreaOfTown enum by adding more (this will 100% happen in the future), then we have to make sure we also update our other enum.
If we could write scalar unions then it would solve this problem for us, among others (we would like to have constructs like Int | [Int!] or Int | String like @stubailo mentioned).
In our case, it's because we were returning a structure that represented a JSON tree, where keys can be integers (in the case of arrays) or strings (in the case of objects). We ended up compromising for now to just use strings, but I can imagine how this would be extra useful in the case where you want to return one of a set of custom scalars, which might have different serialization logic.
Here are other relevant discussions:
- graphql/graphql-js#207
- graphql/graphql-js#291
- #202
@leebyron made some good points in his comments here.
Allowing union types to contain primitives allows us to model any kind of source that can be polymorphic in its source data. For example, Avro data or Elasticsearch.
This is a severe limitation on exposing existing data, rather than creating new systems that are defined by GraphQL from the ground up.
What's still unclear to me is what a valid query would look like which encounters such a union, and how should well-typed clients determine what to expect from such a query?
Consider:
type MyType {
  field: String
}

union MyUnion = String | MyType

type Query {
  example: MyUnion
}
Like with any union - it has to be prepared for any item in the union. I don't see that this poses any special problems.
For the above schema consider the query:
{ example }
It's clear what this query should return if the example field returns a String at runtime, but what should it return if the value was a MyType?
Similarly:
{
  example {
    ... on MyType {
      field
    }
  }
}
It's clear what this would return for MyType, but what should it return for a String?
The exact same thing as querying a value that doesn't exist on a member of a union.
It's not clear to me what that means. What would a valid query look like that would clearly give a value for a primitive type or an object type?
Specifically http://facebook.github.io/graphql/#sec-Leaf-Field-Selections is what I'm referring to as "valid query"
I think the original issue is about having unions where every member is a scalar, right? Not unions between scalars and objects?
Just an idea: maybe it would make sense to introduce special inline fragment semantics for scalars:
{
  example {
    ... on String {
      value
    }
    ... on MyType {
      field
    }
  }
}
This can be generalized by allowing scalars to be used as leaves (current semantics) and as objects with a single field (value). A huge advantage of this approach is that one can use __typename introspection on scalars.
In our API we have a similar issue. For some parts of the schema, we have even introduced object types like StringAttribute, IntAttribute, etc. in order to unify object attributes ("attributes" are a part of user-defined data structures which our service supports) and scalar attributes and provide the same introspection capabilities for both.
What if, when the result is a scalar, execution simply ignored the selection set, and when it is an object type, it applied the selection set? Statically, the query can still be analyzed because the schema has enough information to know whether the union contains a scalar, an object, or both.
If there is no selection set and the value is an object then the returned value can just be {}. If there is a selection set and the value is a scalar then the returned value can just be the scalar value (ignoring the selection set, just like it is done for object type unions).
On the client most type systems allow unions of arbitrary types, so the client can expect a result that is string | { name: string } in Flow, for example.
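The rule proposed above can be sketched in plain JavaScript. This is only an illustration of the proposal, not how graphql-js behaves today: completeMixedUnion is a made-up name, and the selection set is reduced to a flat list of field names as a simplifying assumption.

```javascript
// Hypothetical sketch of the proposed rule for a union mixing scalars and objects.
// `selection` is either null (no selection set) or an array of field names.
function completeMixedUnion(value, selection) {
  const isObject = value !== null && typeof value === 'object';
  if (!isObject) {
    // Scalar result: return it as-is and ignore any selection set.
    return value;
  }
  if (selection === null) {
    // Object result without a selection set: nothing was requested.
    return {};
  }
  // Object result with a selection set: keep only the requested fields.
  const picked = {};
  for (const name of selection) {
    picked[name] = value[name];
  }
  return picked;
}

console.log(completeMixedUnion('hello', ['name']));
console.log(completeMixedUnion({ name: 'Ann', age: 3 }, ['name']));
console.log(completeMixedUnion({ name: 'Ann', age: 3 }, null));
```

The client then has to branch on the runtime shape of the result, which is what the string | { name: string } type expresses.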
Nevertheless, this starts getting into design choices around the simplicity of GraphQL, on which I think @leebyron made great points here: https://github.com/graphql/graphql-js/issues/207#issuecomment-243638767
Mixing scalars and object types would require a change to the schema which is too complex to consider. Schemaless GraphQL clients depend on the guarantee that { field } is scalar and { field { a b c } } is composite. Given that a scalar field could be an object, this is the only reliable way to tell a scalar apart from a composite value without looking at the schema. By breaking that guarantee you introduce a lot of unnecessary complexity into the GraphQL client when you could easily just add a composite StringBox type or similar:
type StringBox {
  value: String
}
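To make the schemaless guarantee concrete, here is a minimal sketch of how a client could classify fields from the query alone. The selection shape used here (a name plus an optional selections array) is a simplified stand-in for a parsed query, and classifyFields is a hypothetical helper, not part of any GraphQL client.

```javascript
// Simplified stand-in for a parsed query: each field has a `name` and an
// optional array of sub-`selections`. These shapes are assumptions.
function classifyFields(selections, path = []) {
  const result = {};
  for (const field of selections) {
    const fieldPath = [...path, field.name];
    const key = fieldPath.join('.');
    if (field.selections && field.selections.length > 0) {
      // A field with a selection set is always a composite value.
      result[key] = 'composite';
      Object.assign(result, classifyFields(field.selections, fieldPath));
    } else {
      // A field without a selection set is always a leaf (scalar or enum).
      result[key] = 'leaf';
    }
  }
  return result;
}

// Corresponds to the query: { user { id name } version }
const query = [
  { name: 'user', selections: [{ name: 'id' }, { name: 'name' }] },
  { name: 'version' },
];
console.log(classifyFields(query));
```

A union that mixes scalars and objects would make the 'leaf' and 'composite' labels unreliable, since the same field could be either at runtime.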
@calebmer, agreed! It's a balance between feature-set and simplicity and I think the GraphQL designers wanted to err on the side of simplicity. Please correct me if I'm wrong @leebyron.
Given this, I think that the original discussion of scalar-scalar (or more specifically, leaf nodes) unions is what we should perhaps consider (as opposed to object-scalar unions).
It seems like the Box approach could solve most of these issues without any changes to the spec, right?
# These two fields are mutually exclusive:
type AreaOfTownBox {
  stringValue: String
  enumValue: AreaOfTown
}

# or
type IntOrStringBox {
  stringValue: String
  intValue: Int
}
It pushes the complexity to the schema but keeps GraphQL simple and predictable. Are there downsides to that approach other than verbosity?
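As a rough sketch of what the Box approach asks of the server and the client, assuming the IntOrStringBox type above: toIntOrStringBox and fromIntOrStringBox are hypothetical helper names, and the "exactly one field is non-null" convention is enforced only by this code, not by the schema.

```javascript
// Server side: wrap a raw value so that exactly one box field is populated.
function toIntOrStringBox(value) {
  if (Number.isInteger(value)) {
    return { intValue: value, stringValue: null };
  }
  if (typeof value === 'string') {
    return { intValue: null, stringValue: value };
  }
  throw new TypeError(`Expected an integer or a string, got ${typeof value}`);
}

// Client side: the caller has to check at runtime which field came back.
function fromIntOrStringBox(box) {
  return box.intValue !== null ? box.intValue : box.stringValue;
}

console.log(fromIntOrStringBox(toIntOrStringBox(7)));
console.log(fromIntOrStringBox(toIntOrStringBox('seven')));
```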
@rmosolgo, you are right that it is possible but it comes at the cost of added complexity in the schema, verbosity in the queries, and having to check each value in the client side at runtime to determine which field was returned.
Another approach inspired by @rmosolgo’s recommendation would be to have a type system that looks like:
union IntOrString = IntBox | StringBox

type IntBox {
  value: Int
}

type StringBox {
  value: String
}
With a query that looks like:
fragment intOrString on IntOrString {
  __typename
  ... on IntBox { intValue: value }
  ... on StringBox { stringValue: value }
}
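On the client, a result of this fragment could be unwrapped roughly as follows. This is a sketch that assumes the response carries __typename plus the aliased intValue or stringValue field from the fragment above; unboxIntOrString is a hypothetical name.

```javascript
// Unwrap one node returned for the intOrString fragment.
function unboxIntOrString(node) {
  switch (node.__typename) {
    case 'IntBox':
      return node.intValue;
    case 'StringBox':
      return node.stringValue;
    default:
      // A union member added later is surfaced explicitly instead of being
      // silently treated as one of the known types.
      throw new Error(`Unhandled union member: ${node.__typename}`);
  }
}

console.log(unboxIntOrString({ __typename: 'IntBox', intValue: 42 }));
console.log(unboxIntOrString({ __typename: 'StringBox', stringValue: 'forty-two' }));
```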
Because we can't simply make a union that is AreaOfTown | "NEAR_ME" for the input type, then we have to create a completely new enum that contains all of the values in AreaOfTown and "NEAR_ME". This is undesirable because if we ever change the AreaOfTown enum by adding more (this will 100% happen in the future), then we have to make sure we also update our other enum.
@migueloller This problem can be solved directly in your code without any change to GraphQL.
When you define your types in source code, just create a small utility function like extendEnumValues in the example below:
// AreaOfTown is the existing GraphQLEnumType; extendEnumValues is a helper you write yourself.
new GraphQLEnumType({
  name: 'NewName',
  values: extendEnumValues(AreaOfTown.getValues(), ['NEAR_ME']),
});
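One possible shape for such a helper is sketched below. extendEnumValues is not part of graphql-js; this version assumes that getValues() yields objects carrying name, value, description, and deprecationReason, and that the values config is a map keyed by enum value name. A plain array stands in for AreaOfTown.getValues() so the snippet runs on its own.

```javascript
// Hypothetical helper: copy existing enum values into a `values` config map
// and append extra names whose value equals their name.
function extendEnumValues(existingValues, extraNames) {
  const values = {};
  for (const { name, value, description, deprecationReason } of existingValues) {
    values[name] = { value, description, deprecationReason };
  }
  for (const name of extraNames) {
    if (Object.prototype.hasOwnProperty.call(values, name)) {
      throw new Error(`Duplicate enum value: ${name}`);
    }
    values[name] = { value: name };
  }
  return values;
}

// Stand-in for AreaOfTown.getValues(); the area names are made up.
const areaOfTownValues = [
  { name: 'DOWNTOWN', value: 'DOWNTOWN' },
  { name: 'UPTOWN', value: 'UPTOWN' },
];
console.log(Object.keys(extendEnumValues(areaOfTownValues, ['NEAR_ME'])));
```

Because the new enum is derived from the old one, adding a value to AreaOfTown later needs no second edit.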
As for IDL, I think it would be great to have support for it there, for example allowing extend to be used on enums. But this is a topic for a separate issue/PR.
Allowing union types to contain primitives allows us to model any kind of source that can be polymorphic in its source data. For example, Avro data or Elasticsearch. This is a severe limitation on exposing existing data, rather than creating new systems that are defined by GraphQL from the ground up.
@marcintustin There are a number of common JSON patterns that you can't express in GraphQL: key-value pairs (aka Maps), Tuples, etc. I believe it is a bad idea to make GraphQL a superset of all popular formats/protocols.
Given this, I think that the original discussion of scalar-scalar (or more specifically, leaf-nodes) unions is what we should perhaps consider (as opposed to object-scalar unions).
@migueloller There are many cases when an API client can't detect the type based on the value returned.
For example, a union of enums having the same value. It is pretty common for enums to have common values like None, Undefined, etc.
Another example is a union of custom scalars with the same base type like AbsoluteUrl|RelativeUrl.
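A tiny sketch of the ambiguity, using two made-up enums that both contain NONE (SortOrder, Highlight, and candidateTypes are all hypothetical names):

```javascript
// Two hypothetical enums that share the value NONE.
const enumMembers = {
  SortOrder: ['NONE', 'ASC', 'DESC'],
  Highlight: ['NONE', 'BOLD'],
};

// Returns every enum type the serialized value could have come from.
function candidateTypes(serialized) {
  return Object.keys(enumMembers).filter((typeName) =>
    enumMembers[typeName].includes(serialized)
  );
}

console.log(candidateTypes('ASC'));  // one candidate: the type is recoverable
console.log(candidateTypes('NONE')); // two candidates: the type is ambiguous
```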
Also, an API client should explicitly specify the types it can handle using the ... on TypeName { construct. Without it, if you change Number|Boolean to Number|Boolean|String, it will break clients that don't expect the value to be a string.
That means you can't serialize a union of scalar types as a plain scalar value, and you are forced to support some equivalents of the __typename and ... on ScalarType { constructs. So the end result will look very similar to the solution proposed by @calebmer.
Personally, I think unions of scalar types are frequently abused. A common example of this is returning either an array of strings or a single string, only because an array with one element doesn't look as nice :) It forces all API clients to add a few more lines of code for every instance of such an anti-pattern.
For the cases when you return arbitrary data you can always fall back to providing the value as a JSON string. This pattern is already used by GraphQL introspection to return defaultValue: http://facebook.github.io/graphql/#sec-The-__InputValue-Type
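A minimal sketch of that fallback, assuming the field is declared as a nullable String in the schema. toJsonString and fromJsonString are hypothetical helper names for the resolver side and the client side respectively.

```javascript
// Resolver side: encode an arbitrary value into a String field.
// JSON.stringify(undefined) is undefined, so map that case to null.
function toJsonString(value) {
  return value === undefined ? null : JSON.stringify(value);
}

// Client side: decode the string; the type is only known after parsing.
function fromJsonString(text) {
  return text === null ? null : JSON.parse(text);
}

console.log(toJsonString(['1', '2']));
console.log(typeof fromJsonString(toJsonString(42)));
```

The schema only ever sees String here, so tooling cannot tell the client what shape to expect inside it.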
Also, in SQL, you can have only one type per column and it doesn't prevent it from being the dominant language for databases.
A common example of this is returning either an array of strings or a single string, only because an array with one element doesn't look as nice
There are legitimate use cases that are not simply cosmetic, however.
Here is an example from the Facebook Marketing API. When attempting to model filtering in the API, you have an object that represents a filter expression, like so:
filtering: [{ field: "clicks", operator: "LESS_THAN", value: 42 }]
However, value is by nature polymorphic, as it has to support expressions like this, as well:
filtering: [{ field: "campaign_id", operator: "IN", value: ["1", "2", ...] }]
Currently, this is not possible to express safely AFAIK. Scalar-union types would make this implementation trivial. Falling back to a JSON string for this seems like a poor solution. Some downsides:
- the value can no longer be validated against the schema
- interactive tools like graphiql become less useful
- introducing another data serialization format into your resolvers, which can be a point of failure
- lack of type safety
I believe it is a bad idea to make GraphQL a superset of all popular formats/protocols.
This is a strawman. The argument does not necessitate GQL to be a superset of all protocols. Adding scalar union inputs certainly would not make GQL a superset of all protocols.
It seems like the Box approach could solve most of these issues without any changes to the spec, right?
# These two fields are mutually exclusive:
type AreaOfTownBox {
  stringValue: String
  enumValue: AreaOfTown
}

# or
type IntOrStringBox {
  stringValue: String
  intValue: Int
}
@rmosolgo The problem I see with this is you assume that the two fields are mutually exclusive; however, there is no way to statically define that in the current type system.
A common example of this is RETURNING either an array of strings or a single string, only because an array with one element doesn't look as nice
@ianks The entire issue is about returning data; this was clearly stated in the initial comment:
It makes sense often to have a field that could RETURN, for example, an Int or a String
It has nothing to do with input objects since they are not only missing unions of scalars but don't support unions at all. If you feel that supporting unions of types (including scalars) is essential for GraphQL please open a separate issue.
Here is an example from the Facebook Marketing API
In this example, you know all possible fields in advance, moreover, you know which filtering is possible on which field. So you can represent this in terms of the current GraphQL spec like below:
filtering: {
  clicks: {
    LESS_THAN: 42
  },
  campaign_id: {
    IN: ["1", "2"]
  }
}
Advantages of this are much better autocompletion and validation on the client side than when using unions of scalars.
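If the backend still expects the list-of-expressions form from the earlier example, a resolver could translate the nested input back into it. The sketch below assumes exactly that; flattenFiltering is a hypothetical helper and the field and operator names are taken from the examples in this thread.

```javascript
// Convert { field: { OPERATOR: value } } into [{ field, operator, value }].
function flattenFiltering(filtering) {
  const expressions = [];
  for (const [field, operators] of Object.entries(filtering)) {
    for (const [operator, value] of Object.entries(operators)) {
      expressions.push({ field, operator, value });
    }
  }
  return expressions;
}

console.log(
  flattenFiltering({
    clicks: { LESS_THAN: 42 },
    campaign_id: { IN: ['1', '2'] },
  })
);
```

Each operator field can then have its own input type in the schema (an Int for LESS_THAN, a list for IN), which is where the better validation comes from.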
@ianks here are some related issues on union input types: https://github.com/facebook/graphql/issues/202 and https://github.com/graphql/graphql-js/issues/207
@calebmer
Another approach inspired by @rmosolgo’s recommendation would be to have a type system that looks like:
union IntOrString = IntBox | StringBox

type IntBox {
  value: Int
}

type StringBox {
  value: String
}
There is one more alternative approach. You can just use custom scalars. GraphQL doesn't specify how custom scalars should be serialised so you can return anything, even free-form JSON. For example, see graphql-type-json and the corresponding article in the Apollo docs.
Custom scalars lack type info, though. But this is not an obstacle to using them. A notable example is the GitHub API, which has a number of custom scalars like DateTime, HTML, and URI; information about the underlying type of those scalars (string) is present ONLY in the description.
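As a sketch of the custom scalar route for the Int-or-String case: the object below has the shape of the config that graphql-js's GraphQLScalarType constructor accepts (name, description, serialize, parseValue), but it is kept as a plain object so the snippet runs without the library. IntOrString and coerceIntOrString are hypothetical names, and parseLiteral is omitted for brevity.

```javascript
// Accept integers and strings unchanged; reject everything else.
function coerceIntOrString(value) {
  if (Number.isInteger(value) || typeof value === 'string') {
    return value;
  }
  throw new TypeError(`IntOrString cannot represent: ${JSON.stringify(value)}`);
}

// Plain config object; with graphql-js installed it could be passed to
// `new GraphQLScalarType(...)`.
const intOrStringConfig = {
  name: 'IntOrString',
  // As noted above, the underlying types are visible only in the description.
  description: 'Either an Int or a String.',
  serialize: coerceIntOrString,
  parseValue: coerceIntOrString,
};

console.log(intOrStringConfig.serialize(3));
console.log(intOrStringConfig.serialize('three'));
```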
I was trying to do filtering by a single ID or an array of IDs and I ended up here :-( My case:
union IDFilter = ID | [ID]
friends(id:IDFilter): [Friend]
now I have to go like this:
friends(id:ID, ids:[ID]): [Friend]
In my opinion and preference, the first option is better, you always filter using id:xxx
@jdonzet In your case, wouldn't it be better to just do:
friends(idsFilter: [ID!]): [Friend]
If idsFilter is null there is no filter, you return all Friends... If it's not null, the Array must have at least one ID...
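A resolver for that signature might look roughly like this. It is a sketch under the assumption that friends are plain objects with an id property; filterFriends is a hypothetical helper name.

```javascript
// idsFilter is null/undefined (no filter) or a non-empty array of IDs.
function filterFriends(allFriends, idsFilter) {
  if (idsFilter == null) {
    return allFriends;
  }
  const wanted = new Set(idsFilter);
  return allFriends.filter((friend) => wanted.has(friend.id));
}

const friends = [{ id: '1' }, { id: '2' }, { id: '3' }];
console.log(filterFriends(friends, null).length);
console.log(filterFriends(friends, ['2']));
```

Filtering by a single ID is then just the one-element list case, so no ID | [ID] union is needed.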
I posed a somewhat related question here: https://stackoverflow.com/q/47933512/807674
Although, my question deals with the awkwardness of modeling union variants that should contain no data other than their mere presence (i.e. singletons). I do think it would be extremely helpful to have one blessed syntactic way of approaching this data modeling question, rather than a hodgepodge of ad hoc approaches.
At the end of the day, these all seem to be special cases of the question of "how far GraphQL should go from a data modeling perspective?" I would argue that obvious ways of representing algebraic data types would be extremely helpful. Mostly, this boils down to union allowing singletons (and at that point, may as well allow all scalars too). There's already the concept of an enum, which is really just a union of singletons. There's just no obvious bridge between that and proper union.
More than just a question of syntactic convenience, it means bumpy transitions for data models as they grow in complexity.
Just commenting to say I've also got a use case similar to what @acjay describes - I'm working on a query which can either resolve a user ID as a string, or optionally go resolve the user data from a remote service.
My first pass was to basically define a custom field resolve which returns a resolved user, and just map it in the query - e.g.:
query {
  theResource(id: "foo") {
    user: resolve {
      id
      name
    }
  }
}

query {
  theResource(id: "foo") {
    user
  }
}
But ideally I'd like to implement it without needing to specify the custom type, i.e., dropping the resolve from above and either returning immediately as a string, or if additional fields are requested, resolve them.
I'm aware that I could get around this by just having id as part of the returned User object and conditionally resolve the rest, but I'm working with an existing REST API which returns String | User, and I want to mirror the behaviour.
My use case:
I am building a GraphQL proxy in front of an API. This API has an array of components' data for a given page. Now I have 20 possible components and I would rather not make 20 queries that get all of the fields the API returns just so that I can use resolvers to fetch secondary items like forms. So what I want to do is define a set of components that need to be a part of a union so I can specify what to resolve; however, I also want a catch-all custom scalar that basically JSON-wraps all fields returned.
If union types supported this I could specify what components need a type (to be added to the union) and the ones which don’t fetch extra data get caught in the JSON scalar type and all the data is serialised and no further resolvers are called.
I’ll edit this when I get to work, super Zapped but needed a brain dump
The recommendation of @calebmer seems the best fit to me; as a nice-to-have it could be partially built in as a standard (meaning no boxes have to be created by developers).
The issue at hand is that scalar types are not object types. However, as in many languages, a developer can choose to handle a scalar type as an object type or not (consider string vs String in C#, for example). From that perspective, a scalar type can be considered a "shorthand" for the alternative object type, i.e. "my string" === { __typename:"string", value:"my string" }.
Within a query the user/developer is the one that makes the choice how to handle it. Given:
type MyType {
  field: String
}

union MyUnion = String | MyType

type Query {
  example: MyUnion
  exampleString: String
}
Could allow for:
example {
  ... on String {
    value
  }
  ... on MyType {
    field {
      value
    }
  }
}

example {
  ... on String
  ... on MyType {
    field
  }
}

exampleString

exampleString {
  value
}
As for my use case:
In a task management system, a task can be supplied with contextual data. What this contextual data consists of is defined by a task definition, which is created by a user. So it can be anything from a scalar to an object. Say for example a task's definition states it can have contextual data:
- "name" (a string field)
- "client" (selection from a list of clients)
When a (frontend) application wants to show the task information it sends over a query. We could just send over the client id and let the frontend look it up in another query, but that's not really harnessing the power of GraphQL. It would be cooler (and more performant) to let the application handle it inline:
task(id:"1234") {
  name
  fields {
    fieldName
    fieldValue {
      ... on String { value }
      ... on Client { id name }
    }
  }
}
There are alternative ways to implement this, of course (dynamic type system/describers/...), though a mixed union of scalar and object types would simplify a lot.