graphql-spec
graphql-spec copied to clipboard
Proposal: Serial fields (nested mutations)
In the current GraphQL specification the schema type have three different types on it: query
, mutation
and subscription
.
The mutation type is a special ObjectType
that enforces the fields on it to be resolved serially
instead of in often parallel
as could happen in fields in any other type
When executing a mutation, the selections in the top most selection set will be executed in serial order. [1].
But this "mutation" type declaration is not self-contained: when defined in a schema as mutation it have a side effect on the execution of a query, making their fields to be executed serially.
Because of there is only one type in a schema that enforces a serial evaluation on their subfields, all "mutation fields" are forced to be in the Mutation type itself.
Having all "serial resolved fields" in only one types causes the Mutation type (schema.mutation) to be an ObjectType
with over bloated fields on it, that have not common ground between them other than serial execution.
Proposal
Would be great if instead of having all the mutations in a first type class (where it's execution depends on the Schema for knowing if it should be resolved serially or not), we could be able to mutate in a generic GraphQL query, in any kind of type level.
Would be great, if instead of doing:
mutate MyMutation {
postToUserTimeline(userId: "1", commentContent: "Happy Birthday Mark!") {
comment { # The new created comment in Mark's timeline
id
content
}
user {
timeline {
id
content
}
}
}
}
We could be able to do serial resolution on certain fields:
{
getUser(id:"1") {
postToTimeline(commentContent:"Happy Birthday Mark!") { # serial field
comment {
id
}
}
allComments {
id
comment
}
}
}
Composing mutations
With the ability to have "mutation / serial fields" in any kind of type, we could start doing things like:
{
user(id:"1") {
postToTimeline(commentContent:"Happy Birthday Mark!") { # This is a serial field
comment {
addReaction(reaction: LIKE) # This is another serial field, that adds a reaction into the comment
}
}
}
}
Execution showcase
The following example hopefully will showcase how the field resolution could be done:
{
user(id:"1") {
postToTimeline # serial
comments # normal
photos # normal
tagPhoto # serial
postAgain: postToTimeline # serial
secondPhotos: photos # normal
secondComments: comments # normal
}
}
In this case, the execution will done serially as follows:
- Resolve
postToTimeline
(not resolve any other field until this is resolved) - Resolve
comments
,Photos
normally (often in parallel) - Resolve
tagPhoto
- Resolve
postAgain
- Resolve
secondPhotos
andsecondComments
normally (often in parallel)
How could be implemented
This could be easily solved by creating a adding a isMutation
(isSerial
maybe is better?) attribute inside the Field.
In the previous example, postToTimeline
will be executed serially first, then in parallel all the subsequent fields until another "serial field" (field with isSerial=true
on it) is found, and so on.
Why of this proposal
There is a lot of tooling that could be created specifically around mutations, and this would help a lot for it, apart from simplifying the schema execution (as are now will be the fields the ones that define if they want to be executed serially or not) and mutations composition.
Related issues: #247
There a lot of things to improve, define better and discuss further, so any thoughts and feedback is more than welcome! 😊
I feel that this proposal would be simpler if the isMutation
or isSerial
flag was on an Object Type instead of individual fields. That way, you could just select the right execution algorithm for each selection set based on the type, rather than needing to consider every field individually which could end up being messy.
Your above usecase could be implemented in such a system by putting a mutations
field on the comment:
{
user(id:"1") {
postToTimeline(commentContent:"Happy Birthday Mark!") { # This is a serial field
comment {
mutations {
addReaction(reaction: LIKE) # This is another serial field
}
}
}
}
}
I'd suggest this could come with a small change to the type language:
type X { .. } # Object type
input X { .. } # Input type
mutation X { .. } # Serial executed type
Of course, language of mutation
vs. serial
vs. something else is up for grabs.
One question about this entire approach though is that some people use the mutation
keyword at the top to decide whether they should use a read-only connection to the DB, or whether to cache a particular result. So perhaps it's worthwhile to ensure that mutation fields in the query can only appear if the entire operation is a mutation, and throw a validation error otherwise.
I understand your proposal is aiming to improve the spec. But I just wanted to point out that supporting "nested mutations" is entirely possible within the existing GraphQL spec. Here's an example of how it could look: https://blog.graph.cool/improving-dx-with-nested-mutations-af7711113d6#.hsni6ntd3
Yes in the sense that any field on the GraphQL schema can have side effects. But then you don't get the nice feature of the fields executing in serial rather than in parallel, unless your execution engine supports that (which isn't part of the spec since the spec does specify how execution is supposed to run).
@leebyron ping :)
@leebyron where are we with this? Seems extremely useful
I think this would be extremely useful, for running mutations that have a relation deep in a hierarchy!
To me this solves one of the remaining pain points in GraphQL that REST actually solved well, which is to have actions associated with individual resources.
Right now you have to create all your mutations at the top-level, and pass in more complex identifiers if you want to references some deeply nested object. You essentially have to re-model your hierarchy in a flat set of arguments since mutations can only be flat.
Fields
I think marking specific fields as mutations
or serial
is needed, instead of the idea of having certain types be serial. Because it would be reasonable to want to do:
type User {
id: ID!
name: String!
*createTeam(input: TeamInput!): Team!
}
type Team {
id: ID!
name: String!
*createProject(input: ProjectInput!): Project!
}
type Project {
id: ID!
name: String!
}
{
user(id: "user_id") {
createTeam(input: { id: "team_new", name: "New Team" }) {
id
name
}
}
}
In this case Team
is both a type returned from normal queries, and a type returned from a mutation/serial query. So I think it might be too restricting to have to create entirely different types for mutation responses.
Execution
That said, I think the execution could be simplified, to make it work almost exactly like the current architecture's execution does. I think @syrusakbary's original proposal of making the execution defined by the fields order feels a bit too much like magic, and might result in some strange unintended behaviors.
Instead, I think mutating fields should be treated as execution boundary. All of the non-mutating fields of a query are resolved first, in parallel. Any mutations that are encountered are added to the next stack. Then the mutation stack is resolved one by one. The results of each mutation are resolved in parallel again (as a normal query) until further mutations are discovered.
So for instance, for a complex query like:
{
user(id: "user_id") {
id
name
createTeam(input: { id: "team_new", name: "New Team" }) {
id
name
createProject(input: { id: "project_new", name: "New Project" }) {
id
name
}
}
update(input: { name: "User Name Updated" }) {
id
name
}
}
team(id: "team_id") {
id
name
update(input: { name: "Team Name Updated" }) {
id
name
}
}
}
The execution would resolve as:
- Resolve
user
andteam
fully in parallel, discoveringuser.createTeam
,user.update
andteam.update
in the process. - Resolve the first mutation in the top-level stack,
user.createTeam
in serial.- Resolve the output of
user.createTeam
in parallel, discoveringuser.createTeam.createProject
. - Resolve the first mutation in the
user.createTeam
stack,user.createTeam.createProject
in serial. - Resolve the output of
user.createTeam.createProject
in parallel. - Finished resolving this branch.
- Resolve the output of
- Resolve the second mutation in the top-level stack,
user.update
in serial.- Resolve the output of
user.update
in parallel. - Finished resolving this branch.
- Resolve the output of
- Resolve the third mutation of the top-level stack,
team.update
in serial.- Resolve the output of
team.update
in parallel. - Finished resolving this branch.
- Resolve the output of
- Finished resolving the query.
Note how each mutation creates its own isolated resolution context. But the non-mutating queries are always resolved in parallel for faster execution.
And the result would look like:
{
user: {
id: 'user_id',
name: 'User Name',
createTeam: {
id: 'team_new',
name: 'New Team',
createProject: {
id: 'project_new',
name: 'New Project',
}
},
update: {
id: 'user_id',
name: 'User Name Updated',
}
},
team {
id: 'team_id',
name: 'Team Name',
update: {
id: 'team_id',
name: 'Team Name Updated',
}
}
}
This is entirely possible in the current implementation. I wrote about it extensively in this article here https://medium.com/@jdylanstewart/the-real-power-behind-graphql-and-an-architecture-to-leverage-it-7d1dd1004ada
This just comes down to defining your resolvers with fluent interfaces, where they return the object they are acting upon and only slightly changes the query @ianstormtaylor wrote above.
input UserUpdateInput {
name: String
}
type User {
id: ID!
name: String!
team: Team
update(input: UserUpdateInput!): User!
assignTeam(byId: ID!): User!
}
input TeamUpdateInput {
name: String
}
type Team {
id: ID!
name: String!
update(input: TeamUpdateInput!): Team!
assignProject(byId: ID!): Project!
}
type Project {
id: ID!
name: String!
}
type Mutation {
createTeam(id: ID!, name: String!): Team!
createProject(id: ID!, name: String!): Project!
createUser(id: ID!, name: String!): User!
}
type Query {
user(byId: ID!): User
team(byId: ID!): Team
project(byId: ID!): Project
}
query {
createTeam(id: "team_new", name: "New Team") {
id
}
createProject(id: "project_new", name: "New Project") {
id
}
user(byId: "user_id") {
id
name
assignTeam(byId: "team_new") {
team {
id
name
assignProject(byId: "project_new") {
project{
id
name
}
}
changeName(name: "Team Name Updated") {
id
name
}
}
}
changeName(name: "User Name Updated") {
id
name
}
}
}
At this point it's entirely a function of your data architecture design.
It would be more logical to have all fields in a mutation document to resolve syncronously. As all execution contexts under a mutation should be expected to make a change to the underlying data layer.
This change will ensure consistancy and order of the mutation execution context, with the added benefit of legibility of the overall document structure.
Implementation of this in the graphql-js repository was incredibly easy with little to none side effects, except for a string check on a resolveSubQueries which is a function shared between query and mutation, a proper fix would separate the execution trees of mutation and queries. (performance hit vs code bloat?)
The Impact to this change would be performance only for mutations as syncronous execution of the sub interfaces will never be as fast as asyncronous
I have a modified fork that satisfies this requirement as I have developed a data binding library that heavily uses nested mutations to structure functionality with legibility.
npm repository: @vostro/graphql
package.json - otherwise you will end up with a ton of nested copies of graphql from other libraries
"resolutions": {
"graphql": "npm:@vostro/graphql",
}
yarn add graphql@npm:@vostro/graphql
Databinder Library - For Reference
Originally while developing the above I may have misread the spec and assumed that all fields would be executed syncronously which would satisfy my requirements of functionality segregation while having a consistent execution order without creating long and complicated names for each function
Legibility benefit
mutation moveTaskItemToTask($taskId: ID, $taskItemId: ID) {
models {
TaskItem(update: {where: {id: $taskItem}, input: {task: {remove: {}}}}) {
id
}
Task(update: {where: {id: $taskId}, input: {items: {add: {id: $taskItemId}}}}) {
id
items {
edges {
node {
id
}
}
}
}
}
classMethods {
Task {
alert(taskId: $taskId){
id
}
}
}
}
vs (Current Spec - Top Level Execution order only)
mutation moveTaskItemToTask($taskId: ID, $taskItemId: ID) {
ModelTaskItem(update: {where: {id: $taskItem}, input: {task: {remove: {}}}}) {
id
}
ModelTask(update: {where: {id: $taskId}, input: {items: {add: {id: $taskItemId}}}}) {
id
items {
edges {
node {
id
}
}
}
}
ClassMethodsTaskAlert(taskId: $taskId){
id
}
}
Even if it wasnt for the sake of legibility, It makes more sense for every execution instance in a mutation document to be run syncronously as they should all be observed as to be modifing the underlying data layer.
*Edit - 20200813 - updated link to databinder lib
@dyst5422 I don't think that's true, because it's missing the current mutation-only behavior of having the mutations run serially. You'd end up with an architecture that runs the risk of race conditions I'd think.
@ianstormtaylor I think it really comes back to it being parallel
(instead of query
) and serial
(instead of mutation
) operations.
At this point it's entirely a function of your data architecture design.
@dyst5422 using your schema from above do you know what this user's name will be?
mutation {
createUser(id: "1" name: "jane") {
update({ name: "bob" }) {
name
}
update({ name: "alice" }) {
name
}
name
}
}
the point of building APIs is to encapsulate your data and logic, isn't it? If so, then all the logic how an interaction from UI should just go with one mutation, and its resolver should be the one that decides which database operations that have to be carried out and the order of them. "Nested mutation" or "serial field" are workarounds to bring the aforementioned responsibility back to UI, which IMHO, is a non-goal for graphql.
Agreed with @tungv . In general, I understand what this proposal is trying to achieve. But this happens at the cost of complicating not only the implementation, but also the design as a whole. There are a lot of opportunities for racing conditions.
I must admit that I understood far from everything in this discussion. But what I understand absolutely for sure is that I would not want to some way or another bring the execution flow control constructs to the API level.
I enjoy schemas which closely resemble their underlaying data sources. This lets me reliably understand––from looking at any given query––what data it retrieves. Why wouldn't the same approach apply to mutations? As of today, a mutation does not necessarily convey its underlaying operation(s). The early comments of this discussion (@ianstormtaylor @Azerothian) strike me as heading towards a more intuitive approach:
- users can define CUD resolvers per-field
- users can nest those mutations the same way as they do for queries
In other words, if this is possible:
{
user(id:"1") {
timeline(nDays: 20) {
comment {
id
content
}
}
}
}
... then someone intuiting the language would presume the following is also possible:
{
user(id:"1") {
timeline {
comment {
add(content: "Happy Birthday!") {
id
content
}
}
}
}
}
It's always a tough discussion when the question is "what should the language be?" It seems to me that this addition might make the language easier to intuit and read.
I'm really curious to hear more feedback from the community!
Any update on it?
I personally do not think that putting mutations inside types is a good idea (if only from a security perspective it would be a nightmare). But I do like something similar to @harrysolovay so I also suggest using the mutation
keyword to define those things like proposed by @stubailo. But we might still want access to a parent in those children fields, so we could also reuse the on
syntax of fragments.
It could look like:
type Comment {
id: ID
content: string
}
mutation TimelineMutations on Timeline {
addComment(content: string): Comment
}
mutation UserMutations on User {
timeline: TimelineMutations
}
type Mutation {
user(id: ID!): UserMutations
}
It would respect the following rules:
- The names would be exclusive between mutations and types (unless we get proper namespacing, see https://github.com/graphql/graphql-spec/issues/163).
- Everything defined in mutations is expected to resolve sequentially.
- fields defined in mutations can return a mutation, a scalar or a type
- types cannot return a mutation
- if a mutation defines a
on Type
, the parent of all the fields will be an instance ofType
However this is (to be) accomplished, it needs to be. This is very important for numerous reasons in situations with complex object model. (Some of) these are already noted by other posters here so I may be just rephrasing to bring attention:
- Creation of new objects may require scoping / nesting to be specified, within another existing object. This implies both a creation of the new object and linking/relating it to its "container". Often these cannot be separated and must be done as a single transaction. This becomes more than just an order, it isn't about serial or parallel ... but together.
- ⬆️ [1] may get more complicated if the object being created (and added to existing container) must be created with more objects inside it (also to be created) at the same time.
- Modifying (updating) objects can be relatively simple in a "flat mutations world" but if we restrict ourselves only to modifying its non-array fields. Array fields have a "habit" of recognizing more types of changes than just "wipe out what is there and replace it with what I'm telling you now" - things like "add these", "remove these", "keep/retain these", "clear all" in addition to "set" (and there could be more or these could be restricted to what access control system grants). In a flat world this would require either an unwieldy matrix of type-field-modification mutations or equally unwieldy input type field explosion (e.g. for each array field itemsToAdd, itemsToRemove, itemsToKeep, ...). While nested mutations would also require the same actions to be listed they are much more natural.
- I'm still "learning" GraphQL but one thing evades me, perhaps because it doesn't exist - and it would be largely "helped" by nested mutations... Schema is discoverable and it supplies info on which types are available and they fields. But, beyond few different conventions, there is no way to discover which operations (queries, mutations) can be done with each. I get that some use cases do not have CRUD operations, for example, but I don't think it is good to constrain GraphQL to only commonly needed functionality - it can offer (and not require) more. With nested mutations we could have equally-named operations (CRUD or otherwise, optionally) in each type instead of concocting query/mutation names for each. Instead of something xyzById(...), xyzsMatching(...), newXyz(...), deleteXyz(...), modifyXyz...(...), addXyzToPqr(...), removeXyzFromPqr(...), ... we can consistently have byId(...), matching(...), create(...) or new(...), modify(...), ... inside each (and again, no, these don't have to be the names).
- I may also be missing something but I was trying to find a way to use the output/result of a query (or mutation) inside subsequent queries/mutations. Initially I was looking for an ability to reference the output of a previous query/mutation from a subsequent one by naming it somehow. I have not found that. This could help with that. Heck, it could even express stuff like "perform this mutation on all objects found by that query" without having to further extend or complicate the flat mutations list.
Finally, GraphQL should not be treated as primarily a JavaScript thing (or any other language or particular server). It is a lot more than that. We already have the case where servers are not created equal and can't see how that can ever change - or that it should. Some servers simply don't care/need some features, others do. That is fine. GraphQL clients have to deal with this already, whether they like it or not ... and they aren't created equal either.
I am trying to get this, or something like this, officially discussed by the GraphQL WG. Please see https://discord.com/channels/625400653321076807/862834364718514196/909151063704240129
I will to bring more attention to this issue in the upcoming GraphQL Working Group meeting. To that end I wrote a markdown document outlining somewhat more of a context and included this topic, as well as links to this issue. If you have interest and time, please help in doing this right, either way. For example, please let me know if I missed something, either good or bad ... or you feel should be stated.
If interested, please read:
https://github.com/aleksandarsusnjar/graphql-proposals/blob/main/graphql-mutations-and-large-schemas.md
@aleksandarsusnjar Are there any news from the december meeting on this topic? Having the ability to group the mutations is quiet an important topic for my company.
I could not find any meeting notes or something on that.
And by the way thanks for the effort of the proposal.
https://github.com/graphql/graphql-wg/blob/main/notes/2021/2021-12-02.md#discuss-graphql-mutation-designs-with-large-schemas-nested-mutations-30m-aleksandar
@benjie working group link is not longer valid Here's correct one: https://github.com/graphql/graphql-wg/blob/main/notes/2021/2021-12-02.md