
Enhance documentation on error handling in GraphQL

Open hitherejoe opened this issue 1 month ago • 8 comments

Description

Added section on modeling errors as data in GraphQL APIs, detailing recoverable and unrecoverable errors, and how to structure mutations and queries to handle errors effectively.

This has been added based on some conversations with @Urigo and some of the team at The Guild.

A couple of notes:

  • I added this as a new part of the Errors section. While the "Errors as Data" section is quite long, this felt like the best fit.
  • I've adjusted the examples to match the existing operations that are used in the page; happy to tweak things more if need be!

hitherejoe avatar Oct 29 '25 14:10 hitherejoe

@hitherejoe is attempting to deploy a commit to the The GraphQL Foundation Team on Vercel.

A member of the Team first needs to authorize it.

vercel[bot] avatar Oct 29 '25 14:10 vercel[bot]

CLA Signed
The committers listed above are authorized under a signed CLA.

  • ✅ login: hitherejoe / name: Joe Birch (b7cded6423e9b4222b11f9ff71a5a4014cfa4402)

Hi @martinbonnin @hitherejoe,

This topic is quite deep and there are a wide range of options, which requires some level of familiarity with GraphQL.

I feel the concepts may get lost in the "Learn" section (assuming this is the "The basics"). Would the "Best Practices" be a better place for an advanced concept like this?

eddeee888 avatar Nov 11 '25 13:11 eddeee888

Would the "Best Practices" be a better place?

Agree 100%

martinbonnin avatar Nov 11 '25 15:11 martinbonnin

Would the "Best Practices" be a better place?

@eddeee888 @martinbonnin agreed on moving this, I'll get that adjusted. It does feel like this section is quite deep, especially with the nested headings 😀

hitherejoe avatar Nov 18 '25 09:11 hitherejoe

Hi folks! First, thanks so much @hitherejoe for this submission, it's really great to start making some recommendations in this area via the GraphQL website. Secondly, thanks all for the discussion around this! As can be seen, it is quite a subtle topic with lots of people having subtly (or sometimes radically) different opinions on it...

I started by writing my own opinions into this GitHub comment but they got quite long (it's a complex topic), so instead I've broken my thoughts out into an article on my blog:

https://benjie.dev/graphql/errors

In general I find myself agreeing with @hitherejoe, but with some nuance (see my article for definitions of the terms I'm using)... Here is a summary of my opinions:

  1. The only[^1] valid place to model errors in the schema is as part of mutation fields (whether via a union return type or details on the mutation result payload)
  2. I categorize errors differently: instead of recoverable and non-recoverable, I refer to "domain errors" vs "exceptions".
  3. Domain errors should be modelled on mutation results (only), though using GraphQL errors for this is also acceptable if you're happy with the trade-offs.
  4. Exceptions should never be modelled in the schema.
  5. null should always be used for "does not exist" when looking up an individual resource
  6. Permission denied in a query is equivalent to non-existence - the resource you're attempting to request does not exist within the set of resources you're allowed to access. Thus the only time where NOT_FOUND or FORBIDDEN errors should be raised would be in mutation fields, and even then it should respect the principle that it should not reveal whether or not a resource you're not allowed to access exists (i.e. if you're not allowed to know the resource exists, NOT_FOUND should be thrown, rather than FORBIDDEN).
  7. A union return type for mutation fields is not a one-size-fits-all solution; a mutation payload with a union field on the result is more flexible at the cost of being less explicit. We should not dictate the "best practice" here when it has known issues; instead, teams should make their own decisions.
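
To make points 3, 5 and 6 concrete, here is a minimal TypeScript sketch. It is an illustration only, not code from the article or the PR: the `Doc` type, the `readers`/`editors` lists, and the `lookupDoc` and `renameDoc` functions are all hypothetical names, and the discriminated union stands in for whichever mutation result shape a team chooses.

```typescript
// Hypothetical data model: every name here is an assumption for illustration.
interface Doc {
  id: string;
  title: string;
  readers: string[]; // viewers allowed to know the document exists
  editors: string[]; // viewers allowed to change it
}

const docs: Record<string, Doc> = {
  "1": { id: "1", title: "Roadmap", readers: ["alice", "bob"], editors: ["alice"] },
};

// Query-style lookup: "does not exist" and "you may not see it" both yield null,
// so the caller cannot tell the two cases apart.
function lookupDoc(viewer: string, id: string): Doc | null {
  const doc: Doc | undefined = docs[id];
  if (doc === undefined || doc.readers.indexOf(viewer) === -1) {
    return null;
  }
  return doc;
}

// Mutation-style result: domain errors are part of the result type.
type RenameDocResult =
  | { kind: "success"; doc: Doc }
  | { kind: "error"; code: "NOT_FOUND" | "FORBIDDEN"; message: string };

function renameDoc(viewer: string, id: string, title: string): RenameDocResult {
  const doc = lookupDoc(viewer, id);
  if (doc === null) {
    // Covers both a missing document and one the viewer may not know about.
    return { kind: "error", code: "NOT_FOUND", message: "No such document." };
  }
  if (doc.editors.indexOf(viewer) === -1) {
    // The viewer can already see the document, so FORBIDDEN reveals nothing new.
    return { kind: "error", code: "FORBIDDEN", message: "You cannot rename this document." };
  }
  doc.title = title;
  return { kind: "success", doc };
}
```

In this sketch a viewer outside `readers` gets `NOT_FOUND` from the mutation and `null` from the lookup, while a reader who is not an editor gets `FORBIDDEN`. Unexpected failures (exceptions, in the terminology above) are deliberately absent from `RenameDocResult`.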

For the nuances of my thoughts, please read my article :) I'll probably update it over the next few weeks as I frequently do with these articles since I wrote it in a rush.

[^1]: Okay not strictly the only place, but I don't agree 95%... I'd say it's more than 99.5% which is basically enough for us to say "only" and for people who know better to be able to justify their decisions.

benjie avatar Nov 20 '25 13:11 benjie

For the record, @michaelstaib pointed me to this interesting post from @xuorig : https://productionreadygraphql.com/2020-08-01-guide-to-graphql-errors

It's from 2020 and provides a good background on the FooPayload vs FooResult conventions as well as many other things.

@xuorig, if you're still around, how much of it is still valid in 2025? Anything you'd change with 5 years of feedback?

martinbonnin avatar Nov 20 '25 14:11 martinbonnin

Hi @martinbonnin 👋

I believe all of it is still very relevant / valid. My 2 cents: I would focus on overarching best practices rather than specific patterns, since domains/clients/server implementations can often make certain patterns easier or harder.

Things like:

  • Having wrapper types for evolution (probably the one I'd be most confident about recommending)
  • Dev-facing errors in errors array vs domain errors in the schema. (Although in real life there are real challenges to this, so I've softened my stance on this over the years)
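
As a rough illustration of the two bullets above (not taken from the linked post), the TypeScript sketch below returns a wrapper payload instead of the bare entity, so fields can be added later without changing the return type, and it separates a developer-facing failure from a domain error. All names (`CreateReviewPayload`, `DomainError`, `createReview`, the `STARS_OUT_OF_RANGE` code) are hypothetical, and the thrown `TypeError` stands in for what a GraphQL server would report in the `errors` array.

```typescript
// Hypothetical types: none of these names come from the thread.
interface DomainError {
  code: string;
  message: string;
}

interface Review {
  id: string;
  stars: number;
}

// Wrapper payload: new fields can be added here later without
// changing what the mutation returns.
interface CreateReviewPayload {
  review: Review | null;
  errors: DomainError[];
}

let nextId = 1;

function createReview(stars: number): CreateReviewPayload {
  if (Math.floor(stars) !== stars) {
    // Developer-facing problem: the caller sent a malformed value.
    // Assumption: a server would surface this in the errors array.
    throw new TypeError("stars must be an integer");
  }
  if (stars < 1 || stars > 5) {
    // Domain error: something an end user can act on, so it is returned as data.
    return {
      review: null,
      errors: [{ code: "STARS_OUT_OF_RANGE", message: "Stars must be between 1 and 5." }],
    };
  }
  const review: Review = { id: String(nextId++), stars };
  return { review, errors: [] };
}
```

Whether the domain error lives in an `errors` list, a union, or elsewhere on the payload is a per-team decision; the sketch only shows that the wrapper leaves room for any of them.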

Side-note: I would not use the wording unrecoverable vs recoverable to describe what goes into the errors array vs the schema. Both exceptions and "expected" errors can potentially be recoverable (by retrying, for example).

xuorig avatar Nov 20 '25 16:11 xuorig