federation icon indicating copy to clipboard operation
federation copied to clipboard

Gateway: Error handling incompatible with Apollo Service

Open nrxus opened this issue 4 years ago • 6 comments

Expected

Errors from a service should be transparently proxied to the client through the gateway.

Actual

Errors from services are wrapped inside another error, their root cause hard to find, and the original http status code lost.

Description

We are using gateway + federation to do schema stitching between two services. This works great, except when there are non-200 responses in one of the services. i.e., throwing an authentication error from context creation.

Given an error response from a service that looks like:

{
   "errors":[
      {
         "message":"Context creation failed: JsonWebTokenError: invalid token",
         "extensions":{
            "code":"UNAUTHENTICATED",
            "exception": { }
         }
      }
   ]
}

HTTP status code: 400

Gateway transforms it to:

{
  "errors": [
    {
      "message": "400: Bad Request",
      "extensions": {
        "code": "INTERNAL_SERVER_ERROR",
        "response": {
          "url": "http://localhost:4000/",
          "status": 400,
          "statusText": "Bad Request",
          "body": {
            "errors": [
              {
                "message": "Context creation failed: JsonWebTokenError: invalid token",
                "extensions": {
                  "code": "UNAUTHENTICATED",
                  "exception": { }
                }
              }
            ]
          }
        },
        "exception": { }
      }
    }
  ],
  "data": null
}

HTTP Status Code: 200

Three things happened:

  1. HTTP status code changed.
  2. Top Level extensions.code changed.
  3. It is now significantly harder to realize that the root cause of the issue was unauthorized.

apollo-server, seemingly purposefully, does not set the HTTP status codes that one might expect when throwing an error: i.e., AuthenticationError. I say purposefully based on my read of: https://github.com/apollographql/apollo-server/issues/1709

This is okay, and I can see the rationale behind it, however this directly conflicts with how gateway checks for errors in remote graphql data sources: https://github.com/apollographql/apollo-server/blob/6c72193a7ace54b8522c69f3ba26ab38da37ff9d/packages/apollo-gateway/src/datasources/RemoteGraphQLDataSource.ts#L203-L227

The problems with this function are:

  1. Gateway is explicitly looking at the http status code of the responses to see how it needs to proxy the errors, but the apollo server explicitly does not use those status codes that way.
  2. Any errors from the service get wrapped instead of merely stitched together as a successful response would.

This makes error handling in the client significantly harder, as one can't as easily detect unauthorized errors to force the user to log out or refresh the token. This is also bad UX because the implementation detail of the gateway is now leaking through. Successful responses do not have such leakage of the implementation. The client shouldn't care if the it's calling a gateway or a service directly, but now we are forcing it to care due to error handling.

If we are doing something unidiomatic/surprising that is leading to this weird behavior based on the description given above then it would be great to have the idiomatic way documented as I couldn't find much information in the gateway docs.

nrxus avatar Jul 14 '20 02:07 nrxus

Agree. This also makes it difficult (or impossible...) to handle any predefined errors that are documented and suggested to be used by the Apollo docs themselves, i.e. UserInputError ( https://www.apollographql.com/docs/apollo-server/data/errors/#augmenting-error-details ).

If I hit the service directly, I get the nice error object from which I can take the error.message or error.code which equals to BAD_USER_INPUT in the case of UserInputError.

If I hit the gateway, I get it encoded as a JSON inside of the error.message:

image

(It's not even placed as a JSON object, so its' parsing is even harder.)

jtomaszewski avatar Jan 25 '21 20:01 jtomaszewski

FYI, we're having currently such a workaround to "unwrap" the original error data to the federated gateway response:

  return new ApolloServer({
    ...serverConfig,
    formatError(gqlError) {
      try {
        if (gqlError.message.startsWith("Unexpected error value: { message:")) {
          const json = gqlError.message
            .replace("Unexpected error value: ", "")
            .replace(
              / (message|path|extensions|variables|exception|0|1|2|code|serviceName|query):/g,
              ' "$1":'
            )
            .replace(/\[Object]/g, '"[Object]"');
          const originalError = JSON.parse(json);
          return {
            ...gqlError,
            message: originalError.message ?? gqlError.message,
            extensions: {
              ...gqlError.extensions,
              code: originalError.extensions?.code ?? gqlError.extensions?.code,
            },
          };
        }
      } catch (error) {
        return error;
      }
      return gqlError;
    },

jtomaszewski avatar Jan 25 '21 21:01 jtomaszewski

In my use case I was able to get away with this in our gateway:

const server = new ApolloServer({
    formatError(error) {
      // gateway will nest 4xx error into another error
      // this leaks implementation details and it is hard
      // for consumers to get the real error
      if (error.extensions?.response?.body?.errors) {
        return error.extensions.response.body.errors[0]
      }
      return error
    },
    gateway,
    /* other properties of apollo server */
  })

I don't think it is the most reliable way to get the errors out but it works well enough for now. 🤷🏽

nrxus avatar Jan 26 '21 03:01 nrxus

Hi, I'm trying to workaround this problem as well. Do you have any ideas how to deal with multiple errors in service response? Originally what's returned from gateway is

{
    "errors": [
        {
            "message": "400: Bad Request",
            "extensions": {
                "code": "INTERNAL_SERVER_ERROR",
                "response": {
                    "url": "<service url>",
                    "status": 400,
                    "statusText": "Bad Request",
                    "body": {
                        "data": {
                            "alias": null,
                            "alias2": null
                        },
                        "errors": [
                            {
                                "message": "Service error message",
                                "locations": [
                                    {
                                        "line": 1,
                                        "column": 10
                                    }
                                ],
                                "path": [
                                    "alias"
                                ]
                            },
                            {
                                "message": "Service error message",
                                "locations": [
                                    {
                                        "line": 1,
                                        "column": 80
                                    }
                                ],
                                "path": [
                                    "alias2"
                                ],
                            }
                        ]
                    }
                }
            }
        }
    ],
    "data": {
        "alias": null,
        "alias2": null
    }
}

The way shown https://github.com/apollographql/federation/issues/379#issuecomment-767259894 works for single error (with [0]) but returning an array from mapper results in the following structure (Array inside errors array) which is not compatible with other libs. Otherwise some errors are lost.

{
    "errors": [
        [
            {
                <error 1>
            },
            {
                <error 2>
            },
        ]
    ]
}

Of course it would be better to fix the root cause but maybe you can share your thoughts on that? Thanks

lookprzybylski avatar Jun 11 '21 14:06 lookprzybylski

I'm facing similar issue with multiple errors returned from federated service and actually I was thinking about patching it somewhere here: https://github.com/apollographql/apollo-server/blob/6c72193a7ace54b8522c69f3ba26ab38da37ff9d/packages/apollo-server-errors/src/index.ts#L244-L262.

Looks like similar issue might occur in different scenarios also.

kuebk avatar Nov 22 '21 18:11 kuebk

Hi, I've found a workaround that will work even with multiple errors.

const gateway = new ApolloGateway({
  buildService(
    {
      /** ...name, url **/
    }
  ) {
    return new RemoteGraphQLDataSource<Context>({
      /** ... */
      async errorFromResponse(response) {
        const text = await response.text();
        const parsed = JSON.parse(text);
        const error = parsed.errors[0] as ApolloError;
        return new GraphQLError(error.message, {
          extensions: error.extensions
        });
      }
      /** ... */
    });
  }
});

I'm posting it here since the issue is still open and I did not find any solution to this problem, please let me know if someone has already found a better solution.

nocfer avatar Sep 08 '22 15:09 nocfer