redwood icon indicating copy to clipboard operation
redwood copied to clipboard

[Bug?]: Expired JWTs causes a 500 to be thrown

Open aaronvanston opened this issue 2 years ago • 7 comments

What's not working?

We have Redwood configured with auth0 and an external client (NextJS and React Native App) communicating with our Redwood API.

It's possible for our client to sometime send through expired JWT before the client catches this and either refreshes their token or requests them to log in again.

When this happens the Redwood API successfully catches this expired token when building the auth context but then throws it as a 500 error.

{"level":50,"time":1660094936173,"pid":16,"hostname":"XXX.XXX.XXX.XXX","name":"graphql-server","msg":"Error building context. Error: Exception in getAuthenticationContext: jwt expired"}

This error is caught and handled fine but is causing issues with monitoring the health of our API. We track 500 status errors being thrown and use this metric to alert our team of a potential issue. This is not currently possible with the ability of expired tokens to raise a 500 causing significant noise.

Expectation

Given the auth token is expired, I'd expect the API to return a 401 HTTP status.

Update: Or per GraphQL spec a 200 HTTP Status with a formatted error for unauthorised.

Mitigations

We're proactively putting protections in place to prevent our clients from sending expired tokens to help mitigate this. However, it's always possible for someone to directly send expired tokens or a future where we provide an open API where we can't control the client interfaces.

This ultimately is a low priority item but more so a difference in expectation.

How do we reproduce the bug?

  1. New installation of RW
  2. Configure Auth0 as a provider
  3. Capture a JWT now
  4. Re-use the JWT after it has expired (typically 24 hours)
  5. Server should return the following:
{"level":50,"time":1660094936173,"pid":16,"hostname":"XXX.XXX.XXX.XXX","name":"graphql-server","msg":"Error building context. Error: Exception in getAuthenticationContext: jwt expired"}

What's your environment? (If it applies)

System:
    OS: macOS 12.1
    Shell: 5.8 - /bin/zsh
  Binaries:
    Node: 16.13.0 - /private/var/folders/zr/321c2cp953160v7ry44t12p40000gn/T/xfs-c6772235/node
    Yarn: 3.2.2 - /private/var/folders/zr/321c2cp953160v7ry44t12p40000gn/T/xfs-c6772235/yarn
  Databases:
    SQLite: 3.32.2 - /Users/aaronvanston/Library/Android/sdk/platform-tools/sqlite3
  Browsers:
    Chrome: 104.0.5112.79
    Firefox: 93.0
    Safari: 15.2
  npmPackages:
    @redwoodjs/core: 2.2.0 => 2.2.0

Are you interested in working on this?

  • [ ] I'm interested in working on this

aaronvanston avatar Aug 10 '22 03:08 aaronvanston

Thanks @aaronvanston for taking the time to report this in such details - we will look into it.

noire-munich avatar Aug 10 '22 09:08 noire-munich

@aaronvanston Would being able to set an option in the auth0 decoder for a leeway or clockTolerance option help mitigate certain expirations?

See: https://github.com/auth0/node-jsonwebtoken#jwtverifytoken-secretorpublickey-options-callback

clockTolerance: number of seconds to tolerate when checking the nbf and exp claims, to deal with small clock differences among different servers

Or is this also an issue with longer expired tokens that the client still retains well after expiration?

dthyresson avatar Aug 11 '22 15:08 dthyresson

FYI - The Error building context. Error: Exception in getAuthenticationContext: jwt expired message comes from the plugin to build the auth context; See:

https://github.com/redwoodjs/redwood/blob/0e087ade5ad689efdeedd5e846749a17158c172b/packages/graphql-server/src/plugins/useRedwoodAuthContext.ts#L31

dthyresson avatar Aug 11 '22 15:08 dthyresson

Also, this issue where the 400 error is returned could be related.

https://github.com/redwoodjs/redwood/issues/5057

The error is returned without have the error masked with a 200 status. It returns before the graphql handler executes (in the parse or validation phase). It could be that on building context, the handler throws an error and a 500 is returned instead.

dthyresson avatar Aug 11 '22 15:08 dthyresson

The error is returned without have the error masked with a 200 status. It returns before the graphql handler executes (in the parse or validation phase). It could be that on building context, the handler throws an error and a 500 is returned instead.

Yes I also think this is what's happening. When it tries to build the auth context (using the envelop plugin) the JWT decoder throws.

Maybe the solution is to wrap decoders with a try catch and on decoder error see if its possible to return a 401?

dac09 avatar Aug 12 '22 13:08 dac09

Upgrading to "@graphql-yoga/common": "2.12.6" now catches the context building error and the error is masked.

But, this does return a 200 status and not 401.

According to the GraphQL Spec: https://github.com/graphql/graphql-over-http/blob/main/spec/GraphQLOverHTTP.md#applicationjson

The server SHOULD use the 200 status code, independent of any GraphQL request error or GraphQL field error raised.

Note: A status code in the 4xx or 5xx ranges or status code 203 (and maybe others) could originate from intermediary servers; since the client cannot determine if an application/json response with arbitrary status code is a well-formed GraphQL response (because it cannot trust the source) the server must use 200 status code to guarantee to the client that the response has not been generated or modified by an intermediary.

If the GraphQL response contains a non-null {data} entry then the server MUST use the 200 status code.

Note: This indicates that no GraphQL request error was raised, though one or more GraphQL field error may have been raised this is still a successful execution - see "partial response" in the GraphQL specification.

The server SHOULD NOT use a 4xx or 5xx status code.

Note: For compatibility with legacy servers, this specification allows the use of 4xx or 5xx status codes for failed requests where the response uses the application/json media type, but it is strongly discouraged. To use 4xx and 5xx status codes, please use the application/graphql-response+json media type.

So, lots to digest and need to confirm with the Guild ion the proper content types and codes here.

But since the errors is masked, and well the data is null, maybe a 401 is allowed per spec.

Still investigating the proper behavior.

dthyresson avatar Aug 16 '22 13:08 dthyresson

Or is this also an issue with longer expired tokens that the client still retains well after expiration?

@dthyresson Apologies for the delay on this one! GH notification got lost in the noise!

Yeah I suspect most of our tokens are quite old and thus would not really matter being able to tune the clock tolerance.

But, this does return a 200 status and not 401.

Yeah per spec we'd be fine with a 200 and then allow our client to handle the auth error.

aaronvanston avatar Aug 31 '22 00:08 aaronvanston

@aaronvanston Having upgraded to Yoga v3 I am finally revisiting how errors and status codes are reported.

And I have a new take on this issue.

jwt.verify(token, 'shhhhh', function(err, decoded) {
  if (err) {
    /*
      err = {
        name: 'TokenExpiredError',
        message: 'jwt expired',
        expiredAt: 1408621000
      }
    */
  }
});

I now wonder if the Auth0 (or any/every JWT decoder) should check for this error and then raise a RedwoodError compatible Authentication error that will have a status code of 401.

Probably would handle

jwt.verify(token, 'shhhhh', function(err, decoded) {
  if (err) {
    /*
      err = {
        name: 'NotBeforeError',
        message: 'jwt not active',
        date: 2018-10-04T16:10:44.000Z
      }
    */
  }
});

too.

But, the preferred behavior is a 401 status, correct?

With Yoga v3, this is possible. I'd just have to change the decoders.

Would that be best?

dthyresson avatar Jan 09 '23 21:01 dthyresson

Note: Would need to update Supabase decoder, too.

dthyresson avatar Jan 09 '23 21:01 dthyresson

To implement, I might create an AuthenticationError or TokenExpiredError or NotBeforeError in redwoods/api that extends from RedwoodError.and set a code and stop stat in the extensions.

Should be fine since the api packages for each auth depends on redwoods/api

dthyresson avatar Jan 09 '23 21:01 dthyresson

Thanks @dthyresson!

But, the preferred behaviour is a 401 status, correct?

Happy for whatever aligns best with the spec. I think a 401 would be preferred as I think our client would automatically handle this, but a 200 with a specific error a client could handle would also work fine

aaronvanston avatar Jan 16 '23 22:01 aaronvanston

Any progress on this one? Only reason I ask is that in Redwood v4 (w yoga v3) these errors aren't masked anymore. Would be good to find a resolution for this so we don't have any error noise from expired tokens.

jonoc330 avatar Mar 22 '23 23:03 jonoc330

To deal with 500 when token is expired go and surround your jwt verification with try-catch block. Below NestJS Example:

try {
  const token = request.headers.authorization.split(' ')[1];
  const decoded = this.jwtService.verify(token);  // can throw 500 without try-catch
  // ...
  return true;
} catch (error) {
  throw new UnauthorizedException('Invalid token'); // 401
}

OchotaDariusz avatar Mar 29 '23 11:03 OchotaDariusz