gax-go icon indicating copy to clipboard operation
gax-go copied to clipboard

APIError.GRPCStatus returns "OK" for all http errors.

Open ribrdb opened this issue 2 years ago • 19 comments

When using apierror.FromError with an http error, the status field does not get populated, so the GRPCStatus() method returns nil. It seems like this is intended to mean "no status is available". However, the status package treats nil as "OK": https://github.com/grpc/grpc-go/blob/dba26e15a07f43875ccf806a2dd6cbcbc1c12eab/internal/status/status.go#L72

The status package also says that no non-nil error should have status of "OK". To properly fulfil this contract, it seems that the APIError.status should never be nil. I would expect it to at least return UNKNOWN, but it looks like the json error may also include a status field with a grpc status code, and some of the http status code seem to map directly to grpc codes (e.g. 404, 429, 503): https://cloud.google.com/apis/design/errors#handling_errors

ribrdb avatar Feb 16 '23 21:02 ribrdb

@ribrdb The apierror.APIError type returned by apierror.FromError is intended to be used either with gRPC or HTTP. When used with HTTP, the status field should be ignored, and instead the httpErr should be regarded. The type is currently described as follows:

APIError wraps either a gRPC Status error or a HTTP googleapi.Error. It implements error and Status interfaces.

Do you think this description could be improved to make the situation clearer?

quartzmo avatar Feb 24 '23 00:02 quartzmo

How are we supposed to tell which one to use? It seems incorrect to say that it implements the "Status" interface. The status interface should never return nil from the GPRCStatus method unless the error is nil. Any code that checks for a GRPCStatus method has no way to know that the method is broken because this is an http error.

ribrdb avatar Feb 24 '23 00:02 ribrdb

Does the application code not have the context of whether the transport is gRPC or HTTP?

quartzmo avatar Feb 27 '23 18:02 quartzmo

I don't know how we would. Some google apis are returning errors wrapped with APIError. Is our code supposed to know whether that api is using grpc or rest? The main problem is that we have generic application code that uses the status.Code() method on errors that could be from anywhere. If this is an APIError wrapping an http error, we get OK and our code thinks there was no error. If we got at least UNKNOWN our code would not be broken, but it would certainly be preferable to use the status code that's in the http error if it's included.

ribrdb avatar Feb 27 '23 18:02 ribrdb

Some google apis are returning errors wrapped with APIError.

Client libraries (not APIs) return errors wrapped with APIError. If you can list some of the client libraries you are using, I may be able to answer whether that client is using grpc or rest.

we have generic application code that uses the status.Code() method on errors that could be from anywhere.

As you know, status.Code is from gRPC-Go. It is gRPC-specific. If you have generic application code that handles errors from both gRPC and HTTP clients, is there any way to make this context aware of which type of client is returning the current error? In that case, you could only check for status.Code if the originating client is gRPC.

quartzmo avatar Feb 27 '23 20:02 quartzmo

Well the way we checked before is to see if the error has a GRPCStatus method. But your error objects have a GRPCStatus method that returns something that method should never return. You are returning a nil status object. A nil status object means "no error ocurred" not "this wasn't a grpc error".

ribrdb avatar Feb 27 '23 20:02 ribrdb

Thanks, I think your usage makes sense and is clear now. I'll see what can be done.

quartzmo avatar Feb 27 '23 20:02 quartzmo

I'm going to change the title of this issue to indicate that apierror.APIError should support querying whether it wraps a gRPC Status error or a HTTP googleapi.Error. Let me know however if you think there is some other issue to address here.

quartzmo avatar Feb 27 '23 20:02 quartzmo

I don't think adding a way to query solves the problem. Then code still has to be aware of apierror.APIError. Existing code that just uses the grpc status won't know that it has to query before it can use any of the methods in the grpc status package. It seems like either apierror.ParseError should return an object that has no GRPCStatus method for non-grpc errors, or else the GRPCStatus method should actually follow the conventions of the status package and never return nil.

Why can't setDetailsFromError do

case isHTTPErr:
		a.httpErr = herr
		a.details = parseHTTPDetails(herr)
		a.status = status.New(codes.UNKNOWN, herr.Message)

Or have parseHTTPDetails return the value of e.GetError().GetStatus() if it's there?

ribrdb avatar Feb 27 '23 20:02 ribrdb

The current implementation of apierror.APIError makes it so that status.Code and status.FromError are broken. Either apierror.APIError should be updated so that the status package isn't broken, or the status package should be udpated to handle GRPCStatus returning nil. But it seems to me that grpc-go shouldn't have to update their code because you want to redefine how the GRPCStatus method behaves.

ribrdb avatar Feb 27 '23 21:02 ribrdb

Hey @ribrdb, @quartzmo will go ahead with your suggested fix here.

Would mind sharing which API clients you are using? I just want to learn more, that is all. Obviously there is a gap in how we anticipated this being used by end users and how it actually is. Specifically, when you said the following:

Some google apis are returning errors wrapped with APIError. Is our code supposed to know whether that api is using grpc or rest?

My first thought was "Yes, this should be known to the user". Now I realize that we were wrong in making that assumption, and this is definitely not the fault of the user.

Some more thinking/context you may (or may not) care about: With the clients in cloud.google.com/go, one must explicitly change their code to use HTTP/JSON (e.g. language.NewRESTClient). All of the clients in google.golang.org/api use HTTP/JSON. So the thinking was that if one were to invoke a one of these clients and inspect the errors, they'd be explicitly changing/writing code to invoke an HTTP/JSON API and would not use gRPC Status APIs to inspect the error. The naivety in our thinking here I'd say is that there could be some abstraction that uses multiple clients with varying transports, bubbling up errors to a centralized handler that needs to inspect the error type to know how to handle it.

noahdietz avatar Feb 27 '23 22:02 noahdietz

Here's some background context on our side. We've been using various Google Cloud apis for many years. So I guess we've gotten used to thinking "Google apis return grpc errors." Our own internal apis also use grpc errors. So that's what our team is used to. We've more recently started using the Google Classroom api. At first we were surprised it wasn't returning the same type of errors. In some of our code we do remember to specially handle the http errors, but in other cases the http error is getting returned directly to other code, and was implicitly relying on this getting translated to codes.UNKNOWN by the status package.

When we found the apierror package we tried wrapping the http errors from Google Classroom so we could get access to the error details for debugging. But this caused many things to break because the status changed from UNKNOWN to OK.

It also seems like classroom is returning grpc statuses because we see the status code names in the error messages. It would be nice if we could access that directly because the grpc ones seem more specific, but the googleapi.Error only exposes the http status code, even if the response contains a grpc status.

ribrdb avatar Feb 28 '23 01:02 ribrdb

@ribrdb You're using google.golang.org/api/classroom/v1?

quartzmo avatar Feb 28 '23 16:02 quartzmo

Yes.

I just want to mention what has been the most confusing thing for us. When we use the classroom api, we see errors that look a lot like grpc errors. For example, they say "permission denied" instead of "forbidden". But then we can't handle them using the concise if status.Code(err) == codes.PermissionDenied . Instead we have to remember to cast them to a googleapi.Error and check for http.StatusForbidden. Or for example right now we're getting errors in the logs that say failed precondition. I don't even know how we're supposed to handle that in our go code because the http code 400 is used for multiple different errors. Maybe we have to look at the error details? It would certainly be a lot easier if status.Code just returned codes.FailedPrecondition.

ribrdb avatar Feb 28 '23 17:02 ribrdb

The fix in #260 has been released in 2.7.1.

quartzmo avatar Mar 09 '23 16:03 quartzmo

We've been hit by this too, this becomes nasty when we make use of error unwrapping. Our code looks like this:

var res interface { GRPCStatus() *status.Status }
if errors.As(err, &res) {
  return res.GRPCStatus()
}

fsaintjacques avatar Apr 12 '23 16:04 fsaintjacques

I think we might need to look into conditionally supporting the GRPCStatus interface i.e. removing the method from the APIError surface and embedding the APIError in a private type that implements that interface only when it is actually a gRPC Status related error. So we don't break existing errors.As|Is or status.FromError usage, but don't confuse HTTP errors for gRPC.

That said, users will always need to adjust their error handling based on when they are using HTTP or gRPC as a transport.

noahdietz avatar Apr 12 '23 17:04 noahdietz

@codyoss wdyt?

noahdietz avatar Apr 12 '23 17:04 noahdietz

I think this is better now with grpc 1.55: https://github.com/grpc/grpc-go/releases/tag/v1.55.0

"status: status.Code and status.FromError handle wrapped errors"

codyoss avatar Jul 05 '23 19:07 codyoss