Frequent error logs on GCP API Gateway

Open erhan-talarian opened this issue 3 years ago • 5 comments

Hi, I have an App Engine Standard Java 11 application served behind a GCP API Gateway. When I check the Cloud Logs, I frequently see errors like the following from API Gateway:

{
insertId: "5e1e0f456172eda03520c9fae065f312-1@a1"
jsonPayload: {
api: "//apigateway.googleapis.com/projects/abc/locations/global/apis/abc"
apiConfig: "//apigateway.googleapis.com/projects/abc/locations/global/apis/abc/configs/abc"
httpRequest: {
duration: "0ms"
hostname: "servicecontrol.googleapis.com"
httpVersion: "HTTP/1.1"
path: "/v1/services/abc.apigateway.abc.cloud.goog:report"
requestSize: "5003"
responseSize: "95"
status: 503
}
serviceConfig: "//servicemanagement.googleapis.com/services/abc.apigateway.abc.cloud.goog/configs/abc"
}
logName: "projects/abc/logs/apigateway.googleapis.com%2Fservice_control_queries"
receiveTimestamp: "2022-06-20T13:19:37.047238698Z"
resource: {
labels: {
gateway_id: "abc"
location: "us-central1"
resource_container: "projects/abc"
}
type: "apigateway.googleapis.com/Gateway"
}
severity: "ERROR"
timestamp: "2022-06-20T13:19:29.777308345Z"
}

What do these errors mean? Should I be concerned?

Thank you

erhan-talarian avatar Jun 21 '22 08:06 erhan-talarian

This means Google's Service Control service was unavailable (503). How often do you see it? Is it good now?

TAOXUY avatar Jun 21 '22 15:06 TAOXUY

I checked the logs from the last 14 days, and 9 of those days have occurrences of this error. The minimum is 1 error per day and the maximum is 65 errors per day (the API averages 3 requests/sec). Normally I wouldn't be bothered, but I suspect some requests never reach the App Engine application (no logs/errors) and API GW responds with HTTP 500 (sometimes 502) directly. That's why I am a little concerned.

erhan-talarian avatar Jun 21 '22 17:06 erhan-talarian

Errors with path = "/v1/services/abc.apigateway.abc.cloud.goog:report" are less severe than errors with path = "/v1/services/abc.apigateway.abc.cloud.goog:check". The former is called at the end of each request to send telemetry info; when it fails, the request data does not show up in the graphs or in the access logs.

The latter checks access control, such as API keys. When it fails, requests may be rejected and never reach your App Engine application. Could you check whether there are such errors?
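To make the distinction concrete, here is a minimal sketch that sorts log entries by which Service Control method failed. It assumes the entry has been exported as JSON with the same `jsonPayload.httpRequest.path` layout as the log above; the function name `classify_service_control_error` is hypothetical and not part of any Google library.

```python
def classify_service_control_error(entry: dict) -> str:
    """Return 'report', 'check', or 'other' based on the called path.

    Assumes the entry mirrors the structure of the log shown in this issue.
    """
    path = entry.get("jsonPayload", {}).get("httpRequest", {}).get("path", "")
    if path.endswith(":report"):
        return "report"  # telemetry only; failure loses graphs/access logs
    if path.endswith(":check"):
        return "check"   # access control; failure may reject the request
    return "other"


# Trimmed-down copy of the entry from the original report.
entry = {
    "jsonPayload": {
        "httpRequest": {
            "path": "/v1/services/abc.apigateway.abc.cloud.goog:report",
            "status": 503,
        }
    },
    "severity": "ERROR",
}

print(classify_service_control_error(entry))  # prints "report"
```

Counting the "check" entries separately from the "report" entries shows quickly whether any of the errors could have affected request handling.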

qiwzhang avatar Jun 21 '22 18:06 qiwzhang

A failed call to /v1/services/abc.apigateway.abc.cloud.goog:report happens on the logging path, not the request path, so it won't affect your service availability. If requests are not being forwarded to your backend, the cause should be some other problem.

The SLO for report is 99.9%, so 65 errors per day at 3 QPS looks fine to me.
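As a rough check of that arithmetic, the sketch below compares the worst observed day against the 99.9% figure. It assumes one report call per request, which is a simplification (the real number of report calls may differ, for example if reports are batched), and the names used are illustrative only.

```python
SECONDS_PER_DAY = 24 * 60 * 60
ERROR_BUDGET = 0.001  # 100% minus the 99.9% SLO mentioned above


def report_failure_ratio(errors_per_day: int, qps: float) -> float:
    """Fraction of report calls that failed, assuming one call per request."""
    calls_per_day = qps * SECONDS_PER_DAY
    return errors_per_day / calls_per_day


ratio = report_failure_ratio(errors_per_day=65, qps=3)

print(f"{ratio:.4%}")        # prints "0.0251%"
print(ratio < ERROR_BUDGET)  # prints "True"
```

At 3 QPS there are 259,200 requests per day, so 65 failures is about a quarter of the 0.1% budget under this assumption.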

TAOXUY avatar Jun 21 '22 18:06 TAOXUY

There are also errors for "/v1/services/abc.apigateway.abc.cloud.goog:check", but less frequent (3 days out of last 14, max 17 errors/day). However, the timings for these errors don't match the requests I suspect of not reaching App Engine, so I guess they are "innocent".
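The timing comparison described here can be sketched as follows. This is only an illustration: the timestamps are made-up placeholders in the same format as the log above, the 5-second window is an arbitrary assumption, and `parse_ts` and `suspects_near_errors` are hypothetical helper names.

```python
from datetime import datetime, timedelta, timezone


def parse_ts(ts: str) -> datetime:
    """Parse a Cloud Logging style timestamp, truncating nanoseconds to microseconds."""
    base, _, frac = ts.rstrip("Z").partition(".")
    micros = (frac + "000000")[:6]
    parsed = datetime.strptime(f"{base}.{micros}", "%Y-%m-%dT%H:%M:%S.%f")
    return parsed.replace(tzinfo=timezone.utc)


def suspects_near_errors(suspects, check_errors, window=timedelta(seconds=5)):
    """Return the suspect timestamps that have a :check error within the window."""
    error_times = [parse_ts(e) for e in check_errors]
    return [
        s for s in suspects
        if any(abs(parse_ts(s) - e) <= window for e in error_times)
    ]


# Placeholder values for illustration, not taken from real logs.
check_errors = ["2022-06-18T10:00:01.500000000Z"]
suspects = ["2022-06-18T10:00:03Z", "2022-06-19T08:30:00Z"]

print(suspects_near_errors(suspects, check_errors))  # prints "['2022-06-18T10:00:03Z']"
```

An empty result for the real data would be consistent with the conclusion that the check errors are unrelated to the missing requests.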

erhan-talarian avatar Jun 21 '22 19:06 erhan-talarian