google-auth-library-python

DefaultCredentialsError after Compute Engine Metadata server failures

Open Nishith95 opened this issue 3 years ago • 6 comments

Still seeing the same issue mentioned in https://github.com/googleapis/google-auth-library-python/issues/211

Environment details

  • OS: Linux/Container-Optimized OS
  • Python version: 3.8.5
  • pip version: 20.1.1
  • google-auth version: 1.33.0

Steps to reproduce

  1. cred, _ = default(scopes=scopes) fails after successfully running for a prolonged period

Context: I'm running a Python multiprocessing service inside a Docker container on Google's Container-Optimized OS (cos-stable-89-16108-470-1). The processes typically run BigQuery queries that they consume from Pub/Sub topics, and they send logs to other Pub/Sub topics. Default authentication from the google-auth package works fine for some time (typically ~18 hours) while the processes continuously run BigQuery commands. Eventually the processes fail to connect to the Compute Engine Metadata server, which results in DefaultCredentialsError (along with other Google connection errors).

Tracebacks:

Jul 07 21:41:46 kronos-staging-eden-1 docker[208805]:     cred, _ = default(scopes=scopes)
Jul 07 21:41:46 kronos-staging-eden-1 docker[208805]:   File "/pyenv/versions/kronos-cloud-worker-deploy/lib/python3.8/site-packages/google/auth/_default.py", line 483, in default
Jul 07 21:41:46 kronos-staging-eden-1 docker[208805]:     raise exceptions.DefaultCredentialsError(_HELP_MESSAGE)
Jul 07 21:41:46 kronos-staging-eden-1 docker[208805]: google.auth.exceptions.DefaultCredentialsError: Could not automatically determine credentials. Please set GOOGLE_APPLICATION_CREDENTIALS or explicitly create credentials and re-run the application. For more information, please see https://cloud.google.com/docs/authentication/getting-started
Jul 16 13:26:30 kronos-staging-eden1-1 docker[930391]:     table = client.get_table(
Jul 16 13:26:30 kronos-staging-eden1-1 docker[930391]:   File "/pyenv/versions/kronos-cloud-worker-deploy/lib/python3.8/site-packages/google/cloud/bigquery/client.py", line 994, in get_table
Jul 16 13:26:30 kronos-staging-eden1-1 docker[930391]:     api_response = self._call_api(
Jul 16 13:26:30 kronos-staging-eden1-1 docker[930391]:   File "/pyenv/versions/kronos-cloud-worker-deploy/lib/python3.8/site-packages/google/cloud/bigquery/client.py", line 741, in _call_api
Jul 16 13:26:30 kronos-staging-eden1-1 docker[930391]:     return call()
Jul 16 13:26:30 kronos-staging-eden1-1 docker[930391]:   File "/pyenv/versions/kronos-cloud-worker-deploy/lib/python3.8/site-packages/google/api_core/retry.py", line 285, in retry_wrapped_func
Jul 16 13:26:30 kronos-staging-eden1-1 docker[930391]:     return retry_target(
Jul 16 13:26:30 kronos-staging-eden1-1 docker[930391]:   File "/pyenv/versions/kronos-cloud-worker-deploy/lib/python3.8/site-packages/google/api_core/retry.py", line 188, in retry_target
Jul 16 13:26:30 kronos-staging-eden1-1 docker[930391]:     return target()
Jul 16 13:26:30 kronos-staging-eden1-1 docker[930391]:   File "/pyenv/versions/kronos-cloud-worker-deploy/lib/python3.8/site-packages/google/cloud/_http.py", line 473, in api_request
Jul 16 13:26:30 kronos-staging-eden1-1 docker[930391]:     response = self._make_request(
Jul 16 13:26:30 kronos-staging-eden1-1 docker[930391]:   File "/pyenv/versions/kronos-cloud-worker-deploy/lib/python3.8/site-packages/google/cloud/_http.py", line 337, in _make_request
Jul 16 13:26:30 kronos-staging-eden1-1 docker[930391]:     return self._do_request(
Jul 16 13:26:30 kronos-staging-eden1-1 docker[930391]:   File "/pyenv/versions/kronos-cloud-worker-deploy/lib/python3.8/site-packages/google/cloud/_http.py", line 375, in _do_request
Jul 16 13:26:30 kronos-staging-eden1-1 docker[930391]:     return self.http.request(
Jul 16 13:26:30 kronos-staging-eden1-1 docker[930391]:   File "/pyenv/versions/kronos-cloud-worker-deploy/lib/python3.8/site-packages/google/auth/transport/requests.py", line 476, in request
Jul 16 13:26:30 kronos-staging-eden1-1 docker[930391]:     self.credentials.before_request(auth_request, method, url, request_headers)
Jul 16 13:26:30 kronos-staging-eden1-1 docker[930391]:   File "/pyenv/versions/kronos-cloud-worker-deploy/lib/python3.8/site-packages/google/auth/credentials.py", line 133, in before_request
Jul 16 13:26:30 kronos-staging-eden1-1 docker[930391]:     self.refresh(request)
Jul 16 13:26:30 kronos-staging-eden1-1 docker[930391]:   File "/pyenv/versions/kronos-cloud-worker-deploy/lib/python3.8/site-packages/google/auth/compute_engine/credentials.py", line 117, in refresh
Jul 16 13:26:30 kronos-staging-eden1-1 docker[930391]:     six.raise_from(new_exc, caught_exc)
Jul 16 13:26:30 kronos-staging-eden1-1 docker[930391]:   File "<string>", line 3, in raise_from
Jul 16 13:26:30 kronos-staging-eden1-1 docker[930391]: google.auth.exceptions.RefreshError: Failed to retrieve http://metadata.google.internal/computeMetadata/v1/instance/service-accounts/default/?recursive=true from the Google Compute Enginemetadata service. Compute Engine Metadata server unavailable

The previously reported issue (https://github.com/googleapis/google-auth-library-python/issues/211) describes the same errors, but that thread was closed after fixes were pushed to later versions. I'm opening a new issue because I'm still seeing these errors in the latest version.

Nishith95 avatar Jul 22 '21 15:07 Nishith95

I have seen the workaround in https://github.com/googleapis/google-auth-library-python/issues/211#issuecomment-369665130 recommended as a solution, but it is not possible for us at this time. I'm looking for a fix to the root cause here (the metadata server issues).

Nishith95 avatar Jul 22 '21 15:07 Nishith95

I am not sure what the latest recommendations are around the Metadata server - @arithmetic1728 @silvolu Could you take a look?

busunkim96 avatar Jul 22 '21 19:07 busunkim96

Any update here?

Nishith95 avatar Sep 15 '21 22:09 Nishith95

Any update regarding this issue?

mazzi avatar Mar 30 '22 20:03 mazzi

You can probably create a single global credential by calling cred, _ = default(scopes=scopes) once and passing cred to every client that uses it (instead of letting each client create its own). This reduces the load on the metadata server and may solve the issue.

arithmetic1728 avatar Mar 30 '22 23:03 arithmetic1728
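
The suggestion above can be sketched roughly as follows. This is a minimal illustration, not code from the thread: the SharedCredentials class and its loader parameter are hypothetical names introduced here, and the sketch assumes that google.auth.default is the default(...) call from the report. The loader hook exists only so the caching behaviour can be exercised without a metadata server.

```python
import threading


class SharedCredentials:
    """Hypothetical helper: load default credentials once, then reuse them."""

    def __init__(self, scopes, loader=None):
        self._scopes = list(scopes)
        # `loader` is an assumption of this sketch: any callable with the
        # same shape as google.auth.default(scopes=...) -> (credentials, project).
        self._loader = loader
        self._lock = threading.Lock()
        self._credentials = None

    def get(self):
        """Return the cached credentials, loading them on first use only."""
        with self._lock:
            if self._credentials is None:
                loader = self._loader
                if loader is None:
                    # Imported lazily so the class can be tested without google-auth.
                    import google.auth

                    loader = google.auth.default
                self._credentials, _project = loader(scopes=self._scopes)
            return self._credentials
```

A shared instance could then be passed to each client, for example bigquery.Client(credentials=shared.get()) and pubsub_v1.PublisherClient(credentials=shared.get()), so that the clients do not each run their own credential discovery. In a multiprocessing service, each worker process would still hold its own copy and refresh its own tokens, so this reduces metadata server traffic but may not eliminate the failures.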

Good one. Thanks @arithmetic1728

mazzi avatar Mar 31 '22 13:03 mazzi