gcp-metadata

Consider option to bypass Metadata-Flavor response header check

Open nexdrew opened this issue 7 months ago • 3 comments

What would you like to see in the library?

My team recently experienced a problem in one of our applications using the @google-cloud/secret-manager package, where secret lookups started failing with the following error (originating from this library):

/var/opt/runbot/api/node_modules/@google-cloud/secret-manager/node_modules/gcp-metadata/build/src/index.js:121
            throw new Error(`Invalid response from metadata service: incorrect ${exports.HEADER_NAME} header.`);
                  ^
Error: Invalid response from metadata service: incorrect Metadata-Flavor header.
    at metadataAccessor (/var/opt/runbot/api/node_modules/@google-cloud/secret-manager/node_modules/gcp-metadata/build/src/index.js:121:19)
    at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
    at async GoogleAuth._GoogleAuth_getUniverseFromMetadataServer (/var/opt/runbot/api/node_modules/@google-cloud/secret-manager/node_modules/google-auth-library/build/src/auth/googleauth.js:791:26)
    at async GoogleAuth.getUniverseDomain (/var/opt/runbot/api/node_modules/@google-cloud/secret-manager/node_modules/google-auth-library/build/src/auth/googleauth.js:190:168)
    at async GoogleAuth.getApplicationDefaultAsync (/var/opt/runbot/api/node_modules/@google-cloud/secret-manager/node_modules/google-auth-library/build/src/auth/googleauth.js:256:42)
    at async GoogleAuth.getClient (/var/opt/runbot/api/node_modules/@google-cloud/secret-manager/node_modules/google-auth-library/build/src/auth/googleauth.js:678:17)
    at async GrpcClient._getCredentials (/var/opt/runbot/api/node_modules/@google-cloud/secret-manager/node_modules/google-gax/build/src/grpc.js:145:24)
    at async GrpcClient.createStub (/var/opt/runbot/api/node_modules/@google-cloud/secret-manager/node_modules/google-gax/build/src/grpc.js:308:23)

Upon investigation, we discovered the following:

  1. Our application was using version 5.0.1 of @google-cloud/secret-manager, which has a transitive dependency on google-auth-library (via the google-gax dependency) with a semver range of ^9.0.0.
  2. In version 9.10.0, google-auth-library changed from requesting a universe_domain metadata endpoint (underscore) to a universe-domain endpoint (dash). See changes to the src/auth/googleauth.ts module.
  3. The GCP metadata service returns different responses for the universe_domain (legacy) endpoint to our GCE hosts in different zones. We had one GCE instance in zone europe-west1-b receiving a 404 HTML response WITH the necessary Metadata-Flavor: Google response header, and one GCE instance in europe-west1-c receiving a 200 text response WITHOUT the necessary Metadata-Flavor: Google response header. Why this is the case, I have no idea.
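For anyone who wants to check how their own hosts behave, here is a rough diagnostic sketch that requests both endpoint spellings and reports the status code and Metadata-Flavor response header for each. The metadata host name, the universe/ path prefix, and the helper names (probeEndpoint, compareUniverseEndpoints) are assumptions for illustration, not taken from gcp-metadata or google-auth-library. It relies on the global fetch available in Node 18+, and the fetch implementation can be swapped out for testing.

```javascript
// Hypothetical diagnostic sketch. The base URL and the endpoint paths
// are assumptions about the metadata server layout.
const METADATA_BASE = 'http://metadata.google.internal/computeMetadata/v1';

// Requests a single metadata path and summarizes what came back.
async function probeEndpoint(path, fetchImpl = globalThis.fetch) {
  const res = await fetchImpl(`${METADATA_BASE}/${path}`, {
    // The metadata server expects this header on requests.
    headers: {'Metadata-Flavor': 'Google'},
  });
  return {
    path,
    status: res.status,
    // null when the response header is absent
    flavor: res.headers.get('metadata-flavor'),
  };
}

// Probes the legacy (underscore) and newer (dash) spellings side by side.
async function compareUniverseEndpoints(fetchImpl = globalThis.fetch) {
  const paths = ['universe/universe_domain', 'universe/universe-domain'];
  return Promise.all(paths.map(p => probeEndpoint(p, fetchImpl)));
}
```

Running compareUniverseEndpoints().then(console.log) on hosts in different zones should make any difference in status or header visible.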

The proper fix is to upgrade our transitive google-auth-library dependency (within our package-lock.json tree) to a version >= 9.10.0. However, upgrading our direct dependency on @google-cloud/secret-manager to its latest version (5.6.0 as of this writing) DID NOT change the versions of the transitive dependencies google-auth-library or google-gax in our package-lock.json tree, presumably because the declared semver ranges (that release of @google-cloud/secret-manager declares "google-gax": "^4.0.3", and google-gax in turn declares "google-auth-library": "^9.3.0") were still satisfied by the versions already in our tree. This is a bit frustrating but is easily worked around by running npm i google-auth-library@^9.0.0 and then npm uninstall google-auth-library.

However, the GCP metadata service has an unknown release/update cycle and can apparently behave differently, in breaking ways, across zones within the same GCP region at the same time. Given that, it would have been quicker and easier for us to resolve our broken application if there were some way to tell this library (gcp-metadata) to log a warning when the metadata service response is missing an expected header, rather than always throwing an error.

My proposal is to introduce an environment variable (e.g. GCE_METADATA_ALLOW_MISSING_HEADER=true) that would allow for this behavior.
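To make the proposal concrete, here is a minimal sketch of what the check could look like. This is not the library's actual code: the function name checkMetadataFlavor and its parameters are hypothetical, and the environment variable is the one proposed above, which does not exist today. The error message mirrors the one in the stack trace.

```javascript
// Hypothetical sketch of the proposed behavior, not the actual
// gcp-metadata implementation.
const HEADER_NAME = 'Metadata-Flavor';
const HEADER_VALUE = 'Google';

// `headers` is a plain object with lower-cased keys (as Node.js delivers
// incoming headers). `env` and `warn` are injectable to ease testing.
function checkMetadataFlavor(headers, env = process.env, warn = console.warn) {
  if (headers[HEADER_NAME.toLowerCase()] === HEADER_VALUE) {
    return true; // header present and correct
  }
  const message =
    `Invalid response from metadata service: incorrect ${HEADER_NAME} header.`;
  // Proposed opt-in: warn instead of throwing when the variable is 'true'.
  if (env.GCE_METADATA_ALLOW_MISSING_HEADER === 'true') {
    warn(message);
    return false;
  }
  throw new Error(message);
}
```

With the variable unset, behavior is unchanged (the function throws). With GCE_METADATA_ALLOW_MISSING_HEADER=true, the caller gets a warning and can continue with the response body.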

It should only be needed in rare scenarios, but since the metadata service appears to behave inconsistently across hosts in different zones, this would provide an escape hatch if an unannounced breaking change to the metadata service is deployed.

Note that when I made this change manually to the installed version of this library on our application host, everything else worked as expected, including successful secret lookups from the Secret Manager.

Describe alternatives you've considered

No response

Additional context/notes

No response

nexdrew, Mar 13 '25 20:03