
OpenCost won't start with 400 "reason": "RESOURCE_PROJECT_INVALID"

Open DD5HT opened this issue 1 year ago • 7 comments

Describe the bug

OpenCost fails to start in GCP with a 400 "reason": "RESOURCE_PROJECT_INVALID".

ERR Failed to lookup reserved instance data: googleapi: Error 400
Details:
[
  {
    "@type": "type.googleapis.com/google.rpc.ErrorInfo",
    "domain": "googleapis.com",
    "metadatas": {
      "method": "compute.v1.RegionCommitmentsService.AggregatedList",
      "service": "compute.googleapis.com"
    },
    "reason": "RESOURCE_PROJECT_INVALID"
  }
]
, invalidParameter

To Reproduce

Steps to reproduce the behavior: Create an API key with Compute and Billing API access.

Expected behavior

It works.

Which version of OpenCost are you using?

quay.io/kubecost1/kubecost-cost-model:prod-1.107.0-amd64@sha256:fc4b68f7c1d5d734c26ffffeff858100617fdc1a8c07827634cbfbce484d49f3

Additional context

I used terraform to generate the KEY and used it like this:

        - name: CLOUD_PROVIDER_API_KEY
          valueFrom:
            secretKeyRef:
              key: cloud-provider-api-key
              name: opencost

resource "google_project_service" "project" {
  project  = var.google_project_id
  service  = "cloudbilling.googleapis.com"
}

resource "google_project_service" "api_keys" {
  project  = var.google_project_id
  service  = "apikeys.googleapis.com"
}

resource "google_apikeys_key" "billing_key" {
  name         = "cloudbillingopencostkey"
  display_name = "cloudbilling-api-opencost-key"
  project      = var.google_project_id

  restrictions {
    api_targets {
      service = "cloudbilling.googleapis.com"
      methods = ["GET*"]
    }
    api_targets {
      service = "compute.googleapis.com"
      methods = ["GET*"]
    }
  }
  depends_on = [google_project_service.project, google_project_service.api_keys]
}

resource "kubernetes_secret_v1" "opencost_api_key" {
  metadata {
    name      = "opencost"
    namespace = "opencost"
  }

  data     = { "cloud-provider-api-key" = google_apikeys_key.billing_key.key_string }
}

DD5HT avatar Nov 22 '23 12:11 DD5HT

What is your resource project name? It might contain a character we don't expect or consider invalid.

lmello avatar Nov 27 '23 20:11 lmello

This doesn't affect OpenCost working for Kubernetes pricing, so I think it should be a WRN instead of an ERR.

It may be a vestige of the Cloud Costs integration. We may need to make our log messages more specific, to distinguish the case where Cloud Costs are enabled and not working from the case where they are disabled and OpenCost shouldn't be complaining.

mattray avatar Nov 28 '23 05:11 mattray

> What is your resource project name? Might have some character we don't expect or consider invalid.

@lmello No weird project name, just the following schema: tenletters-tenletters

I'm also not sure where opencost gets the project name from.

DD5HT avatar Dec 06 '23 09:12 DD5HT

In any case, I'm surprised we're actually crashlooping on this. Let's get this prioritized so that it at least doesn't crash. cc @cliffcolvin

AjayTripathy avatar Dec 08 '23 04:12 AjayTripathy

I have a similar error but I am not sure how it affects the behavior of the application because Opencost UI seems to be working fine and I get some data about the costs. Maybe default costs are used?

I also see another message logged just above the one mentioned: WRN Failed to load auth secret, or was not mounted: Secret does not exist

Might those errors be connected?

ERR Failed to lookup reserved instance data: googleapi: Error 400: Invalid resource field value in the request.
Details:
[
  {
    "@type": "type.googleapis.com/google.rpc.ErrorInfo",
    "domain": "googleapis.com",
    "metadatas": {
      "method": "compute.v1.RegionCommitmentsService.AggregatedList",
      "service": "compute.googleapis.com"
    },
    "reason": "RESOURCE_PROJECT_INVALID"
  }
]
, invalidParameter

OliverChmelicky avatar Feb 16 '24 07:02 OliverChmelicky

Did you have your CLOUD_PROVIDER_API_KEY set? You can do that from an environment variable or with the Helm chart. https://www.opencost.io/docs/configuration/gcp#configuring-gcp-pricing

mattray avatar Feb 21 '24 04:02 mattray

Yes. Without cloudProviderApiKey set, an OpenCost instance managed through the Helm chart is not able to run. The chart's default value is an empty string, and the Helm examples in the docs do not mention that this environment variable needs to be set.

On the other hand, the quickstart demo manual uses the Kubernetes manifest from this file, which carries the comment "The GCP Pricing API requires a key. This is supplied just for evaluation."

So, to answer your question: yes, I had CLOUD_PROVIDER_API_KEY set, with unrestricted privileges. I was using this Helm values.yml file:

opencost:
  exporter:
    cloudProviderApiKey: "<cloud_provider_api_key_value>"

Warnings and Errors I get are in this order:

  1. WRN Failed to load auth secret, or was not mounted: Secret does not exist
  2. ERR Failed to lookup reserved instance data: googleapi: Error 400: Invalid resource field value in the request.

OliverChmelicky avatar Feb 22 '24 02:02 OliverChmelicky

I've updated the docs to focus on the required steps per cloud provider. The GCP instructions include the CLOUD_PROVIDER_API_KEY steps: https://www.opencost.io/docs/configuration/gcp#add-the-gcp-api-key-to-opencost

mattray avatar Mar 12 '24 04:03 mattray

We are still seeing this error and I have verified that the API Key has correct access and the key is available as CLOUD_PROVIDER_API_KEY environment variable. In a similar environment with the same setup we do not get this error.

sfrolich avatar Apr 24 '24 15:04 sfrolich

@sfrolich What's different between your 2 environments?

mattray avatar Apr 24 '24 23:04 mattray

> @sfrolich What's different between your 2 environments?

Nothing really, except that they are in 2 different GKE clusters. Both run the same version of GKE and the same version of OpenCost with the same configuration setup (API key set up through Secret/External Secret/API Key).

I checked my environment again today and the opencost container is no longer erroring out. It makes no sense that it would "eventually" start working. In the log I did see: DBG [Reserved] No Reserved Instances. So it appears that it was able to check the reserved instances, but there are none (we indeed do not have any reserved instances).

So @mattray what do you think could cause it to eventually succeed?

sfrolich avatar Apr 25 '24 15:04 sfrolich

I have the same issue. I correctly set the "cloudProviderApiKey" following the new doc. However, the following logs are displayed:

WRN Failed to load auth secret, or was not mounted: Secret does not exist.
ERR Failed to lookup reserved instance data: googleapi: Error 400: Invalid resource field value in the request.
Details:
[
  {
    "@type": "type.googleapis.com/google.rpc.ErrorInfo",
    "domain": "googleapis.com",
    "metadatas": {
      "method": "compute.v1.RegionCommitmentsService.AggregatedList",
      "service": "compute.googleapis.com"
    },
    "reason": "RESOURCE_PROJECT_INVALID"
  }
]

Note:

  1. The api key was created without limits
  2. I created the secret open-costs following the procedure and it is correctly mounted.

@mattray @cliffcolvin

stefano-arre avatar Apr 29 '24 12:04 stefano-arre

> So @mattray what do you think could cause it to eventually succeed?

Really hard to tell. Is it possible someone updated permissions on a role between deployments?

mattray avatar May 01 '24 06:05 mattray

Could you try 1.110.0 and see if the issue persists? Since this one's been closed, maybe we should start a new issue against 1.110.0, and we'll try to clear the logs out if it's working.

mattray avatar May 01 '24 06:05 mattray

@mattray The resolution for me was to remove [GET*]: https://github.com/opencost/opencost-helm-chart/issues/87#issuecomment-1652318430

The error persists, but the pod isn't constantly restarting.

sfrolich avatar May 24 '24 21:05 sfrolich
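
Applied to the Terraform from the original report, the change sfrolich describes (removing the ["GET*"] method filter) might look roughly like the sketch below. This is an illustration, not a confirmed fix: it assumes the filter is dropped from both api_targets blocks, which the thread does not spell out, and it reuses the resource names and variables from the original report unchanged.

```hcl
# Sketch only: the API key from the original report, with the
# `methods = ["GET*"]` filter removed from each api_targets block.
# Assumption: both blocks are changed; the thread does not say which.
resource "google_apikeys_key" "billing_key" {
  name         = "cloudbillingopencostkey"
  display_name = "cloudbilling-api-opencost-key"
  project      = var.google_project_id

  restrictions {
    # With no `methods` argument, the key is restricted by service only.
    api_targets {
      service = "cloudbilling.googleapis.com"
    }
    api_targets {
      service = "compute.googleapis.com"
    }
  }
  depends_on = [google_project_service.project, google_project_service.api_keys]
}
```

As reported above, this stopped the pod from restarting, but the RESOURCE_PROJECT_INVALID message still appeared in the log, so it addresses the crash loop rather than the error itself.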