terraform-provider-grafana
Grafana Provider Error - "Set the auth and url provider attributes"
Terraform Version
- Terraform: 1.5.2
- Terraform Grafana Provider: 1.42.0
- Grafana: 9.4
Affected Resource(s)
Error raised by Grafana Provider directly
provider "grafana" {
  url  = "https://${module.core_infra.grafana_workspace_endpoint}"
  auth = module.core_infra.grafana_api_key
}
Where core_infra is a module which instantiates an Amazon Managed Grafana instance using terraform-aws-modules/managed-service-grafana
Resources being deployed:
- grafana_data_source
- grafana_folder
- grafana_dashboard
Terraform Configuration Files
Working on an example I can share.
Debug Output
Working on an example I can share the full output from. An example error is as below:
{
  "@level": "error",
  "@message": "Error: the Grafana client is required for `grafana_folder`. Set the auth and url provider attributes",
  "@module": "terraform.ui",
  "@timestamp": "2023-06-29T13:23:12.126952Z",
  "diagnostic": {
    "severity": "error",
    "summary": "the Grafana client is required for `grafana_folder`. Set the auth and url provider attributes",
    "detail": "",
    "address": "module.usecase_module_one.module.generic_grafana_module.grafana_folder.usecase_folder",
    "range": {
      "filename": ".terraform/modules/usecase_module_one.generic_grafana_module/grafana.tf",
      "start": { "line": 25, "column": 44, "byte": 917 },
      "end": { "line": 25, "column": 45, "byte": 918 }
    },
    "snippet": {
      "context": "resource \"grafana_folder\" \"usecase_folder\"",
      "code": "resource \"grafana_folder\" \"usecase_folder\" {",
      "start_line": 25,
      "highlight_start_offset": 43,
      "highlight_end_offset": 44,
      "values": []
    }
  },
  "type": "diagnostic"
}
Panic Output
N/A
Expected Behavior
We use "usecase" modules and have the following structure in our Terraform workspace:
Workspace
- core_infra module (contains Amazon Managed Grafana)
- Grafana Provider (using output from core_infra)
- usecase_module_one
  - generic_grafana_module (contains Grafana resources grafana_data_source, grafana_folder, grafana_dashboard)
- usecase_module_two
  - generic_grafana_module (contains Grafana resources grafana_data_source, grafana_folder, grafana_dashboard)
The Grafana Provider is created using the output of the core_infra module, specifically module.core_infra.grafana_workspace_endpoint and module.core_infra.grafana_api_key to configure the Provider with the "url" and "auth" parameters.
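As a rough sketch, the root module described above might be wired up as follows. The module `source` paths are hypothetical placeholders; only the module names, the two `core_infra` outputs, and the provider block are taken from the description.

```hcl
# Hypothetical root module layout; all source paths are assumptions.
module "core_infra" {
  source = "./modules/core_infra" # creates the Amazon Managed Grafana workspace
}

# The provider is configured from outputs of a module in the same root,
# so url and auth depend on values that Terraform itself manages.
provider "grafana" {
  url  = "https://${module.core_infra.grafana_workspace_endpoint}"
  auth = module.core_infra.grafana_api_key
}

module "usecase_module_one" {
  source = "./modules/usecase" # internally calls generic_grafana_module
}

module "usecase_module_two" {
  source = "./modules/usecase"
}
```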
We then add a new "usecase_module" using the same underlying "generic_grafana_module" as follows:
Current Workspace
- core_infra module
- grafana provider
- usecase_module_one
  - generic_grafana_module
- usecase_module_two
  - generic_grafana_module
- usecase_module_three
  - generic_grafana_module
This should update existing infra as required and add the grafana resources for "usecase_module_three".
Actual Behavior
When running the plan and apply to add "usecase_module_three" to our environment we get a Grafana Provider error only on resources related to "usecase_module_one" and "usecase_module_two". It successfully plans for the "usecase_module_three" deployment.
The error is identical to the example shown under Debug Output above.
We get an error like this for each Grafana resource in "usecase_module_one" and "usecase_module_two"
Steps to Reproduce
- Define modules as per the structure in "Expected Behaviour" with only "core_infra", "usecase_module_one", "usecase_module_two"
- Deploy this with terraform plan & terraform apply
- Add "usecase_module_three" to the terraform code
- terraform plan - Will fail as per behaviour in "Actual Behaviour"
Important Factoids
N/A
References
None
If you inspect your state, are both the module.core_infra attributes you're using set?
Hi @julienduchesne,
I checked the state file and could find the values referenced by those core_infra outputs in the infrastructure. I.e. I could find the module.core_infra.module.managed_grafana.endpoint which is mapped in the core_infra/outputs.tf as module.core_infra.grafana_workspace_endpoint.
Just to be 100% sure, I stored the module.core_infra.grafana_workspace_endpoint and module.core_infra.grafana_api_key outputs as terraform_data resources, which I then referenced in the Grafana Provider as below:
provider "grafana" {
  url  = terraform_data.workspace_endpoint_url.output
  auth = terraform_data.workspace_key.output
}
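For context, the two terraform_data resources referenced by that provider block were presumably defined along these lines. This is an assumed reconstruction, not the poster's code: the resource names come from the provider block, and the `https://` prefix is inferred from the original provider configuration.

```hcl
# Assumed definitions; only the resource names are taken from the thread.
resource "terraform_data" "workspace_endpoint_url" {
  # Stores the workspace URL so it is visible in state as input/output.
  input = "https://${module.core_infra.grafana_workspace_endpoint}"
}

resource "terraform_data" "workspace_key" {
  # Stores the API key; note that this places the key in state as well.
  input = module.core_infra.grafana_api_key
}
```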
I could clearly see these terraform_data values in the state file, see an extract below:
{
  "mode": "managed",
  "type": "terraform_data",
  "name": "workspace_endpoint_url",
  "provider": "provider[\"terraform.io/builtin/terraform\"]",
  "instances": [
    {
      "schema_version": 0,
      "attributes": {
        "id": "62eb70b9-a57a-af7e-67c4-048d26451738",
        "input": {
          "value": "https://g-28jd182fk9.grafana-workspace.us-east-1.amazonaws.com",
          "type": "string"
        },
        "output": {
          "value": "https://g-28jd182fk9.grafana-workspace.us-east-1.amazonaws.com",
          "type": "string"
        },
        "triggers_replace": null
      },
      "sensitive_attributes": [],
      "dependencies": [
        "module.core_infra.aws_iam_role.grafana_service_role",
        "module.core_infra.module.managed_grafana.aws_grafana_workspace.this",
        "module.core_infra.module.managed_grafana.aws_iam_role.this",
        "module.core_infra.module.managed_grafana.aws_security_group.this",
        "module.core_infra.module.managed_grafana.data.aws_iam_policy_document.assume",
        "module.core_infra.module.managed_grafana.data.aws_partition.current",
        "module.core_infra.module.managed_grafana.data.aws_subnet.this"
      ]
    }
  ]
},
Still had the same Provider error.
Another thing to note is that I also tried a plan & apply using the existing TF (without "usecase_module_three" being added) and the plan fails in the same manner as in the original issue comment.
If I hardcode a dummy Grafana workspace URL and API Key,
provider "grafana" {
  url  = "https://grafana.example.com"
  auth = "somekey"
}
Then the plan fails with an error
│ Error: Get "https://grafana.example.com/api/folders?limit=1000&page=1": dial tcp: lookup grafana.example.com on 10.184.0.2:53: no such host
│
│ with module.usecase_module_one.module.inference_infra.grafana_folder.usecase_folder,
│ on .terraform/modules/usecase_module_one.inference_infra/grafana.tf line 25, in resource "grafana_folder" "usecase_folder":
│ 25: resource "grafana_folder" "usecase_folder" {
A further note: I retried the plan & apply with the existing TF (no changes) and hardcoded a different dummy Grafana workspace URL and API key, and then the plan passed?!
Specifically I used:
provider "grafana" {
  url  = "https://example.com"
  auth = "somekey"
}
A plan of an unapplied resource will not do any remote calls
Thanks @julienduchesne. If you read the message before that one, then it is making calls during the plan phase as the provider is erroring saying it's unable to reach the Grafana API endpoint.
Yes. If it's doing a remote call during a plan, it means it's doing a refresh of a resource that was previously applied
I'm also having this issue. I have three environments that are managed via the same code using terraform workspaces. Two of them fail with this same error, and the other one succeeds. All three were created at similar times and should have valid states, as they are managed by TF cloud and all had passing runs on their last apply before this issue popped up.
The code for our deployments is heavily influenced by the example docs.
Here are the relevant resources:
# Declaring the first provider to be only used for creating the cloud-stack
provider "grafana" {
  alias = "first"
}
# Declaring the second provider to be used for creating resources in Grafana
provider "grafana" {
  alias = "second"
  url   = grafana_cloud_stack.target_env.url
  auth  = grafana_api_key.importer.key
}
resource "grafana_cloud_stack" "target_env" {
  provider    = grafana.first
  name        = var.environment_name
  slug        = "<redacted>${var.environment_name}"
  region_slug = "us" # Example "us", "eu" etc
  url         = "https://${var.environment_name}.<redacted>"
}
# Creating an API key in Grafana instance to be used for creating resources in Grafana instance
resource "grafana_api_key" "importer" {
  cloud_stack_slug = grafana_cloud_stack.target_env.slug
  name             = "importer"
  role             = "Admin"
}
resource "grafana_folder" "target_folder" {
  provider = grafana.second
  title    = "target folder"
}
I was originally using version 1.28.0, but after encountering this, I tried upgrading to 1.43.0 and still have the same issue.
If I put API keys directly into the "auth" fields of my providers, a plan at least works.
Update: if I invoke terraform with the -refresh=false flag my apply works.
So to recap:
- This is an existing deployment which has worked without issue previously.
- When trying to generate a plan I receive an "Error: the Grafana client is required for 'grafana_folder'. Set the auth and url provider attributes" error.
- I can run terraform apply -refresh=false successfully.
- I can also run terraform plan -refresh=false successfully.
- Even after a successful apply, I continue to get auth errors any time I do not use the -refresh=false flag.
Getting this too. From grafana_folder
Need to test whether you can use the provider fine as long as you don't create any folders.
the refresh=false workaround won't work long term for us as we can't use that in CI
Also seems to be present when using the grafana_folder data source
@julienduchesne hey! let me know if i can help you out with this - causing us a fair amount of hell in CI/CD
This issue is hard to remediate because in all the cases I've managed to reproduce, it's always that either the auth or URL are missing (as the message says). If I removed the error, you'd instead get a 401 error.
Here's an example: Folders and dashboards are managed by a service account token. That token is removed Grafana side. Terraform, on read, removes the token from state and so there's no auth anymore. The error triggers.
An ugly fix could be setting a depends_on condition for all resources that depend on previous resources for auth. For example, folders and dashboards would have a depends_on condition on the service account token resource that creates the auth used in their provider
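A minimal sketch of that depends_on idea, assuming the auth comes from a service account token created in the same configuration. The resource names, provider aliases, and variables are all hypothetical. Note that depends_on only orders resource operations and may not change how the provider itself gets configured, so this is a possible mitigation rather than a guaranteed fix.

```hcl
# Sketch only: names, aliases and variables below are hypothetical.
variable "grafana_url" { type = string }
variable "bootstrap_auth" {
  type      = string
  sensitive = true
}

# Provider instance used only to create the service account and token.
provider "grafana" {
  alias = "bootstrap"
  url   = var.grafana_url
  auth  = var.bootstrap_auth
}

resource "grafana_service_account" "terraform" {
  provider = grafana.bootstrap
  name     = "terraform"
  role     = "Admin"
}

resource "grafana_service_account_token" "terraform" {
  provider           = grafana.bootstrap
  name               = "terraform"
  service_account_id = grafana_service_account.terraform.id
}

# Provider instance whose auth comes from the token created above.
provider "grafana" {
  alias = "resources"
  url   = var.grafana_url
  auth  = grafana_service_account_token.terraform.key
}

resource "grafana_folder" "example" {
  provider = grafana.resources
  title    = "example"

  # Explicit dependency on the resource that produces the provider's auth.
  depends_on = [grafana_service_account_token.terraform]
}
```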
Hi @julienduchesne - gave depends_on a go but i still get this error
Why would the service account token be removed from inside grafana? (we don't touch them...) i do see that my token is an expired one - so maybe that could be the root cause?
Could this be an issue with API keys migrating to service accounts?
I'm using Grafana cloud and can confirm that in some environments the "importer" key we create still shows up under API keys, but in other deployments the API key tab is gone, and there is only a service accounts tab. In environments that only have the "service accounts" tab, it looks like the previous "importer" API key was upgraded.
Perhaps the root cause of this issue is provisioning an API key and then later doing the in-browser upgrade to service accounts? Once that latter step has happened, the deployments break since the API keys no longer exist?
We've only ever used service accounts. (only deployed this infra recently)
This issue is essentially https://discuss.hashicorp.com/t/depends-on-in-providers/42632.
Github issues:
Not sure there's anything we can do here. If a resource is being planned by a provider instance for which the auth is not in the state anymore (for any of many Terraform reasons), it is going to fail because Terraform provides an empty string for the auth
Users can get around that in a few ways:
- Using multiple projects for their definitions. One project/dir has the service account and the other project/dir has the resources being applied with that SA. The token can be read across projects through outputs or with an orchestration system like terragrunt
- Doing targeted (-target) plans and applies of the SA (and tokens) before doing a full terraform plan
- Using terraform plan -refresh=false and terraform apply -refresh=false but that gets rid of one of the main features of TF which is drift reconciliation
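The first option (separate projects) could look roughly like the sketch below, here using the terraform_remote_state data source with a local backend. The directory names, output names, and the token resource referenced in project A are assumptions for illustration; any backend or an orchestration tool could play the same role.

```hcl
# --- Project A (hypothetical dir ./grafana-auth): owns the credentials ---
# Assumes a variable grafana_url and a token resource named
# grafana_service_account_token.terraform are defined in this project.
output "grafana_url" {
  value = var.grafana_url
}

output "grafana_token" {
  value     = grafana_service_account_token.terraform.key
  sensitive = true
}

# --- Project B (hypothetical dir ./grafana-resources): uses them ---
data "terraform_remote_state" "grafana_auth" {
  backend = "local"
  config = {
    path = "../grafana-auth/terraform.tfstate"
  }
}

provider "grafana" {
  url  = data.terraform_remote_state.grafana_auth.outputs.grafana_url
  auth = data.terraform_remote_state.grafana_auth.outputs.grafana_token
}

resource "grafana_folder" "example" {
  title = "example"
}
```

With this split, project B never plans a Grafana resource in the same run that creates or refreshes the token, so the provider always receives a concrete value for auth.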
OK. Is it possible that the token expiring is what takes it out of the state? I can't think of a reason that it'd leave our state otherwise.
If it is that, then I would imagine there is a case for the Grafana provider handling the has_expired boolean and forcing recreation, rather than removing the token from the state.
I'm getting this on a brand new, un-applied workspace with no data lookups. I'm not making sense of this.
I create a grafana_team & grafana_folder and then get this error when trying grafana_folder_permissions_item
This is a different one @NickAdolf. Here it is: https://github.com/grafana/terraform-provider-grafana/issues/1485. It will be fixed in next release. Sorry about that!
We are also facing this issue. It seems to be blocked by https://github.com/hashicorp/terraform/issues/2430, but are there other workarounds?
This issue is persisting for me on:
data "grafana_dashboard" "this" {
  uid = var.uid
}
Was having the same issue until I solved it by adding an alias to the provider and referencing it in the resource:
provider "grafana" {
  alias = "bare-metal" # <--- ADD THIS
  url   = "https://YOUR_ENDPOINT"
  auth  = var.grafana_service_account_token # generated manually via Grafana UI
}
resource "grafana_data_source" "loki" {
  is_default = true
  provider   = grafana.bare-metal # <--- ADD THIS
  type       = "loki"
  name       = "Loki"
  url        = "http://loki-gateway.monitoring.svc.cluster.local"
  lifecycle {
    ignore_changes = [json_data_encoded, http_headers]
  }
}
Resolution/workaround: one comment (from one of the maintainers) clarifies that the root cause is indeed Terraform's lack of support for dependencies on provider attributes.
Before the issue arose, the team had created a service account, a token and a Grafana dashboard (Amazon Managed Grafana) without problems. (I guess that on the first plan and apply, the Grafana provider may behave differently, or be more robust against the missing provider dependency, since no resource exists yet?) The issue arose once the token expired after 30 days (as per configuration).
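A setup of that kind might look like the sketch below. This is an assumed reconstruction, not the commenter's code: the variables are hypothetical, and the resource and argument names follow the AWS provider's Managed Grafana service account resources as best understood, so they should be checked against the provider documentation.

```hcl
# Assumed setup; verify resource and argument names before use.
variable "grafana_workspace_id" { type = string } # hypothetical
variable "grafana_url" { type = string }          # hypothetical

resource "aws_grafana_workspace_service_account" "terraform" {
  name         = "terraform"
  grafana_role = "ADMIN"
  workspace_id = var.grafana_workspace_id
}

resource "aws_grafana_workspace_service_account_token" "terraform" {
  name               = "terraform"
  service_account_id = aws_grafana_workspace_service_account.terraform.service_account_id
  seconds_to_live    = 2592000 # 30 days; the key stops working afterwards
  workspace_id       = var.grafana_workspace_id
}

# The Grafana provider's auth depends on a token that will expire.
provider "grafana" {
  url  = var.grafana_url
  auth = aws_grafana_workspace_service_account_token.terraform.key
}
```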
Resolution steps:
- Upgrade the Grafana provider (to v3.15.3) and the FOSS module in use (terraform-aws-modules/managed-service-grafana/aws, to 2.2.0) to their latest versions.
- Remove the relevant stale resources (the expired service account/token and downstream dependent resources), both from the infrastructure and from the TF state (direct removal from state).
- Run two consecutive plan and apply cycles: the first to get a new service account and token, the second for the provider to use the new credential for creating any arbitrary Grafana provider resource.
Likely the issue arises after the configured expiry time of the token associated with the service account. Repeating steps 2 and 3 should help you recover again. Unfortunately, you may have to deal with data loss of your Grafana resources, or try to integrate your backup/restore procedures into the above steps.
(It's important to note that in this scenario we had a quite large monolithic root stack with the eks, helm, aws and grafana providers active. Carving out just the Grafana resources and modules into their own root stack would have been too invasive to the current design and CI/CD implementation.)
I'm going to close this as it is an issue with Terraform itself and not the provider.
https://github.com/hashicorp/terraform-provider-aws/issues/43465 would solve this as well