terraform-provider-azurerm
Application Gateway request_routing_rule order change even with Azurerm 3.0.2
Community Note
- Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
- Please do not leave "+1" or "me too" comments, they generate extra noise for issue followers and do not help prioritize the request
- If you are interested in working on this issue or have submitted a pull request, please leave a comment
Terraform (and AzureRM Provider) Version
Terraform v1.1.7 on windows_amd64
- provider registry.terraform.io/hashicorp/azurerm v3.0.2
Affected Resource(s)
- azurerm_application_gateway
Terraform Configuration Files
dynamic "request_routing_rule" {
  for_each = var.application_gateway_request_routing_rule
  content {
    http_listener_name          = request_routing_rule.value["http_listener_name"]
    name                        = request_routing_rule.value["name"]
    redirect_configuration_name = request_routing_rule.value["redirect_configuration_name"]
    rule_type                   = request_routing_rule.value["rule_type"]
    backend_address_pool_name   = request_routing_rule.value["backend_address_pool_name"]
    backend_http_settings_name  = request_routing_rule.value["backend_http_settings_name"]
    url_path_map_name           = request_routing_rule.value["url_path_map_name"]
  }
}
{
  http_listener_name          = "HTTP-DEV-XXX-LISTENER"
  name                        = "XXX-DEV-HTTPS-REDIRECT-RULE"
  redirect_configuration_name = "XXX-DEV-HTTPS-REDIRECT"
  rule_type                   = "Basic"
  backend_address_pool_name   = null
  backend_http_settings_name  = null
  url_path_map_name           = null
},
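For context, the dynamic block above iterates over var.application_gateway_request_routing_rule, and the object shown is one element of it. A minimal sketch of the variable declaration, reconstructed from the attribute names used in the block (the exact type and default are assumptions):

variable "application_gateway_request_routing_rule" {
  # Assumed shape: one object per routing rule, with null for unused attributes.
  type = list(object({
    name                        = string
    rule_type                   = string
    http_listener_name          = string
    redirect_configuration_name = string
    backend_address_pool_name   = string
    backend_http_settings_name  = string
    url_path_map_name           = string
  }))
  default = []
}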
Debug Output
- request_routing_rule {
- http_listener_id = "/subscriptions/xxx/resourceGroups/xxx/providers/Microsoft.Network/applicationGateways/xxx/httpListeners/HTTP-DEV-XXX-LISTENER" -> null
- http_listener_name = "HTTP-DEV-XXX-LISTENER" -> null
- id = "/subscriptions/xxx/resourceGroups/xxx/providers/Microsoft.Network/applicationGateways/xxx/requestRoutingRules/XXX-DEV-HTTPS-REDIRECT-RULE" -> null
- name = "XXX-DEV-HTTPS-REDIRECT-RULE" -> null
- priority = 0 -> null
- redirect_configuration_id = "/subscriptions/xxx/resourceGroups/xxx/providers/Microsoft.Network/applicationGateways/xxx/redirectConfigurations/XXX-DEV-HTTPS-REDIRECT" -> null
- redirect_configuration_name = "XXX-DEV-HTTPS-REDIRECT" -> null
- rule_type = "Basic" -> null
}
+ request_routing_rule {
+ http_listener_id = "/subscriptions/xxx/resourceGroups/xxx/providers/Microsoft.Network/applicationGateways/xxx/httpListeners/HTTP-DEV-XXX-LISTENER"
+ http_listener_name = "HTTP-DEV-XXX-LISTENER"
+ id = "/subscriptions/xxx/resourceGroups/xxx/providers/Microsoft.Network/applicationGateways/xxx/requestRoutingRules/XXX-DEV-HTTPS-REDIRECT-RULE"
+ name = "XXX-DEV-HTTPS-REDIRECT-RULE"
+ redirect_configuration_id = "/subscriptions/xxx/resourceGroups/xxx/providers/Microsoft.Network/applicationGateways/xxx/redirectConfigurations/XXX-DEV-HTTPS-REDIRECT"
+ redirect_configuration_name = "XXX-DEV-HTTPS-REDIRECT"
+ rule_type = "Basic"
}
Panic Output
Expected Behaviour
No changes should be detected. I also tried running terraform apply, and it does somehow "modify" the application gateway, but if I run terraform plan again I still get the same diff.
Actual Behaviour
Terraform tries to change only the order of the request_routing_rules (all of them; I provided just one sample output since we have many on this app gateway). It keeps happening even after a terraform apply.
Steps to Reproduce
- Configure request_routing_rule using dynamic blocks in an application gateway, as per the code example above
- terraform plan - you will see the attempted change
- terraform apply - Terraform will apply the change even though there is no difference
- terraform plan - Terraform still tries to make the same changes
Important Factoids
Some time ago, before this issue was known, I created a new application gateway from scratch and the issue was not there. After some months and various changes it appeared again, seemingly at random, and never went away. I don't know what is causing it, but we have it on 3 different application gateways and can't get rid of it.
References
- #6896 - marked as fixed but I still see this issue with the latest version of azurerm
Just wanted to add that on another application gateway, the issue is on backend_http_settings, so I assume this is happening randomly across all the blocks. I'm not sure why one block or another is specifically affected in each separate app gateway. For this specific case there was actually a difference in the HTTP settings, and once I fixed it (in the code only, by aligning it to what was in the portal) and ran the plan again, no infrastructure changes were detected.
I still believe that, with the new version of azurerm, I should have seen only the change in the one differing backend_http_settings block, not the removal and re-addition of all of them.
I am still observing this issue in the probe and backend_http_settings blocks with the 3.0.2 provider. A fix is also required to skip the ordering change for the path_rule block under url_path_map.
Hey @owaisaamir, do you mind posting the application gateway configuration you're using that is causing a plan diff?
Hi @mbfrahry, I am using dynamic blocks with sets for the probe, backend_http_settings and path_rule blocks. Below is the sample configuration used.
locals {
az_app_gw_health_probes = ["a", "b", "c"]
backend_address_pool = {
"a" = ["X", "Y"]
"b" = ["W", "V"]
}
backend_address_pool_to_use = "a"
}
resource "azurerm_application_gateway" "app_gw" {
name = "test"
resource_group_name = azurerm_resource_group.example.name
location = azurerm_resource_group.example.location
zones = [1, 2, 3]
sku {
name = "Standard_v2"
tier = "Standard_v2"
}
autoscale_configuration {
min_capacity = 1
max_capacity = 5
}
gateway_ip_configuration {
name = "appGatewayIpConfig"
subnet_id = azurerm_subnet.app_gw_subnet.id
}
frontend_port {
name = "https"
port = 443
}
frontend_ip_configuration {
name = "appGwPublicFrontendIp"
public_ip_address_id = azurerm_public_ip.app_gw_ip.id
}
dynamic "backend_address_pool" {
for_each = local.backend_address_pool
content {
name = backend_address_pool.key
fqdns = backend_address_pool.value
}
}
dynamic "probe" {
for_each = toset(local.az_app_gw_health_probes)
content {
name = probe.key
port = 443
protocol = "Https"
path = format("/%s", probe.key)
match {
body = ""
status_code = ["200"]
}
interval = 60
timeout = 5
unhealthy_threshold = 1
pick_host_name_from_backend_http_settings = true
}
}
dynamic "backend_http_settings" {
for_each = toset(local.az_app_gw_health_probes)
content {
name = backend_http_settings.key
port = 443
protocol = "Https"
probe_name = backend_http_settings.key
pick_host_name_from_backend_address = true
request_timeout = 20
connection_draining {
enabled = true
drain_timeout_sec = 120
}
cookie_based_affinity = "Disabled"
}
}
ssl_policy {
policy_type = "Predefined"
policy_name = "AppGwSslPolicy20170401S"
}
ssl_profile {
name = "TLS_1_2"
ssl_policy {
policy_type = "Predefined"
policy_name = "AppGwSslPolicy20170401S"
}
}
identity {
type = "UserAssigned"
identity_ids = [azurerm_user_assigned_identity.app_gw_user.id]
}
ssl_certificate {
name = "cert"
key_vault_secret_id = azurerm_key_vault_certificate.cert.versionless_secret_id
}
http_listener {
name = "main"
protocol = "Https"
frontend_ip_configuration_name = "appGwPublicFrontendIp"
frontend_port_name = "https"
ssl_certificate_name = "cert"
ssl_profile_name = "TLS_1_2"
}
url_path_map {
name = "mypath"
default_backend_address_pool_name = local.backend_address_pool_to_use
default_backend_http_settings_name = local.az_app_gw_health_probes[0]
dynamic "path_rule" {
for_each = toset(local.az_app_gw_health_probes)
content {
name = path_rule.key
paths = [
format("/%s", path_rule.key)
]
backend_address_pool_name = local.backend_address_pool_to_use
backend_http_settings_name = path_rule.key
}
}
}
request_routing_rule {
name = "mypath"
rule_type = "PathBasedRouting"
http_listener_name = "main"
url_path_map_name = "mypath"
}
}
Adding an element to local.az_app_gw_health_probes causes the ordering plan diff.
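Concretely, the reproduction is just extending the local list from the configuration above, for example:

locals {
  az_app_gw_health_probes = ["a", "b", "c", "d"] # adding "d" is enough to trigger the ordering diff
}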
Thanks for that info @owaisaamir! Just to confirm, this is a different problem than the original issue of seeing changes without any changes being made to the config? What you're describing is seeing changes across all the backend_http_settings blocks when you're just looking to add one new one?
If that's the case, then that's just an unfortunate consequence of moving many of the attributes in application gateway from a List to a Set. We've traded the ordering issues that were causing many problems for a much noisier plan when adding new blocks.
Hey @Nyxbiker, what was your configuration for application gateway and what did you have to do to your config to get it to line up with the portal? My first thought is that we're not generating the Hash for backend_http_settings correctly, so it'd be useful to see which attributes you had to modify to prevent a plan from occurring.
Thanks for that info @owaisaamir! Just to confirm, this is a different problem than the original issue of seeing changes without any changes being made to the config? What you're describing is seeing changes across all the backend_http_settings blocks when you're just looking to add one new one? If that's the case, then that's just an unfortunate consequence of moving many of the attributes in application gateway from a List to a Set. We've traded the ordering changes that were causing issues for many for a much noisier plan when adding new blocks.
This is painful when I have a lot (currently 20-25) of backend_http_settings and probes that get disturbed. It adds uncertainty about the changes that will actually be made and can cause panic while planning updates that are not supposed to cause any connection failures.
I hate to hear that those changes are causing uncertainty for your configuration, but the alternative is permanent diffs whenever the ordering in the config differs from what Azure returns from the API, which was happening often.
This is an issue being tracked by the Terraform proper team but there haven't been any proposed solutions. https://github.com/hashicorp/terraform/issues/28281
Because this is a separate issue from the original, I'm going to mark this conversation as off topic but I encourage you to make an issue on this repo, or better yet, on the Terraform repo so the Core team can try and come up with a solution for you.
I can confirm that the rules are indeed not fixed. I added one of each of the following blocks (in the middle of about 10 configurations, as order should not matter anymore according to #6896), using azurerm 3.0.2 and Terraform 1.1.7. Here are the results:
Type | Result
--- | ---
backend_address_pool | ✔ works, only one change detected
backend_http_settings | ❌ does NOT work, all settings deleted and readded
http_listener | ❌ does NOT work, all settings deleted and readded
probe | ✔ works, only one change detected
redirect_configuration | ❌ does NOT work, all settings deleted and readded
request_routing_rule | ❌ does NOT work, all settings deleted and readded
url_path_map | ❌ does NOT work, all settings deleted and readded
So, all in all, only backend address pools and probes now seem to produce a clean output; everything else still acts as if everything were deleted and readded in a different order.
If this information helps: We are using dynamic blocks (which seems to be the common use case when applying proxy rules via configuration).
If the issue cannot be tackled using sets - wouldn't the API allow us to create separate resources for the routing configuration? In most scenarios, rules for one central gateway are added from different, distributed projects anyway (much like API Management, which is usually also managed in a decentralized way).
We were really relying on this being resolved by using sets, as I have multiple customers who are dreading the planned changes of multiple thousands of lines (not exaggerated) when a simple rule is added to their central AGWs in production.
Hi @johannespetereit, unfortunately what you're seeing is a separate issue from what is being reported here, where diffs appear even though no changes have been made to the configuration. With that in mind, I'm going to mark this conversation as off-topic.
This issue you're seeing is being tracked on the Terraform proper repo and I encourage you to make noise there https://github.com/hashicorp/terraform/issues/28281 or open a separate issue to track what's going on.
@mbfrahry thanks for your reply. I think I'm grasping the issue, but I also think that many, many customers were waiting for the advertised fix in 3.0. I also realize that another attempt will probably not happen until the next major update of this provider, which is far, far away, and that is a bit frustrating. I will ask for the original issue to be reopened. In my view, this has no way of moving forward on its own: azurerm is a downstream API of Terraform, and the Azure API is upstream of azurerm. I totally understand Terraform categorizing this as a minor optimization for the Terraform team; the azurerm provider is in charge of the core logic (getting the current state and providing the planned state, while Terraform only supplies a diff). In my experience it is not helpful to hope that an upstream API will change on account of a single downstream provider having difficulties getting its API to comply with the contract, and I don't think this paradigm will shift because of "community pressure" from a single provider. We will therefore start looking into alternatives that are still in the "ARM days" history of our repos.
Hey @Nyxbiker, what was your configuration for application gateway and what did you have to do to your config to get it to line up with the portal? My first thought is that we're not generating the Hash for backend_http_settings correctly, so it'd be useful to see which attributes you had to modify to prevent a plan from occurring.
Hi @mbfrahry, the config for backend_http_settings is also a dynamic block as below:
dynamic "backend_http_settings" {
  for_each = var.application_gateway_backend_http_settings
  content {
    name                                = backend_http_settings.value["name"]
    host_name                           = backend_http_settings.value["host_name"]
    cookie_based_affinity               = backend_http_settings.value["cookie_based_affinity"]
    affinity_cookie_name                = backend_http_settings.value["affinity_cookie_name"]
    pick_host_name_from_backend_address = backend_http_settings.value["pick_host_name_from_backend_address"]
    port                                = backend_http_settings.value["port"]
    protocol                            = backend_http_settings.value["protocol"]
    probe_name                          = backend_http_settings.value["probe_name"]
    path                                = backend_http_settings.value["path"]
    trusted_root_certificate_names      = backend_http_settings.value["trusted_root_certificate_names"]
    request_timeout                     = backend_http_settings.value["request_timeout"]
  }
}
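For completeness, a minimal sketch of the variable this dynamic block iterates over, reconstructed from the attribute names above (the exact types and default are assumptions, which matters here since these values feed the set hash):

variable "application_gateway_backend_http_settings" {
  # Assumed shape: one object per backend HTTP setting.
  type = list(object({
    name                                = string
    host_name                           = string
    cookie_based_affinity               = string
    affinity_cookie_name                = string
    pick_host_name_from_backend_address = bool
    port                                = number
    protocol                            = string
    probe_name                          = string
    path                                = string
    trusted_root_certificate_names      = list(string)
    request_timeout                     = number
  }))
  default = []
}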
And I basically just noticed that in the Azure Portal we had some settings with cookie affinity enabled, so I proceeded to align these properties in the code by changing "cookie_based_affinity" to "Enabled" and "affinity_cookie_name" to the cookie name that was set in the Portal.
I think that this specific issue is related to what you were discussing above with johannespetereit and owaisaamir though, and I agree with them that this is a huge issue: especially in big configurations (we have 51 request routing rules in a single app gateway) it becomes very difficult to figure out which small property difference is causing the huge Terraform output.
Coming back to the original issue, I noticed in the output that Terraform shows a property "- priority = 0 -> null" in the request_routing_rule that should be "removed".
So I thought I should maybe add "priority = 0" to the request routing rules, since Terraform might see it as a difference (although it's marked as an optional property in the docs) and cause the huge output, but I then get "Error: expected request_routing_rule.49.priority to be in the range (1 - 20000), got 0", so I couldn't test it.
I'm not sure if this is related to the issue I'm currently experiencing though, because I don't have this issue in 1 of the 3 application gateways (which all use the same parent module with dynamic blocks).
Please note that the initial example is just one sample routing rule; I have this removal-and-addition issue for all of the request_routing_rules in the affected application gateways. I checked whether there were any differences between the code and the portal and couldn't find any. Also, even after running terraform apply (which should bring everything back in line), I still have the issue after running terraform plan again, which makes me think that this "priority" property in the output might be causing it (but we don't have it set either in the code or in the portal). I also tried to ignore the "priority" property in request_routing_rules to see if it would fix the issue, but I can't, since lifecycle ignore_changes does not support splat expressions etc.
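For reference, the coarser form that lifecycle ignore_changes does accept is the whole request_routing_rule attribute rather than a single nested field. A minimal sketch, assuming the trade-off of also ignoring intentional rule changes is acceptable as a stop-gap:

resource "azurerm_application_gateway" "app_gw" {
  # ... existing configuration ...

  lifecycle {
    # Ignores all drift in request routing rules, including intentional edits,
    # so this only hides the noisy plan rather than fixing it.
    ignore_changes = [request_routing_rule]
  }
}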
Hi there. I observe the same issue. There is no way to add a new routing rule, listener, etc. without maintenance and downtime, and it is very complicated. I tried azurerm versions 2.98.0 and 3.2; both have this issue.
Hi all, I experience the same issue when updating an app gateway with a new configuration (with Terraform v1.1.9 and azurerm v3.8.0). It would be better to enable new configuration through a child resource attached to the parent resource (the gateway), similar to the way a new certificate or secret is added to an existing key vault resource. That would allow a separation between Terraform projects: one that maintains the creation of core (shared) resources (like the key vault and app gateway), and others that maintain the creation of web apps/VMs etc. in the backend. That would also spare us from using dynamic blocks.
@johannespetereit @Nyxbiker @mbfrahry
Do I understand correctly that we have just given up on this? What more can we do to move this forward? @johannespetereit - btw, thanks for the time you invested in analyzing the problem.
Having the same issue after migrating from azurerm 2.x to 3.x.
We have dynamic http_listener, request_routing_rule and backend_http_settings blocks, and every time the application gateway gets a new/random order of these items :/
Could somebody please explain the current state of this issue, as our Terraform plans can be thousands of lines when we use AppGW. Should we look for alternatives for the time being if this is currently unfixable, or is it still being assessed?
Personally, I can't wait for the "coming soon" version, so I have gone for a drastic alternative: I am developing my own Go provider that allows me to add new settings (listener, backend, rules, cert, etc.) as separate resources on an existing app gateway (referenced as a data source) defined in a core project with the azurerm provider. Currently it works for the HTTP backend; I will add the other settings when I find some free time. I also faced the 429 error (retry later) when calling the Azure API several times in parallel.
Update: this is the current version of the provider I have implemented: https://registry.terraform.io/providers/Citeo/azurermagw/0.3.0. I have the same repo on my own GitHub, but the latest version is in Citeo's GitHub (where I currently work).
Not really comfortable posting this as a workaround, but maybe it helps someone:
We automated our rules/backends/listeners etc. into variables. Furthermore, we apply heavy transformation on the properties using locals (the module's input variables are high-level variables, leaning on the k8s ingress definitions; from these inputs we generate the actual AGW properties in the format AGW consumes them in).
With that architecture, not being able to check the end result in the plan is one of our main concerns with this bug.
To get a readable version of the plan, we also push all these changes into a storage table (one table row per property). The table rows provide a meaningful diff in the plan (this is only a text diff and doesn't show thousands of changes, just the actual ones, as long as you keep the order fixed):
resource "azurerm_storage_table_entity" "agw_settings" {
for_each = { for key, config in {
http_listeners = local.http_listeners
backend_address_pools = local.backend_address_pools
redirect_configurations = local.redirect_configurations
probes = local.probes
backend_http_settings = local.backend_http_settings
request_routing_rules = local.request_routing_rules
url_path_maps = local.url_path_maps
} : key => config }
storage_account_name = data.azurerm_storage_account.config_storage.name
table_name = azurerm_storage_table.rules_table.name
partition_key = "main"
row_key = each.key
entity = {
value = jsonencode(each.value)
}
}
This can at least give you an idea what is changing before you approve your plan.
We actually write the properties into storage for a second reason: we use a null_resource PS Script which performs all the configuration on AGW using the storage table as source (so we moved away from terraform for the routing configuration). Sadly this script is owned by the customer, so I can't post it here. Using a custom script has the advantage that we can have ignore_property on all rules in AGW, so we know that nothing gets messed up there. But I can't really say I recommend building a custom script, as a clean update (=only update real changes) comes with a massive code overhead, and is error prone.
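The script itself cannot be shared, but the wiring is roughly the following shape; the script name, its parameters and the trigger hash are hypothetical placeholders, not the actual implementation:

resource "null_resource" "agw_config" {
  # Re-run whenever any of the serialized AGW settings change.
  triggers = {
    settings_hash = sha256(jsonencode({
      http_listeners        = local.http_listeners
      request_routing_rules = local.request_routing_rules
      # ... remaining locals from the storage table entity above ...
    }))
  }

  provisioner "local-exec" {
    # Hypothetical PowerShell script that reads the storage table and applies the AGW configuration.
    command     = "./Apply-AgwConfig.ps1 -StorageAccount ${data.azurerm_storage_account.config_storage.name} -Table ${azurerm_storage_table.rules_table.name}"
    interpreter = ["pwsh", "-Command"]
  }
}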
Any updates on this issue?
I implemented a provider to overcome such behaviors: https://registry.terraform.io/providers/Citeo/azurermagw/0.3.0. I am not a Golang expert, but I have done my best as a DevOps engineer :). Currently I use it in a dev environment and it works fine. If you can test it and give feedback, I'll be grateful.
Is anyone working on resolving this issue? I am contemplating using PowerShell to parse the plan output here, just to get some reasonable plan results. Has anyone successfully parsed the terraform plan output to calculate the actual changes? It might save some time if someone could share a script I can start with :-)
This is a huge issue and I am surprised it doesn't get more attention. Either this is not a priority to fix, or it is nearly impossible to fix.
For now I will try to parse the plan output, and maybe find some workarounds myself.
@torivara I went through the same phase - I was surprised about it, tried to be active and understand whether this is really the state of things or I had just missed the point. The outcome was: it is the real state. Still.
People are basically doing workarounds outside of Terraform. This is too critical a resource to rely on such messy plans in production.
Peter
@mbfrahry how can we approach a fix for the time being? Currently this resource is unusable, because it deletes all request_routing_rule blocks and recreates them afterwards in a non-atomic operation, leading to downtime of the services connected to the AGW.
@mbfrahry how can we approach a fix for the time being? Currently this resource is unusable, because it deletes all request_routing_rule blocks and recreates them afterwards in a non-atomic operation, leading to downtime of the services connected to the AGW.
Are you sure that leads to downtime? I observe a lot of changes in the plan, but the Azure API seems to handle them through an "update" rather than a "drop and create". Did you observe different behaviour?
Looking at the REST API, this is the only way it could work. There is only one create-or-update method, so you can't delete request routing rules in isolation; azurerm just PUTs the complete updated configuration of the instance to the API, and Azure acts accordingly.
Hi all, I am facing issues with the application gateway: after importing it, request_routing_rule, http_listener and backend_http_settings blocks are getting recreated. Is this related to this issue, or should I create another ticket for it? Terraform version: 1.1.1, azurerm provider version: v3.35.0.
Not sure if it helps anyone, but I had this issue and discovered that, after a recent manual certificate update, Terraform suddenly started caring whether a priority was set on the request routing rules. As soon as I set the priority, my request routing rules were sorted out.
Terraform v0.12.31
- provider.azurerm v2.99.0
Had comparable issues. After applying, a new plan indicated recreation of the request_routing_rule, and I had no idea why. I updated to the latest Terraform and azurerm provider, to no avail (but that is always a good start :) ).
After hours of debugging, I looked at the state file and found that the 'url_path_map_name' is empty, while on every apply I set it (and it came back as a successful change). But when you look at the state file, no name was set. I cleared the name in the config (url_path_map_name = "") and ran a plan / apply. Only a single change was applied, and that block is no longer re-applied.
Hope this is helpful to anyone.
Had the same issue with recreation of request_routing_rule in the application gateway. We upgraded the provider to the newest version and added a priority to every request_routing_rule; after that, the application gateway no longer shows changes.
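For anyone applying the same fix, a minimal sketch of adding priority to the dynamic block from the original report. Deriving the value from the list index is only one possible approach (it assumes each rule needs its own value; the 1 - 20000 range comes from the validation error quoted earlier):

dynamic "request_routing_rule" {
  for_each = var.application_gateway_request_routing_rule
  content {
    name               = request_routing_rule.value["name"]
    rule_type          = request_routing_rule.value["rule_type"]
    http_listener_name = request_routing_rule.value["http_listener_name"]
    # ... remaining attributes as in the original block ...

    # Hypothetical: derive a unique priority from the element index (10, 20, 30, ...).
    priority = (request_routing_rule.key + 1) * 10
  }
}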
I have the same happening in http_listener. Has anyone faced this?