terraform-provider-azurerm
HTTP response was nil; connection may have been reset
Is there an existing issue for this?
- [X] I have searched the existing issues
Community Note
- Please vote on this issue by adding a :thumbsup: reaction to the original issue to help the community and maintainers prioritize this request
- Please do not leave "+1" or "me too" comments, they generate extra noise for issue followers and do not help prioritize the request
- If you are interested in working on this issue or have submitted a pull request, please leave a comment and review the contribution guide to help.
Terraform Version
1.6.6
AzureRM Provider Version
3.86.0
Affected Resource(s)/Data Source(s)
azurerm_batch_account
Terraform Configuration Files
terraform {
  required_providers {
    azurerm = {
      source  = "hashicorp/azurerm"
      version = "3.86.0"
    }
  }
}

provider "azurerm" {
  features {}
}

locals {
  name = "batch812834"
}

### Group ###
resource "azurerm_resource_group" "default" {
  name     = "rg-batch"
  location = "eastus2"
}

### Storages ###
resource "azurerm_storage_account" "autostorage" {
  name                     = "st${local.name}autostg"
  resource_group_name      = azurerm_resource_group.default.name
  location                 = azurerm_resource_group.default.location
  account_tier             = "Standard"
  account_replication_type = "LRS"
}

### Batch Account ###
resource "azurerm_batch_account" "default" {
  name                                 = local.name
  resource_group_name                  = azurerm_resource_group.default.name
  location                             = azurerm_resource_group.default.location
  public_network_access_enabled        = true
  storage_account_id                   = azurerm_storage_account.autostorage.id
  storage_account_authentication_mode  = "BatchAccountManagedIdentity"
  pool_allocation_mode                 = "BatchService"

  identity {
    type = "SystemAssigned"
  }

  network_profile {
    account_access {
      default_action = "Allow"
    }
    node_management_access {
      default_action = "Allow"
    }
  }

  # Uncomment to trigger the error after the batch account is created
  # tags = {
  #   environment = "dev"
  # }
}
Debug Output/Panic Output
When I turn on `TF_LOG=DEBUG`, the error does not happen.
The problem started on `3.86.0`; version `3.85.0` works fine.
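For anyone else trying to capture a trace of a failing run, a minimal sketch (TF_LOG and TF_LOG_PATH are standard Terraform environment variables; the log file path here is arbitrary):

# Write the full debug trace to a file instead of the terminal
TF_LOG=DEBUG TF_LOG_PATH=./tf-debug.log terraform apply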
Expected Behaviour
Changes to the Batch account resource should be applied correctly.
Actual Behaviour
When changing the Batch account resource with Terraform, such as adding a tag or adding a pool, the apply fails.
╷
│ Error: updating Batch Account (Subscription: "00000000-0000-0000-0000-000000000000"
│ Resource Group Name: "rg-batch"
│ Batch Account Name: "batch812834"): Patch "https://management.azure.com/subscriptions/00000000-0000-0000-0000-000000000000/resourceGroups/rg-batch/providers/Microsoft.Batch/batchAccounts/batch812834?api-version=2023-05-01": HTTP response was nil; connection may have been reset
│
│ with azurerm_batch_account.default,
│ on main.tf line 48, in resource "azurerm_batch_account" "default":
│ 48: resource "azurerm_batch_account" "default" {
Steps to Reproduce
1. terraform apply
2. Change the Batch account resource code (for example, add a tag or a pool; see the sketch below)
3. terraform apply
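For step 2, the smallest change that reproduces the failure is uncommenting the tags block already present in the configuration above:

resource "azurerm_batch_account" "default" {
  # ... all other arguments unchanged from the configuration above ...

  tags = {
    environment = "dev"
  }
}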
Important Factoids
No response
References
No response
Hi @epomatti, thanks for raising this issue. I am trying to reproduce this on my local machine. It works fine in my experiment with your template above. Here is the output:
terraform apply -auto-approve
Terraform used the selected providers to generate the following execution plan. Resource actions are indicated with the following symbols:
+ create
Terraform will perform the following actions:
# azurerm_batch_account.default will be created
+ resource "azurerm_batch_account" "default" {
+ account_endpoint = (known after apply)
+ allowed_authentication_modes = (known after apply)
+ id = (known after apply)
+ location = "eastus2"
+ name = "batch812834"
+ pool_allocation_mode = "BatchService"
+ primary_access_key = (sensitive value)
+ public_network_access_enabled = true
+ resource_group_name = "yunliughissuebatch"
+ secondary_access_key = (sensitive value)
+ storage_account_authentication_mode = "BatchAccountManagedIdentity"
+ storage_account_id = (known after apply)
+ identity {
+ principal_id = (known after apply)
+ tenant_id = (known after apply)
+ type = "SystemAssigned"
}
+ network_profile {
+ account_access {
+ default_action = "Allow"
}
+ node_management_access {
+ default_action = "Allow"
}
}
}
# azurerm_resource_group.default will be created
+ resource "azurerm_resource_group" "default" {
+ id = (known after apply)
+ location = "eastus2"
+ name = "yunliughissuebatch"
}
# azurerm_storage_account.autostorage will be created
+ resource "azurerm_storage_account" "autostorage" {
+ access_tier = (known after apply)
+ account_kind = "StorageV2"
+ account_replication_type = "LRS"
+ account_tier = "Standard"
+ allow_nested_items_to_be_public = true
+ cross_tenant_replication_enabled = true
+ default_to_oauth_authentication = false
+ enable_https_traffic_only = true
+ id = (known after apply)
+ infrastructure_encryption_enabled = false
+ is_hns_enabled = false
+ large_file_share_enabled = (known after apply)
+ location = "eastus2"
+ min_tls_version = "TLS1_2"
+ name = "stbatch812834autostg"
+ nfsv3_enabled = false
+ primary_access_key = (sensitive value)
+ primary_blob_connection_string = (sensitive value)
+ primary_blob_endpoint = (known after apply)
+ primary_blob_host = (known after apply)
+ primary_blob_internet_endpoint = (known after apply)
+ primary_blob_internet_host = (known after apply)
+ primary_blob_microsoft_endpoint = (known after apply)
+ primary_blob_microsoft_host = (known after apply)
+ primary_connection_string = (sensitive value)
+ primary_dfs_endpoint = (known after apply)
+ primary_dfs_host = (known after apply)
+ primary_dfs_internet_endpoint = (known after apply)
+ primary_dfs_internet_host = (known after apply)
+ primary_dfs_microsoft_endpoint = (known after apply)
+ primary_dfs_microsoft_host = (known after apply)
+ primary_file_endpoint = (known after apply)
+ primary_file_host = (known after apply)
+ primary_file_internet_endpoint = (known after apply)
+ primary_file_internet_host = (known after apply)
+ primary_file_microsoft_endpoint = (known after apply)
+ primary_file_microsoft_host = (known after apply)
+ primary_location = (known after apply)
+ primary_queue_endpoint = (known after apply)
+ primary_queue_host = (known after apply)
+ primary_queue_microsoft_endpoint = (known after apply)
+ primary_queue_microsoft_host = (known after apply)
+ primary_table_endpoint = (known after apply)
+ primary_table_host = (known after apply)
+ primary_table_microsoft_endpoint = (known after apply)
+ primary_table_microsoft_host = (known after apply)
+ primary_web_endpoint = (known after apply)
+ primary_web_host = (known after apply)
+ primary_web_internet_endpoint = (known after apply)
+ primary_web_internet_host = (known after apply)
+ primary_web_microsoft_endpoint = (known after apply)
+ primary_web_microsoft_host = (known after apply)
+ public_network_access_enabled = true
+ queue_encryption_key_type = "Service"
+ resource_group_name = "yunliughissuebatch"
+ secondary_access_key = (sensitive value)
+ secondary_blob_connection_string = (sensitive value)
+ secondary_blob_endpoint = (known after apply)
+ secondary_blob_host = (known after apply)
+ secondary_blob_internet_endpoint = (known after apply)
+ secondary_blob_internet_host = (known after apply)
+ secondary_blob_microsoft_endpoint = (known after apply)
+ secondary_blob_microsoft_host = (known after apply)
+ secondary_connection_string = (sensitive value)
+ secondary_dfs_endpoint = (known after apply)
+ secondary_dfs_host = (known after apply)
+ secondary_dfs_internet_endpoint = (known after apply)
+ secondary_dfs_internet_host = (known after apply)
+ secondary_dfs_microsoft_endpoint = (known after apply)
+ secondary_dfs_microsoft_host = (known after apply)
+ secondary_file_endpoint = (known after apply)
+ secondary_file_host = (known after apply)
+ secondary_file_internet_endpoint = (known after apply)
+ secondary_file_internet_host = (known after apply)
+ secondary_file_microsoft_endpoint = (known after apply)
+ secondary_file_microsoft_host = (known after apply)
+ secondary_location = (known after apply)
+ secondary_queue_endpoint = (known after apply)
+ secondary_queue_host = (known after apply)
+ secondary_queue_microsoft_endpoint = (known after apply)
+ secondary_queue_microsoft_host = (known after apply)
+ secondary_table_endpoint = (known after apply)
+ secondary_table_host = (known after apply)
+ secondary_table_microsoft_endpoint = (known after apply)
+ secondary_table_microsoft_host = (known after apply)
+ secondary_web_endpoint = (known after apply)
+ secondary_web_host = (known after apply)
+ secondary_web_internet_endpoint = (known after apply)
+ secondary_web_internet_host = (known after apply)
+ secondary_web_microsoft_endpoint = (known after apply)
+ secondary_web_microsoft_host = (known after apply)
+ sftp_enabled = false
+ shared_access_key_enabled = true
+ table_encryption_key_type = "Service"
}
Plan: 3 to add, 0 to change, 0 to destroy.
azurerm_resource_group.default: Creating...
azurerm_resource_group.default: Creation complete after 4s [id=/subscriptions/<subid>/resourceGroups/yunliughissuebatch]
azurerm_storage_account.autostorage: Creating...
azurerm_storage_account.autostorage: Still creating... [10s elapsed]
azurerm_storage_account.autostorage: Still creating... [20s elapsed]
azurerm_storage_account.autostorage: Creation complete after 28s [id=/subscriptions/<subid>/resourceGroups/yunliughissuebatch/providers/Microsoft.Storage/storageAccounts/stbatch812834autostg]
azurerm_batch_account.default: Creating...
azurerm_batch_account.default: Still creating... [10s elapsed]
azurerm_batch_account.default: Still creating... [20s elapsed]
azurerm_batch_account.default: Creation complete after 21s [id=/subscriptions/<subid>/resourceGroups/yunliughissuebatch/providers/Microsoft.Batch/batchAccounts/batch812834]
Apply complete! Resources: 3 added, 0 changed, 0 destroyed.
PS C:\Users\yunliu1\Documents\terraformtest\batch3> terraform apply -auto-approve
azurerm_resource_group.default: Refreshing state... [id=/subscriptions/<subid>/resourceGroups/yunliughissuebatch]
azurerm_storage_account.autostorage: Refreshing state... [id=/subscriptions/<subid>/resourceGroups/yunliughissuebatch/providers/Microsoft.Storage/storageAccounts/stbatch812834autostg]
azurerm_batch_account.default: Refreshing state... [id=/subscriptions/<subid>/resourceGroups/yunliughissuebatch/providers/Microsoft.Batch/batchAccounts/batch812834]
Terraform used the selected providers to generate the following execution plan. Resource actions are indicated with the following symbols:
~ update in-place
Terraform will perform the following actions:
# azurerm_batch_account.default will be updated in-place
~ resource "azurerm_batch_account" "default" {
id = "/subscriptions/<subid>/resourceGroups/yunliughissuebatch/providers/Microsoft.Batch/batchAccounts/batch812834"
name = "batch812834"
~ tags = {
+ "environment" = "dev"
}
# (11 unchanged attributes hidden)
# (2 unchanged blocks hidden)
}
Plan: 0 to add, 1 to change, 0 to destroy.
azurerm_batch_account.default: Modifying... [id=/subscriptions/<subid>/resourceGroups/yunliughissuebatch/providers/Microsoft.Batch/batchAccounts/batch812834]
azurerm_batch_account.default: Modifications complete after 6s [id=/subscriptions/<subid>/resourceGroups/yunliughissuebatch/providers/Microsoft.Batch/batchAccounts/batch812834]
Apply complete! Resources: 0 added, 1 changed, 0 destroyed.
Not sure if related, but I'm running on WSL2.
I'll see if I notice any other pattern.
I am using azurerm 3.74.0, on WSL through a Docker container. Code that has been working is now getting "HTTP response was nil; connection may have been reset".
Terraform 1.4.2 and azurerm 3.95.0 gives me:
Planning failed. Terraform encountered an error while generating this plan.
╷
│ Error: reading Connection String information for Linux App Service (Subscription: "$ARM_SUBSCRIPTION_ID"
│ Resource Group Name: "rg-x"
│ Site Name: "func-x"): Post "https://management.azure.com/subscriptions/$ARM_SUBSCRIPTION_ID/resourceGroups/rg-x/providers/Microsoft.Web/sites/func-x/config/connectionStrings/list?api-version=2023-01-01": HTTP response was nil; connection may have been reset
│
│ with azurerm_linux_function_app.functionapp-x,
│ on functionapp-x.tf line 5, in resource "azurerm_linux_function_app" "functionapp-x":
│ 5: resource "azurerm_linux_function_app" "functionapp-x" {
│
│ reading Connection String information for Linux App Service (Subscription:
│ "$ARM_SUBSCRIPTION_ID"
│ Resource Group Name: "rg-x"
│ Site Name: "func-x"): Post
│ "https://management.azure.com/subscriptions/$ARM_SUBSCRIPTION_ID/resourceGroups/rg-x/providers/Microsoft.Web/sites/func-x/config/connectionStrings/list?api-version=2023-01-01":
│ HTTP response was nil; connection may have been reset
╵
This is on terraform apply -auto-approve. Before this, terraform plan succeeded with "No changes. Your infrastructure matches the configuration."
A previous run of this pipeline failed on the apply step because it tried to remove Purge Protection from a key vault. I changed this in the Terraform files and ran the pipeline again, resulting in this error. It looks like other changes were applied to Azure in the first run that failed. The previous (successful) run was with azurerm 3.34.0.
EDIT: I tried rerunning the failed pipeline step, and now it succeeded with "No changes. Your infrastructure matches the configuration." Can this be due to some temporary connection error, or some rate limiting in Azure, or A/B testing by Microsoft? I'm using Bitbucket for CI/CD.
Same issue here on 1.6.0 using the Terraform Cloud backend, but it is very random as to when it happens, normally mid-apply.
I am also frequently experiencing this with all kinds of resources. Yesterday I got into a situation where terraform apply created a new resource, but due to this error, the state was not updated to show the new resource.
Error message:
│ Error: creating Windows Slot (Subscription: "83731164-2cea-4291-b78d-7e2e69eea8a6"
│ Resource Group Name: ....
│ Site Name: ....
│ Slot Name: "deployment"): performing CreateOrUpdateSlot: Put "https://management.azure.com/subscriptions/..../resourceGroups/..../providers/Microsoft.Web/sites/..../slots/deployment?api-version=2023-01-01": HTTP response was nil; connection may have been reset
│
│ with azurerm_windows_function_app_slot.slot,
│ on function-app-slot.tf line 7, in resource "azurerm_windows_function_app_slot" "slot":
│ 7: resource "azurerm_windows_function_app_slot" "slot" {
│
│ creating Windows Slot (Subscription: ....
│ Resource Group Name: ....
│ Site Name: ....
│ Slot Name: "deployment"): performing CreateOrUpdateSlot: Put
│ "https://management.azure.com/subscriptions/..../resourceGroups/..../providers/Microsoft.Web/sites/..../slots/deployment?api-version=2023-01-01":
│ HTTP response was nil; connection may have been reset
This meant that running terraform plan again caused the same change to show and running terraform apply gave the following error:
│ Error: A resource with the ID "/subscriptions/..../resourceGroups/..../providers/Microsoft.Web/sites/..../slots/deployment" already exists - to be managed via Terraform this resource needs to be imported into the State. Please see the resource documentation for "azurerm_windows_function_app_slot" for more information.
│
│ with azurerm_windows_function_app_slot.slot,
│ on function-app-slot.tf line 7, in resource "azurerm_windows_function_app_slot" "slot":
│ 7: resource "azurerm_windows_function_app_slot" "slot" {
│
│ A resource with the ID
│ "/subscriptions/..../resourceGroups/..../providers/Microsoft.Web/sites/..../slots/deployment"
│ already exists - to be managed via Terraform this resource needs to be
│ imported into the State. Please see the resource documentation for
│ "azurerm_windows_function_app_slot" for more information.
╵
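One hedged way to recover from this drift, rather than re-running until it settles, is to import the slot that Azure actually created; the resource address and ID below are taken from the error above (redactions kept as-is):

terraform import azurerm_windows_function_app_slot.slot \
  "/subscriptions/..../resourceGroups/..../providers/Microsoft.Web/sites/..../slots/deployment"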
This means we often have to run terraform multiple times and this seemingly random error can cause our production environment to end up in an inconsistent state.
Does anyone know if this is a terraform issue or something on Azure's side?
We also run into this issue when running terraform from within azure (via AVD desktop or self-hosted ADO build agents).
Running with -parallelism=5 or lower seems to reduce the occurrence of it.
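For reference, a sketch of that invocation (Terraform's default parallelism is 10):

terraform apply -parallelism=5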
We see slow responses during plan when terraform is refreshing windows_web_app state specifically.
We have hundreds of webapps in state so have started to separate them out into separate states so that plans are more successful.
We aren't sure if the connection issue occurs because of too many calls to management.azure.com or some kind of rate limiting somewhere along the way.
It would be nice if terraform could gracefully recover from these "HTTP response was nil; connection may have been reset" errors and try again.
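A sketch of the state-splitting approach mentioned above, assuming local state files; with a remote backend you would need terraform state pull/push around it, and the resource address here is a placeholder:

# Move one web app out of the main state into its own state file
terraform state mv \
  -state=terraform.tfstate \
  -state-out=webapps.tfstate \
  'azurerm_windows_web_app.example' \
  'azurerm_windows_web_app.example'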
Terraform v1.8.0
on windows_amd64
+ provider registry.terraform.io/hashicorp/azurerm v3.97.1
Recently we see the error during calls to
- https://management.azure.com/subscriptions/.../resourceGroups/.../providers/Microsoft.Web/sites/.../config/appSettings/list?api-version=2023-01-01
- https://management.azure.com/subscriptions/.../resourceGroups/.../providers/Microsoft.Web/sites/.../config/backup/list?api-version=2023-01-01
╷
│ Error: reading App Settings for Windows App Service (Subscription: "..."
│ Resource Group Name: "..."
│ Site Name: "..."): Post "https://management.azure.com/subscriptions/.../resourceGroups/.../providers/Microsoft.Web/sites/.../config/appSettings/list?api-version=2023-01-01": HTTP response was nil; connection may have been reset
│
│ with module.windows_web_apps.module.windows_web_apps["..."].azurerm_windows_web_app.windows_web_app,
│ on .terraform\modules\windows_web_apps.windows_web_apps\windows_web_app.tf line 56, in resource "azurerm_windows_web_app" "windows_web_app":
│ 56: resource "azurerm_windows_web_app" "windows_web_app" {
│
│ reading App Settings for Windows App Service (Subscription: "..."
│ Resource Group Name: "..."
│ Site Name: "..."): Post
│ "https://management.azure.com/subscriptions/.../resourceGroups/.../providers/Microsoft.Web/sites/.../config/appSettings/list?api-version=2023-01-01":
│ HTTP response was nil; connection may have been reset
╵
....
│ Site Name: "..."): Post
│ "https://management.azure.com/subscriptions/.../resourceGroups/.../providers/Microsoft.Web/sites/.../config/backup/list?api-version=2023-01-01":
│ HTTP response was nil; connection may have been reset
I am also experiencing this behavior on azurerm_storage_account when running a large config that interacts with a large number of storage accounts. Running from an ADO pipeline agent hosted on Azure Container Instance (Windows) using provider hashicorp/azurerm v3.100.0; also experiencing this on 3.99.
│ Error: building Accounts Data Plane Client: retrieving Storage Account Key: listing Keys for Storage Account (Subscription: "2ccab7fe-ac67-49ce-974e-11a8da78fb5f"
│ Resource Group Name: "xxxxx"
│ Storage Account Name: "xxxxx"): Post "https://management.azure.com/subscriptions/xxxxx/resourceGroups/xxxxx/providers/Microsoft.Storage/storageAccounts/xxxxx/listKeys?%24expand=kerb&api-version=2023-01-01": HTTP response was nil; connection may have been reset
│
│ with module.hyper_v_arcservers_rg.module.hyper_v_host_resources["hv27"].azurerm_storage_account.xxxxx,
│ on modules\arc-hyper-v-host-resources\storageaccount.tf line 1, in resource "azurerm_storage_account" "arc_hyper_v_host":
│ 1: resource "azurerm_storage_account" "arc_hyper_v_host" {
Same behavior here on terraform 1.5.4, azurerm 3.100.0.
Resource azurerm_redis_cache_access_policy_assignment.
Even on destroy:
Error: deleting Redis Cache Access Policy Assignment access-policy-assignment-redistst5-deployer in Redis Cache redis-kr-tst-snd-eus2-ff-redistst5 in resource group rg-kr-tst-snd-eus2-ff-landing-zone-test-redis5: Delete "https://management.azure.com/subscriptions/375658e5-ff59-44d2-9f7e-60c9b176c583/resourceGroups/rg-kr-tst-snd-eus2-ff-landing-zone-test-redis5/providers/Microsoft.Cache/redis/redis-kr-tst-snd-eus2-ff-redistst5/accessPolicyAssignments/access-policy-assignment-redistst5-deployer?api-version=2023-08-01": HTTP response was nil; connection may have been reset
Same behaviour is occurring on terraform 1.3.6 / azurerm 3.94.0.
The issue seems to be intermittent, and only appears to be happening for App Service as far as I can tell. It happens during plan, apply, or destroy actions, and only for our large state files where we have 1,000+ resources. Where we have small state files with fewer than 100 resources, we are not seeing the issue.
Previously we were using terraform 1.3.6 / azurerm 3.53.0 and we did not see the issue before upgrading.
Having the same issue on azurerm 3.94.0.
Same here.
Same for me, on azurerm 3.94.0 running on an Azure pipeline.
The same thing happens on the latest azurerm version (3.104.2) for various resources, on multiple environments and pipelines. Do we have any workaround or update as to what is causing this issue?
@dimitrijap try version 3.85.0 of the provider; it was working fine. The problem started on 3.86.0, at least for me.
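A sketch of pinning to that version, mirroring the required_providers block from the original configuration (an exact-version constraint; only useful as long as staying on 3.85.0 is acceptable):

terraform {
  required_providers {
    azurerm = {
      source  = "hashicorp/azurerm"
      version = "= 3.85.0" # last version reported as working in this thread
    }
  }
}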
Same problem when trying to destroy resources linked to Azure Postgres Flexible Server
(specifically azurerm_postgresql_flexible_server_active_directory_administrator and azurerm_postgresql_flexible_server_database).
I am using the azurerm provider version 3.71.0, but faced the exact same problem with the latest 3.105.0 version.
The problem seems to happen randomly on my end, and I only faced it during destroy processes so far.
I also tried the option -parallelism=1 but it didn't change anything.
Same problem with the versions below; it happens randomly and seems to come from Azure. Terraform manages web apps and function apps.
Terraform v1.8.4 on darwin_arm64
- provider registry.terraform.io/cloudflare/cloudflare v4.29.0
- provider registry.terraform.io/hashicorp/azurerm v3.109.0
- provider registry.terraform.io/hashicorp/time v0.11.2
Having the same issue randomly, on azurerm 3.90.0
Happens to me using Terraform v1.9.2 and azurerm v3.112.0 on WSL all the time. Does not happen on Windows at all (same versions). Must be something network-related.
No issue using Terraform v1.9.5 and azurerm v4.1.0 with Azure DevOps Microsoft-hosted Ubuntu agents. I had the issue when I switched the agent to a self-hosted Windows one.
As @chouse-qumodity noted, running with -parallelism=5 seems to reduce the occurrence of it.
We use version 3.97.1 of azurerm, and we are still facing this issue.
We're experiencing this issue as well, with Terraform v1.9.7 and azurerm v4.4. The code manages around 250+ resources, including a lot of storage accounts. The strange thing is that I can execute it fine from Ubuntu in Windows WSL, but it fails on GitHub Ubuntu runners.
What I've tried so far:
- different TF versions
- different azurerm provider versions
- different GH runner sizes
- different TF parallelism settings (tried even to go as low as 2)
╷
│ Error: listing Keys for Storage Account (Subscription: "***"
│ Resource Group Name: "rg-******"
│ Storage Account Name: "sfmckabrazilsouth"): Post "https://management.azure.com/subscriptions/***/resourceGroups/*****/providers/Microsoft.Storage/storageAccounts/********/listKeys?%24expand=kerb&api-version=2023-01-01": HTTP response was nil; connection may have been reset
│
│ with azurerm_storage_account.function_storage[22],
│ on *****.tf line 64, in resource "azurerm_storage_account" "******":
│ 64: resource "azurerm_storage_account" "function_storage" {
│
The same happens for Terraform creating 4 resources:
Terraform v1.9.7
on darwin_arm64
+ provider registry.terraform.io/hashicorp/azurerm v4.5.0