feature request: parallelism parameter for resources with count

Open dkiser opened this issue 7 years ago • 38 comments

As a Terraform User, I should be able to specify a parallelism count on resource instances containing a count parameter, So That I Can handle creating API resources via providers at a saner rate and/or deal with mediocre API backends.

Example Terraform Configuration File

resource "providera_resourcea" "many" {
   count = 1000
   parallelism = 10

   attributeA = "a"
   attributeB = "b"
   attributeC = "c"
}

Expected Behavior

GIVEN terraform apply -parallelism=X where X < ${providera_resourcea.many.*.parallelism} WHEN terraform creates/deletes/refreshes resources THEN I expect only X concurrent goroutines creating this resource type.

GIVEN terraform apply -parallelism=X where X >= ${providera_resourcea.many.*.parallelism} WHEN terraform creates/deletes/refreshes resources THEN I expect only ${providera_resourcea.many.*.parallelism} concurrent goroutines creating this resource type.
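
Taken together, the two cases say that the effective concurrency for the resource is the smaller of the CLI value and the per-resource value. A minimal sketch of that arithmetic using Terraform's built-in `min` function follows. The local names and the numbers are hypothetical stand-ins, and the per-resource `parallelism` argument is only the proposal in this issue, not an existing Terraform feature:

```hcl
locals {
  # Hypothetical stand-in for X in `terraform apply -parallelism=X`.
  cli_parallelism = 10

  # Hypothetical stand-in for the proposed per-resource `parallelism` value.
  resource_parallelism = 4

  # Both GIVEN cases reduce to taking the smaller of the two limits.
  effective_parallelism = min(local.cli_parallelism, local.resource_parallelism)
}

output "effective_parallelism" {
  value = local.effective_parallelism
}
```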

dkiser avatar May 05 '17 22:05 dkiser

Possibly related to #7388

dkiser avatar May 05 '17 22:05 dkiser

+1

kwach avatar Jul 03 '17 10:07 kwach

+1

Stono avatar Jan 05 '18 21:01 Stono

I just tried to create 3 vsphere_virtual_machine resources.
Because all 3 virtual machines are created at the same time, each one takes much longer to create, causing the apply operation to time out.

Creating 1 machine takes 5 minutes. All 3 machines therefore take 15 minutes with a single thread. Creating 3 machines at once times out after 10 minutes, because each machine now takes longer than 5 minutes while they all create disks, balance disks and reconfigure, bottlenecking the vSphere server.

But other resources are working fine. So I should be able to limit the number of jobs running simultaneously at the resource level, the provider level, or both.

pbusquemdf avatar Jul 05 '19 21:07 pbusquemdf

Another use case I would see for that is rolling update of immutable infrastructure, where you just roll the update one server/resource at a time.

invidian avatar Aug 08 '19 19:08 invidian

I found this issue when trying to run apply against the "pass" provider, which needs to communicate with a git repository, and this should be done one by one. But my infrastructure covers many other resource types (from different providers), so I'd like to run it with high parallelism, limited to 1 only for resources of a certain type (or provider, as mentioned by @invidian).

mkjmdski avatar Oct 15 '19 17:10 mkjmdski

I'm hitting this today. There is still no known workaround, I take it?

Anecdotally, what I'm trying to do is create many managed instance groups in GCP that are all backends for the same load-balancer (using count) but can't be collapsed into one because of upstream constraints and how we're partitioning outbound traffic. Doing so forfeits rolling update semantics, of course.

I started to hack at having each instance group depend on the prior one after the head of the list until I realized that depends_on is static and the entire resource group is actually a single node in the DAG. Any ideas? As it stands, my only real strategy is to move this stuff out into a dedicated repo run with -parallelism=1 and use the remote state provider to loosely couple back to our primary repository :(
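
The loose coupling described above could look roughly like the following sketch, which reads outputs from a separately applied configuration through the `terraform_remote_state` data source. It assumes the instance groups live in their own configuration (applied with -parallelism=1), that the state is stored in a GCS bucket, and that the configuration exports an output named `instance_group_urls`. The bucket, prefix and output names are all hypothetical:

```hcl
# Reads the state of the separately applied instance-group configuration.
data "terraform_remote_state" "instance_groups" {
  backend = "gcs"

  config = {
    bucket = "example-tf-state" # hypothetical bucket name
    prefix = "instance-groups"  # hypothetical state prefix
  }
}

locals {
  # Assumes the other configuration declares `output "instance_group_urls"`.
  instance_group_urls = data.terraform_remote_state.instance_groups.outputs.instance_group_urls
}
```

The primary configuration can then keep its default parallelism, at the cost of running two applies in sequence.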

davidquarles avatar May 04 '20 17:05 davidquarles

I'm getting an error: Deleting CloudWatch Events permission failed: ConcurrentModificationException: EventBus default was modified concurrently

I believe this suggestion would let me work around this issue by applying a parallelism limit on permissions affecting the default event bus on the account.

i.e. adding a parallelism attribute to this resource: resource "aws_cloudwatch_event_permission" "PerAccountAccess" { for_each = local.accountslist

brendan-sherrin avatar May 12 '20 02:05 brendan-sherrin

I've found a similar issue with multiple Google SQL databases on a private IP where this would be incredibly useful (detailed on SO).

mrsimonemms avatar Jun 14 '20 19:06 mrsimonemms

👍 Have this issue with a custom redshift provider. Need to limit the number of concurrent requests being made.

delwaterman avatar Aug 05 '20 19:08 delwaterman

Same here with AWS task definitions within the same family.

hege-aliz avatar Dec 04 '20 00:12 hege-aliz

Similar issue with Azure DNS and Public IP: I want to create several A records for the same public IP

resource "azurerm_dns_a_record" "new" {
  count               = length(var.subdomains)
  name                = coalesce(var.subdomains[count.index])
  zone_name           = var.zone
  resource_group_name = var.dns_rgname
  ttl                 = 60
  target_resource_id  = azurerm_public_ip.public_ip.id

  depends_on          = [azurerm_public_ip.public_ip]
}

I have an issue with terraform apply:

dns.RecordSetsClient#CreateOrUpdate: Failure responding to request: StatusCode=409 -- Original Error: autorest/azure: Service returned an error. Status=409 Code="Conflict" 

No issue with terraform apply --parallelism=1

antanof avatar Dec 06 '20 14:12 antanof

Like many others, I'm facing the same kind of issue for some resources with Google Cloud (network peering, Firestore indexes, ...).

clarsonneur avatar Jan 14 '21 16:01 clarsonneur

I've also run into this when making changes to load balancers and target groups. Certain changes destroy everything before recreating anything. I'd like to be able to group the changes so that only some of them are done at a time. Alternatively, changes to the lifecycle sequencing would be as useful.

In this case, we aren't using count but for_each. I don't think that should make a difference for limiting parallelism.

tshawker avatar Feb 17 '21 20:02 tshawker

We are experiencing issues when attempting to bootstrap Chef using a null resource, or when using the Chef provisioner and building servers, with either VMware or Azure. There are issues with vault permissions being assigned properly to the node in Chef server. This succeeds when we set parallelism to 1, but fails intermittently, though fairly consistently, when it is not set to 1. It would be nice to set parallelism to 1 only for the bootstrap null resource, and to allow everything else to run in parallel.

cbus-guy avatar May 13 '21 14:05 cbus-guy

+1. # 2021-07-16

zhujinhe avatar Jul 16 '21 02:07 zhujinhe

I'm surprised there is no news on this. I have only one resource that requires parallelism to be 1, but the only native solution is to disable parallelism for the entire infrastructure (of many, many objects) using terraform apply --parallelism=1. I'd love to see this feature support resources with for_each or count.

iyinoluwaayoola avatar Aug 12 '21 06:08 iyinoluwaayoola

+1 for #currentyear.

even being able to set parallelism on a module level would be great

mhaddon avatar Apr 19 '22 19:04 mhaddon

Need this very much. For me the count parameter is creating subnets in the same VLAN. Wish I could control count parallelism. I need it to create them one after the other.

surajsbharadwaj avatar May 24 '22 15:05 surajsbharadwaj

+1 here. I can easily accomplish this using batchsize on any copy loop with ARM templates; surprised this isn't ready after almost 5 years

guidooliveira avatar Jun 10 '22 16:06 guidooliveira

Very much needed feature. I need to run a module using for_each, but the system runs out of space in some situations. If I had the possibility to limit the parallelism, I could easily manage this.

mukundjalan avatar Jun 13 '22 11:06 mukundjalan

I have the same need. My main issue is when managing subnets in Azure within the same VNET: Azure doesn't allow modifying multiple subnets at the same time. My only workaround is using a null_resource with az cli commands, which is a very sucky way of working. If I had a way of setting the parallelism for this, I could reduce it to 1 and have everything managed by Terraform code.

AresiusXP avatar Jun 16 '22 07:06 AresiusXP

+1

viniciuscastro-hotmart avatar Jul 20 '22 16:07 viniciuscastro-hotmart

First of all, I like HashiCorp products a lot; they have really helped me. ❤️ @armon @mitchellh

And I know as a non-paying user I don't have any right to ask you to do anything, but feature requests with the core tag have been open for over 5 years.

I love Terraform and Nomad, but they occasionally give me a one-last-mile-needed feeling before they are production ready. Other requests like https://github.com/hashicorp/nomad/issues/1635 have been open for over 6 years.

I really hope that you lovely developers at HashiCorp have the time and willingness to plan for the old core feature requests while developing new features.

zhujinhe avatar Jul 22 '22 08:07 zhujinhe

Hi folks,

I too am struggling with this missing feature! I have some resources and modules that can't handle parallelism, while others work perfectly fine with it.

Setting the whole apply with the parallelism flag to 1 is really excruciatingly slow!! Can we have this one somehow prioritized for the sake of all those having similar issues?!

sherifkayad avatar Nov 08 '22 13:11 sherifkayad

Same issue when having hundreds of monitored projects in google: https://github.com/hashicorp/terraform-provider-google/issues/12883

TF also doesn't seem to handle 429's very well and completely screws up plan/apply steps when heavily rate limited.

danjamesmay avatar Dec 01 '22 14:12 danjamesmay

We are having the same issue in Azure with multiple PrivateEndpoints into a Subnet. PE creation also fails while performing VNet peering. @AresiusXP can you explain the null_resource in detail: are you validating that the subnet is ready, or controlling parallelism? Comparable with issues in terraform-provider-azurerm: #21293 #16182

abij avatar Apr 06 '23 17:04 abij

We have 2 null_resource blocks that depend on the vnet with subnets resource. Once it's done, they run an az cli command on each subnet to add service endpoints and to associate it with a route_table.

resource "null_resource" "endpoints" {
  triggers = {
    subnets = join(" ", azurerm_virtual_network.vnet.subnet.*.name)
  }

  provisioner "local-exec" {
    command     = "az login --service-principal -u $ARM_CLIENT_ID -p $ARM_CLIENT_SECRET --tenant $ARM_TENANT_ID; az account set --subscription ${var.global_settings.subscription_id}; for subnet in ${self.triggers.subnets}; do az network vnet subnet update -g ${var.rg_name} -n $subnet --vnet-name ${var.vnet_name} --service-endpoints ${join(" ", local.service_endpoints)}; done"
    interpreter = ["/bin/bash", "-c"]
  }

  depends_on = [
    azurerm_virtual_network.vnet
  ]
}

resource "null_resource" "rt" {
  count = var.rt_enabled ? 1 : 0

  triggers = {
    subnets = join(" ", azurerm_virtual_network.vnet.subnet.*.name)
  }

  provisioner "local-exec" {
    command     = "az login --service-principal -u $ARM_CLIENT_ID -p $ARM_CLIENT_SECRET --tenant $ARM_TENANT_ID; az account set --subscription ${var.global_settings.subscription_id}; for subnet in ${self.triggers.subnets}; do az network vnet subnet update -g ${var.rg_name} -n $subnet --vnet-name ${var.vnet_name} --route-table ${var.route_table_id}; done"
    interpreter = ["/bin/bash", "-c"]
  }

  depends_on = [
    azurerm_virtual_network.vnet,
    null_resource.endpoints
  ]
}

AresiusXP avatar Apr 06 '23 19:04 AresiusXP

My use-case for this would be building multiple Docker images (using the registry.terraform.io/providers/kreuzwerker/docker provider) from a single (parametrized) Dockerfile in a for_each loop. I'd like to reduce the parallelism of the docker_image resource (and only of that resource) to 1 because my Dockerfile installs Python packages, and having N versions of this installation running concurrently can result in sporadic issues, because pip presently does not have any synchronization mechanisms around its cache directory (which is shared between Docker processes).

pspot2 avatar May 06 '23 01:05 pspot2

Having the same issue on PowerDNS with the SQLite backend because of concurrent access to the database.

M0NsTeRRR avatar Jun 16 '23 22:06 M0NsTeRRR