terraform-provider-ibm
ibm_container_vpc_cluster never becomes ready then is tainted
Community Note
- Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
- Please do not leave "+1" or other comments that do not add relevant new information or questions, they generate extra noise for issue followers and do not help prioritize the request
- If you are interested in working on this issue or have submitted a pull request, please leave a comment
Terraform CLI and Terraform IBM Provider Version
Affected Resource(s)
- ibm_container_vpc_cluster
Bug
Final output of run 1 (3 hours)
module.vpc_ocp_cluster.ibm_container_vpc_cluster.cluster: Still creating... [2h59m57s elapsed]
module.vpc_ocp_cluster.ibm_container_vpc_cluster.cluster: Still creating... [3h0m7s elapsed]
╷
│ Warning: Argument is deprecated
│
│ with module.kms.ibm_kms_key.key,
│ on .terraform/modules/kms/modules/key-protect/main.tf line 23, in resource "ibm_kms_key" "key":
│ 23: resource "ibm_kms_key" "key" {
│
│ Support for creating Policies with the key will soon be removed, Utilise the new resource for creating policies for the keys => ibm_kms_key_policies
╵
╷
│ Error: timeout while waiting for state to become 'Ready' (last state: 'Deploy in progress', timeout: 3h0m0s)
│
│ with module.vpc_ocp_cluster.ibm_container_vpc_cluster.cluster,
│ on .terraform/modules/vpc_ocp_cluster/modules/vpc-openshift/main.tf line 6, in resource "ibm_container_vpc_cluster" "cluster":
│ 6: resource "ibm_container_vpc_cluster" "cluster" {
│
╵
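(Side note on the deprecation warning above: the provider points at ibm_kms_key_policies as the replacement for policies created with the key. The migration presumably looks something like the sketch below; the references are illustrative, since the actual ibm_kms_key.key lives inside the kms module of the linked configuration.

resource "ibm_kms_key_policies" "key_policy" {
  # Hypothetical references to the key created by the kms module.
  instance_id = ibm_kms_key.key.instance_id
  key_id      = ibm_kms_key.key.key_id

  rotation {
    interval_month = 3   # placeholder rotation interval
  }
}

The warning is unrelated to the timeout itself; it just shows up on every run.)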
I tried again 8 hours later and the cluster was working fine. Maybe it just took longer than 3 hours? I then executed another terraform apply:
Note: Objects have changed outside of Terraform
Terraform detected the following changes made outside of Terraform since the last "terraform apply":
# module.subnet["us-south-1"].ibm_is_subnet.testacc_subnet has changed
~ resource "ibm_is_subnet" "testacc_subnet" {
~ available_ipv4_address_count = 251 -> 250
id = "0717-1d7d2459-4fbb-4463-8aa2-be60b86a8ae2"
name = "storage00-subnet-0"
tags = []
# (16 unchanged attributes hidden)
}
# module.subnet["us-south-2"].ibm_is_subnet.testacc_subnet has changed
~ resource "ibm_is_subnet" "testacc_subnet" {
~ available_ipv4_address_count = 251 -> 250
id = "0727-9996ccdd-f409-409c-a6b5-533395db9cc1"
name = "storage00-subnet-1"
tags = []
# (16 unchanged attributes hidden)
}
# module.subnet["us-south-3"].ibm_is_subnet.testacc_subnet has changed
~ resource "ibm_is_subnet" "testacc_subnet" {
~ available_ipv4_address_count = 251 -> 250
id = "0737-91b3c2c3-e33d-4bd8-b7f7-c1c7afd41ef3"
name = "storage00-subnet-2"
tags = []
# (16 unchanged attributes hidden)
}
# module.vpc.ibm_is_vpc.testacc_vpc has changed
~ resource "ibm_is_vpc" "testacc_vpc" {
id = "r006-ef48a7f3-f10c-475f-a34e-bde9b4f05063"
name = "storage00-vpc"
~ security_group = [
~ {
~ group_id = "r006-8ed7e518-d700-4c60-81d3-e188f13b2239" -> "r006-24449ec2-bba1-468a-b1ec-fa0aefb4eea1"
~ group_name = "heritage-wheat-divinely-user" -> "kube-r006-ef48a7f3-f10c-475f-a34e-bde9b4f05063"
~ rules = [
~ {
~ port_max = 0 -> 32767
~ port_min = 0 -> 30000
~ protocol = "all" -> "tcp"
~ rule_id = "r006-6433771d-dd1c-455b-934c-9c46393d67a8" -> "r006-74ef7070-90ef-43af-932a-eb4d507c4b55"
# (5 unchanged elements hidden)
},
~ {
~ direction = "inbound" -> "outbound"
~ port_max = 0 -> 32767
~ port_min = 0 -> 30000
~ protocol = "all" -> "udp"
~ remote = "r006-8ed7e518-d700-4c60-81d3-e188f13b2239" -> "0.0.0.0/0"
~ rule_id = "r006-aedcf72f-9f05-4f7d-ac7e-bdda8604fb3d" -> "r006-eb28c97d-5aba-49f1-baa3-4b83e7312357"
# (3 unchanged elements hidden)
},
+ {
+ code = 0
+ direction = "inbound"
+ ip_version = "ipv4"
+ port_max = 65535
+ port_min = 1
+ protocol = "tcp"
+ remote = "0.0.0.0/0"
+ rule_id = "r006-e7187a8b-888d-42ff-9018-03306046ca32"
+ type = 0
},
+ {
+ code = 0
+ direction = "inbound"
+ ip_version = "ipv4"
+ port_max = 65535
+ port_min = 1
+ protocol = "udp"
+ remote = "0.0.0.0/0"
+ rule_id = "r006-6a1574b7-3996-40ff-b70d-f54c31b9d450"
+ type = 0
},
]
},
+ {
+ group_id = "r006-8ed7e518-d700-4c60-81d3-e188f13b2239"
+ group_name = "heritage-wheat-divinely-user"
+ rules = [
+ {
+ code = 0
+ direction = "inbound"
+ ip_version = "ipv4"
+ port_max = 0
+ port_min = 0
+ protocol = "all"
+ remote = "r006-8ed7e518-d700-4c60-81d3-e188f13b2239"
+ rule_id = "r006-aedcf72f-9f05-4f7d-ac7e-bdda8604fb3d"
+ type = 0
},
+ {
+ code = 0
+ direction = "outbound"
+ ip_version = "ipv4"
+ port_max = 0
+ port_min = 0
+ protocol = "all"
+ remote = "r006-8ed7e518-d700-4c60-81d3-e188f13b2239"
+ rule_id = "r006-f399a7cc-0d77-4b13-852e-ba2f3ff96359"
+ type = 0
},
+ {
+ code = 0
+ direction = "outbound"
+ ip_version = "ipv4"
+ port_max = 0
+ port_min = 0
+ protocol = "all"
+ remote = "161.26.0.0/16"
+ rule_id = "r006-007cddba-ffd7-4ffe-999b-e854f2f63a11"
+ type = 0
},
+ {
+ code = 0
+ direction = "outbound"
+ ip_version = "ipv4"
+ port_max = 0
+ port_min = 0
+ protocol = "all"
+ remote = "10.240.128.0/24"
+ rule_id = "r006-65f2a84c-33fd-41c5-8946-cb8b446b249c"
+ type = 0
},
+ {
+ code = 0
+ direction = "outbound"
+ ip_version = "ipv4"
+ port_max = 0
+ port_min = 0
+ protocol = "all"
+ remote = "10.240.64.0/24"
+ rule_id = "r006-d17e8dd9-f61e-4f9e-92f5-690f186cb0ea"
+ type = 0
},
+ {
+ code = 0
+ direction = "outbound"
+ ip_version = "ipv4"
+ port_max = 0
+ port_min = 0
+ protocol = "all"
+ remote = "166.8.0.0/14"
+ rule_id = "r006-223b129c-0e3d-48a4-bd21-f8e0a4425726"
+ type = 0
},
+ {
+ code = 0
+ direction = "outbound"
+ ip_version = "ipv4"
+ port_max = 0
+ port_min = 0
+ protocol = "all"
+ remote = "10.240.0.0/24"
+ rule_id = "r006-b4acf717-b3d2-4b79-8ea9-e2f4277e4400"
+ type = 0
},
+ {
+ code = 0
+ direction = "inbound"
+ ip_version = "ipv4"
+ port_max = 0
+ port_min = 0
+ protocol = "icmp"
+ remote = "0.0.0.0/0"
+ rule_id = "r006-b2d9ac70-d717-4d6c-ac90-3913e89d1fc6"
+ type = 8
},
+ {
+ code = 0
+ direction = "inbound"
+ ip_version = "ipv4"
+ port_max = 22
+ port_min = 22
+ protocol = "tcp"
+ remote = "0.0.0.0/0"
+ rule_id = "r006-94b25358-1fbd-48cb-9429-487eefad8744"
+ type = 0
},
]
},
+ {
+ group_id = "r006-d664409a-7c91-48f4-b54f-2e3ed20e3c81"
+ group_name = "kube-c81l7aid0uggi0nu0fh0"
+ rules = [
+ {
+ code = 0
+ direction = "outbound"
+ ip_version = "ipv4"
+ port_max = 0
+ port_min = 0
+ protocol = "all"
+ remote = "172.17.0.0/18"
+ rule_id = "r006-a6a58c38-ed1f-43bb-9b34-51ef899b13ab"
+ type = 0
},
+ {
+ code = 0
+ direction = "inbound"
+ ip_version = "ipv4"
+ port_max = 0
+ port_min = 0
+ protocol = "all"
+ remote = "172.17.0.0/18"
+ rule_id = "r006-d170f7c5-7c1e-4bb0-ba0a-74720ee7df71"
+ type = 0
},
+ {
+ code = 0
+ direction = "inbound"
+ ip_version = "ipv4"
+ port_max = 32767
+ port_min = 30000
+ protocol = "tcp"
+ remote = "0.0.0.0/0"
+ rule_id = "r006-c70ae06c-e50a-4d57-b766-eda9309a4dbe"
+ type = 0
},
+ {
+ code = 0
+ direction = "inbound"
+ ip_version = "ipv4"
+ port_max = 32767
+ port_min = 30000
+ protocol = "udp"
+ remote = "0.0.0.0/0"
+ rule_id = "r006-3e5296a1-3acb-485f-bc09-307d91a0cd34"
+ type = 0
},
]
},
]
~ subnets = [
+ {
+ available_ipv4_address_count = 250
+ id = "0717-1d7d2459-4fbb-4463-8aa2-be60b86a8ae2"
+ name = "storage00-subnet-0"
+ status = "available"
+ total_ipv4_address_count = 256
+ zone = "us-south-1"
},
+ {
+ available_ipv4_address_count = 250
+ id = "0727-9996ccdd-f409-409c-a6b5-533395db9cc1"
+ name = "storage00-subnet-1"
+ status = "available"
+ total_ipv4_address_count = 256
+ zone = "us-south-2"
},
+ {
+ available_ipv4_address_count = 250
+ id = "0737-91b3c2c3-e33d-4bd8-b7f7-c1c7afd41ef3"
+ name = "storage00-subnet-2"
+ status = "available"
+ total_ipv4_address_count = 256
+ zone = "us-south-3"
},
]
tags = [
"secure-roks",
"storage00",
]
# (19 unchanged attributes hidden)
}
# module.vpc_ocp_cluster.ibm_container_vpc_cluster.cluster has changed
~ resource "ibm_container_vpc_cluster" "cluster" {
+ albs = []
+ crn = "crn:v1:bluemix:public:containers-kubernetes:us-south:a/713c783d9a507a53135fe6793c37cc74:c81l7aid0uggi0nu0fh0::"
id = "c81l7aid0uggi0nu0fh0"
+ ingress_hostname = "storage00-cluster-e7f2ca73139645ddf61a8702003a483a-0000.us-south.containers.appdomain.cloud"
+ ingress_secret = (sensitive value)
~ kube_version = "4.6_openshift" -> "4.6.48_openshift"
+ master_status = "Ready"
+ master_url = "https://c106-e.us-south.containers.cloud.ibm.com:31703"
name = "storage00-cluster"
+ pod_subnet = "172.17.0.0/18"
+ private_service_endpoint_url = "https://c106.private.us-south.containers.cloud.ibm.com:31703"
+ public_service_endpoint_url = "https://c106-e.us-south.containers.cloud.ibm.com:31703"
+ resource_controller_url = "https://cloud.ibm.com/kubernetes/clusters"
+ resource_crn = "crn:v1:bluemix:public:containers-kubernetes:us-south:a/713c783d9a507a53135fe6793c37cc74:c81l7aid0uggi0nu0fh0::"
+ resource_group_name = "default"
+ resource_name = "storage00-cluster"
+ resource_status = "normal"
+ service_subnet = "172.21.0.0/16"
+ state = "normal"
~ tags = [
- "cluster",
- "secure-roks",
- "storage00",
]
# (12 unchanged attributes hidden)
# (5 unchanged blocks hidden)
}
Unless you have made equivalent changes to your configuration, or ignored the relevant attributes using ignore_changes, the following plan may include actions to undo or respond to these changes.
────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
Terraform used the selected providers to generate the following execution plan. Resource actions are indicated with the following symbols:
+ create
-/+ destroy and then create replacement
<= read (data resources)
Terraform will perform the following actions:
# module.configure_cluster_logdna.ibm_ob_logging.logging will be created
+ resource "ibm_ob_logging" "logging" {
+ agent_key = (known after apply)
+ agent_namespace = (known after apply)
+ cluster = (known after apply)
+ crn = (known after apply)
+ daemonset_name = (known after apply)
+ discovered_agent = (known after apply)
+ id = (known after apply)
+ instance_id = "b620dd39-8c97-4315-bb1c-4986cae42a71"
+ instance_name = (known after apply)
+ logdna_ingestion_key = (known after apply)
+ namespace = (known after apply)
+ private_endpoint = true
+ timeouts {}
}
# module.configure_cluster_sysdig.ibm_ob_monitoring.sysdig will be created
+ resource "ibm_ob_monitoring" "sysdig" {
+ agent_key = (known after apply)
+ agent_namespace = (known after apply)
+ cluster = (known after apply)
+ crn = (known after apply)
+ daemonset_name = (known after apply)
+ discovered_agent = (known after apply)
+ id = (known after apply)
+ instance_id = "67987421-09cf-492e-a5c7-d2126e845137"
+ instance_name = (known after apply)
+ namespace = (known after apply)
+ private_endpoint = true
+ sysdig_access_key = (known after apply)
+ timeouts {}
}
# module.patch_monitoring.data.ibm_container_cluster_config.clusterConfig will be read during apply
# (config refers to values not yet known)
<= data "ibm_container_cluster_config" "clusterConfig" {
+ admin_certificate = (sensitive value)
+ admin_key = (sensitive value)
+ ca_certificate = (sensitive value)
+ calico_config_file_path = (known after apply)
+ cluster_name_id = (known after apply)
+ config_dir = "/tmp"
+ config_file_path = (known after apply)
+ host = (known after apply)
+ id = (known after apply)
+ resource_group_id = "b6503f25836d49029966ab5be7fe50b5"
+ token = (sensitive value)
}
# module.patch_monitoring.data.ibm_container_cluster_config.clusterConfigRetry will be read during apply
# (config refers to values not yet known)
<= data "ibm_container_cluster_config" "clusterConfigRetry" {
+ admin_certificate = (sensitive value)
+ admin_key = (sensitive value)
+ ca_certificate = (sensitive value)
+ calico_config_file_path = (known after apply)
+ cluster_name_id = (known after apply)
+ config_dir = "/tmp"
+ config_file_path = (known after apply)
+ host = (known after apply)
+ id = (known after apply)
+ resource_group_id = "b6503f25836d49029966ab5be7fe50b5"
+ token = (sensitive value)
}
# module.patch_monitoring.data.ibm_container_vpc_cluster.cluster will be read during apply
# (config refers to values not yet known)
<= data "ibm_container_vpc_cluster" "cluster" {
+ albs = (known after apply)
+ api_key_id = (known after apply)
+ api_key_owner_email = (known after apply)
+ api_key_owner_name = (known after apply)
+ crn = (known after apply)
+ health = (known after apply)
+ id = (known after apply)
+ ingress_hostname = (known after apply)
+ ingress_secret = (sensitive value)
+ kube_version = (known after apply)
+ master_url = (known after apply)
+ name = (known after apply)
+ pod_subnet = (known after apply)
+ private_service_endpoint = (known after apply)
+ private_service_endpoint_url = (known after apply)
+ public_service_endpoint = (known after apply)
+ public_service_endpoint_url = (known after apply)
+ resource_controller_url = (known after apply)
+ resource_crn = (known after apply)
+ resource_group_id = "b6503f25836d49029966ab5be7fe50b5"
+ resource_group_name = (known after apply)
+ resource_name = (known after apply)
+ resource_status = (known after apply)
+ service_subnet = (known after apply)
+ state = (known after apply)
+ status = (known after apply)
+ tags = (known after apply)
+ worker_count = (known after apply)
+ worker_pools = (known after apply)
+ workers = (known after apply)
}
# module.patch_monitoring.null_resource.patch_sysdig will be created
+ resource "null_resource" "patch_sysdig" {
+ id = (known after apply)
}
# module.patch_monitoring.time_sleep.wait_1m will be created
+ resource "time_sleep" "wait_1m" {
+ create_duration = "1m"
+ id = (known after apply)
}
# module.vpc_ocp_cluster.ibm_container_vpc_cluster.cluster is tainted, so must be replaced
-/+ resource "ibm_container_vpc_cluster" "cluster" {
~ albs = [] -> (known after apply)
~ crn = "crn:v1:bluemix:public:containers-kubernetes:us-south:a/713c783d9a507a53135fe6793c37cc74:c81l7aid0uggi0nu0fh0::" -> (known after apply)
~ id = "c81l7aid0uggi0nu0fh0" -> (known after apply)
~ ingress_hostname = "storage00-cluster-e7f2ca73139645ddf61a8702003a483a-0000.us-south.containers.appdomain.cloud" -> (known after apply)
~ ingress_secret = (sensitive value)
~ kube_version = "4.6.48_openshift" -> "4.6_openshift"
~ master_status = "Ready" -> (known after apply)
~ master_url = "https://c106-e.us-south.containers.cloud.ibm.com:31703" -> (known after apply)
name = "storage00-cluster"
~ pod_subnet = "172.17.0.0/18" -> (known after apply)
~ private_service_endpoint_url = "https://c106.private.us-south.containers.cloud.ibm.com:31703" -> (known after apply)
~ public_service_endpoint_url = "https://c106-e.us-south.containers.cloud.ibm.com:31703" -> (known after apply)
~ resource_controller_url = "https://cloud.ibm.com/kubernetes/clusters" -> (known after apply)
~ resource_crn = "crn:v1:bluemix:public:containers-kubernetes:us-south:a/713c783d9a507a53135fe6793c37cc74:c81l7aid0uggi0nu0fh0::" -> (known after apply)
~ resource_group_name = "default" -> (known after apply)
~ resource_name = "storage00-cluster" -> (known after apply)
~ resource_status = "normal" -> (known after apply)
~ service_subnet = "172.21.0.0/16" -> (known after apply)
~ state = "normal" -> (known after apply)
~ tags = [
+ "cluster",
+ "secure-roks",
+ "storage00",
]
# (12 unchanged attributes hidden)
# (5 unchanged blocks hidden)
}
Plan: 5 to add, 0 to change, 1 to destroy.
module.vpc_ocp_cluster.ibm_container_vpc_cluster.cluster: Destroying... [id=c81l7aid0uggi0nu0fh0]
module.vpc_ocp_cluster.ibm_container_vpc_cluster.cluster: Still destroying... [id=c81l7aid0uggi0nu0fh0, 10s elapsed]
Note: there were no changes made outside of Terraform between the two runs.
Expected behavior: the ibm_container_vpc_cluster resource is created successfully; or, if the timeout is reached, an additional apply results in a refresh of the ibm_container_vpc_cluster, not a delete and an add.
Actual behavior: a delete and an add of ibm_container_vpc_cluster on the second apply.
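(A possible manual workaround, not a fix: since the cluster did eventually come up on its own, clearing the taint before the second apply should make Terraform refresh the resource instead of replacing it:

terraform untaint module.vpc_ocp_cluster.ibm_container_vpc_cluster.cluster

The underlying timeout/taint behaviour is still the bug being reported here.)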
You can find the full terraform configuration here: https://github.com/terraform-ibm-modules/terraform-ibm-cluster/tree/master/examples/secure-roks-cluster
This is the configuration that I used:
export TF_VAR_region="us-south"
export TF_VAR_resource_prefix=storage00
export TF_VAR_entitlement=cloud_pak
export TF_VAR_worker_nodes_per_zone=1
export TF_VAR_number_of_addresses=256
export TF_VAR_disable_public_service_endpoint=false
This is particularly bad since provisioning takes so long. It is causing days of delay and making the IBM product very difficult to use.
The second apply successfully deleted the tainted cluster and created a new one. Everything went well:
module.vpc_ocp_cluster.ibm_container_vpc_cluster.cluster: Still creating... [1h30m41s elapsed]
module.vpc_ocp_cluster.ibm_container_vpc_cluster.cluster: Creation complete after 1h30m43s [id=c81t1ned0a8lib39bvt0]
module.configure_cluster_logdna.ibm_ob_logging.logging: Creating...
module.configure_cluster_logdna.ibm_ob_logging.logging: Still creating... [10s elapsed]
module.configure_cluster_logdna.ibm_ob_logging.logging: Creation complete after 18s [id=c81t1ned0a8lib39bvt0/b620dd39-8c97-4315-bb1c-4986cae42a71]
module.configure_cluster_sysdig.ibm_ob_monitoring.sysdig: Creating...
module.configure_cluster_sysdig.ibm_ob_monitoring.sysdig: Still creating... [10s elapsed]
module.configure_cluster_sysdig.ibm_ob_monitoring.sysdig: Still creating... [20s elapsed]
module.configure_cluster_sysdig.ibm_ob_monitoring.sysdig: Creation complete after 29s [id=c81t1ned0a8lib39bvt0/67987421-09cf-492e-a5c7-d2126e845137]
module.patch_monitoring.data.ibm_container_vpc_cluster.cluster: Reading...
module.patch_monitoring.time_sleep.wait_1m: Creating...
module.patch_monitoring.data.ibm_container_vpc_cluster.cluster: Read complete after 10s [id=c81t1ned0a8lib39bvt0]
module.patch_monitoring.data.ibm_container_cluster_config.clusterConfig: Reading...
module.patch_monitoring.time_sleep.wait_1m: Still creating... [10s elapsed]
module.patch_monitoring.data.ibm_container_cluster_config.clusterConfig: Read complete after 7s [id=c81t1ned0a8lib39bvt0]
module.patch_monitoring.data.ibm_container_cluster_config.clusterConfigRetry: Reading...
module.patch_monitoring.time_sleep.wait_1m: Still creating... [21s elapsed]
module.patch_monitoring.data.ibm_container_cluster_config.clusterConfigRetry: Read complete after 5s [id=c81t1ned0a8lib39bvt0]
module.patch_monitoring.time_sleep.wait_1m: Still creating... [31s elapsed]
module.patch_monitoring.time_sleep.wait_1m: Still creating... [41s elapsed]
module.patch_monitoring.time_sleep.wait_1m: Still creating... [51s elapsed]
module.patch_monitoring.time_sleep.wait_1m: Creation complete after 1m1s [id=2022-02-09T16:06:33Z]
module.patch_monitoring.null_resource.patch_sysdig: Creating...
module.patch_monitoring.null_resource.patch_sysdig: Provisioning with 'local-exec'...
module.patch_monitoring.null_resource.patch_sysdig (local-exec): Executing: ["/bin/sh" "-c" " export KUBECONFIG=$KUBECONFIG\n kubectl -n ibm-observe set image ds/sysdig-agent sysdig-agent=icr.io/ext/sysdig/agent\n"]
module.patch_monitoring.null_resource.patch_sysdig: Creation complete after 5s [id=817628357169611121]
╷
│ Warning: Argument is deprecated
│
│ with module.kms.ibm_kms_key.key,
│ on .terraform/modules/kms/modules/key-protect/main.tf line 23, in resource "ibm_kms_key" "key":
│ 23: resource "ibm_kms_key" "key" {
│
│ Support for creating Policies with the key will soon be removed, Utilise the new resource for creating policies for the keys => ibm_kms_key_policies
╵
Apply complete! Resources: 5 added, 0 changed, 1 destroyed.
Please verify that, if the timeout occurs during creation, the current state of the resource is persisted so that a future "Refreshing state..." does not cause a delete/add. As you can see, the initial timeout was set to 3 hours and the second run took 1.5 hours.
Timeouts are going to be typical for this resource. The delete/add wastes a lot of time.
A provision of an IKS cluster took under 3 hours and all was well, so this is just a timeout issue; I am not sure if there is anything the Terraform provider can do about this.
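In the meantime, raising the create timeout on the cluster resource is one way to work around the slow provisioning, assuming the resource's timeouts can be set (the "timeout: 3h0m0s" in the error suggests they can). The snippet below is only a minimal sketch; the attribute values and references are illustrative placeholders, not the exact module configuration, which lives in the linked example: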
resource "ibm_container_vpc_cluster" "cluster" {