terraform-provider-rancher2
[BUG] Scaling up nodes on downstream RKE1 cluster causes cluster (intermittently) to "hang" indefinitely
Rancher Server Setup
- Rancher version: v2.6.12-rc1
- Installation option (Docker install/Helm Chart): HA Helm w/ RKE1 local and RKE v1.3.19
- Proxy/Cert Details: byo-valid
Information about the Cluster
- Kubernetes version: v1.24.10-rancher4-1
- Cluster Type (Local/Downstream): Downstream EC2 RKE1 w/ individual roles (1 etcd, 1 cp, 1 wkr, then scaled to 3 etcd, 2 cp, 3 wkr)
User Information
- What is the role of the user logged in? Admin
Provider Information
- What is the version of the Rancher v2 Terraform Provider in use? 2.0.0
- What is the version of Terraform in use? 0.13.7
Describe the bug
When provisioning a downstream EC2 RKE1 cluster w/ individual roles, the cluster provisions successfully. Attempting to then scale up the nodes sometimes results in the cluster hanging indefinitely. This is not seen via the Rancher UI (I was only able to encounter this when using the rancher2 provider).
To Reproduce
- Fresh install of rancher v2.6.12-rc1
- Using rancher2 TFP 2.0.0, provision a downstream EC2 RKE1 cluster, v1.24.10-rancher4-1, w/ 1 etcd, 1 cp, and 1 wkr
- Once active, scale up nodes (via TF) to 3 etcd, 2 cp, 3 wkr
- Reproduced
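For anyone trying to reproduce, the scale-up step above might look roughly like the sketch below. The actual configuration used is not shown in this issue, so the variable names, pool names, and hostname prefixes are all hypothetical; it assumes the cluster and an EC2 node template already exist and that each role has its own `rancher2_node_pool`.

```hcl
# Hypothetical inputs: IDs of an existing RKE1 cluster and EC2 node template.
variable "cluster_id" {
  type = string
}

variable "node_template_id" {
  type = string
}

# Initial counts match the starting topology: 1 etcd, 1 cp, 1 wkr.
variable "etcd_count" {
  default = 1
}

variable "cp_count" {
  default = 1
}

variable "worker_count" {
  default = 1
}

# One node pool per role ("individual roles").
resource "rancher2_node_pool" "etcd" {
  cluster_id       = var.cluster_id
  name             = "etcd-pool"
  hostname_prefix  = "etcd-"
  node_template_id = var.node_template_id
  quantity         = var.etcd_count
  etcd             = true
}

resource "rancher2_node_pool" "cp" {
  cluster_id       = var.cluster_id
  name             = "cp-pool"
  hostname_prefix  = "cp-"
  node_template_id = var.node_template_id
  quantity         = var.cp_count
  control_plane    = true
}

resource "rancher2_node_pool" "worker" {
  cluster_id       = var.cluster_id
  name             = "wkr-pool"
  hostname_prefix  = "wkr-"
  node_template_id = var.node_template_id
  quantity         = var.worker_count
  worker           = true
}
```

With a layout like this, the scale-up would be a second apply that only changes the quantities, e.g. `terraform apply -var etcd_count=3 -var cp_count=2 -var worker_count=3`.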
Actual Result
cluster hangs indefinitely, scale up never achieved
Expected Result
cluster expected to scale up nodes successfully
Screenshots
Cluster Management

Provisioning logs

Additional context
It's possible this affects RKE1 across multiple providers, but it was initially seen w/ EC2. I will attempt to reproduce w/ Linode and confirm shortly (in a comment below) whether that is affected as well.
issue seen w/ Linode as well - [not EC2 specific]
@Josh-Diamond Do you only see this when provisioning clusters via TF, or also via the UI?
I will work on reproducing this issue.