tfmigrate
"compute a new state" takes a very long time
Planning a state migration with ~150 operations (about 50% import and 50% rm) takes a very long time (~2 hours in a 4 vCPU / 16 GiB container). A bash script running the same operations takes about 25 minutes, and that's a very naive script which locks and unlocks remote state for every operation! I would expect tfmigrate to be much faster since it works on a local copy of the state.
I set TF_CLI_ARGS_plan='-refresh=false -parallelism=32' but it didn't seem to help.
Here's the log of a run which ended up failing after about 2 hours because the provider documentation about the import ID was wrong. 🤦🏻
exit status 1: running "cd $(git rev-parse --show-toplevel) && TF_CLI_ARGS_plan='-refresh=false -parallelism=32' tfmigrate plan && touch $PLANFILE" in "/var/lib/atlantis/repos/REDACTED/default/tfmigrate":
2022/03/01 01:31:19 [INFO] Attempting to use session-derived credentials
2022/03/01 01:31:19 [INFO] Successfully derived credentials from session
2022/03/01 01:31:19 [INFO] AWS Auth provider used: "CredentialsEndpointProvider"
2022/03/01 01:31:20 [INFO] [runner] unapplied migration files: [redacted.hcl]
2022/03/01 01:31:20 [INFO] [runner] load migration file: tfmigrate/redacted.hcl
2022/03/01 01:31:20 [INFO] [migrator] start state migrator plan
2022/03/01 01:31:20 [INFO] [migrator@.] terraform version: 1.1.6
2022/03/01 01:31:20 [INFO] [migrator@.] initialize work dir
2022/03/01 01:32:50 [INFO] [migrator@.] get the current remote state
2022/03/01 01:33:12 [INFO] [migrator@.] override backend to local
2022/03/01 01:33:12 [INFO] [executor@.] create an override file
2022/03/01 01:33:12 [INFO] [migrator@.] creating local workspace folder in: terraform.tfstate.d/default
2022/03/01 01:33:12 [INFO] [executor@.] switch backend to local
2022/03/01 01:33:18 [INFO] [migrator@.] compute a new state
2022/03/01 03:31:53 [INFO] [migrator@.] check diffs
2022/03/01 03:35:13 [INFO] [executor@.] remove the override file
2022/03/01 03:35:13 [INFO] [executor@.] remove the workspace state folder
2022/03/01 03:35:13 [INFO] [executor@.] switch back to remote
Timestamps of the slow part emphasised:
2022/03/01 01:33:18 [INFO] [migrator@.] compute a new state
2022/03/01 03:31:53 [INFO] [migrator@.] check diffs
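For anyone who wants to share comparable numbers, the time spent in each step can be read off the log by subtracting consecutive timestamps. Below is a minimal sketch in Python, assuming every log line starts with a `YYYY/MM/DD HH:MM:SS` timestamp as in the log above. The helper name `step_durations` is made up for illustration and is not part of tfmigrate.

```python
from datetime import datetime

# A few lines copied from the log above.
LOG = """\
2022/03/01 01:33:12 [INFO] [executor@.] switch backend to local
2022/03/01 01:33:18 [INFO] [migrator@.] compute a new state
2022/03/01 03:31:53 [INFO] [migrator@.] check diffs
2022/03/01 03:35:13 [INFO] [executor@.] remove the override file
"""


def step_durations(log_text):
    """Return (message, timedelta) pairs: time from each log line to the next."""
    entries = []
    for line in log_text.splitlines():
        try:
            # The timestamp occupies the first 19 characters of the line.
            ts = datetime.strptime(line[:19], "%Y/%m/%d %H:%M:%S")
        except ValueError:
            continue  # skip lines without a leading timestamp
        entries.append((ts, line[19:].strip()))
    # Pair each entry with its successor; the last line has no duration.
    return [(msg, nxt - ts) for (ts, msg), (nxt, _) in zip(entries, entries[1:])]


if __name__ == "__main__":
    # Print the slowest steps first.
    for msg, delta in sorted(step_durations(LOG), key=lambda p: p[1], reverse=True):
        print(f"{delta}  {msg}")
```

For the excerpt above this attributes 1:58:35 to "compute a new state" and 0:03:20 to "check diffs", so nearly all of the run is spent computing the new state.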
I am currently running a plan with TFMIGRATE_LOG=DEBUG and will update the ticket when it completes in a few hours.
Hi @jbg, thank you for reporting this.
I wasn't aware of a performance issue because most of my typical use cases have fewer than 10 operations. At the same time, I'm also aware of the breaking changes in AWS provider v4, and I expect to need hundreds of imports in the near future. If I run into the performance issue too, I'll investigate further.
For those who are already facing performance issues, sharing your benchmark will help with debugging.