terraform-aws-rds-aurora icon indicating copy to clipboard operation
terraform-aws-rds-aurora copied to clipboard

feat: Add global_upgradable variable to support major engine version upgrades to global clusters.

Open theherk opened this issue 1 year ago β€’ 12 comments

Fix #425. A much more thorough explanation is provided in examples/global-cluster/README.md.

I recognize this adds complexity just to ignore engine_version, but ... I don't know another way around this issue.

Upgrading major version of global clusters

Upgrading the major version of global clusters is possible, but due to a limitation in terraform, it requires some special consideration. As documented in the provider:

When you upgrade the version of an aws_rds_global_cluster, Terraform will attempt to in-place upgrade the engine versions of all associated clusters. Since the aws_rds_cluster resource is being updated through the aws_rds_global_cluster, you are likely to get an error (Provider produced inconsistent final plan). To avoid this, use the lifecycle ignore_changes meta argument as shown below on the aws_rds_cluster.

In order to accomplish this in a module that is otherwise used for non-global clusters, we must duplicate the cluster resource. The limitation that requires this is, terraform lifecycle meta-arguments can contain only literal values:

The lifecycle settings all affect how Terraform constructs and traverses the dependency graph. As a result, only literal values can be used because the processing happens too early for arbitrary expression evaluation.

That means, that to ignore the engine_version in some cases but not in others, we need another resource. So, if you intend to upgrade your global cluster in the future, you must set the new variable global_upgradable to true.

Migrating the resource

If you already have a global cluster created with this module, and would like to make use of this feature, you'll need to move the cluster resource. That can be done with the cli:

terraform state mv 'module.this.aws_rds_cluster.this[0]' 'module.this.aws_rds_cluster.global_upgradable[0]'

Or via a new moved block:

moved {
  from = module.this.aws_rds_cluster.this[0]
  to   = module.this.aws_rds_cluster.global_upgradable[0]
}

After that, changing the major version should work without issue.

Breaking Changes

This is not a breaking change, but if people start using this variable without exercising caution they could wipe a database, but that is the nature of it all I suppose. It could possibly be considered a bug fix on the other hand.

How Has This Been Tested?

  • [βœ“] I have updated at least one of the examples/* to demonstrate and validate my change(s)
  • [βœ“] I have tested and validated these changes against several internal existing implementations.
  • [βœ“] I have executed pre-commit run -a on my pull request

theherk avatar Feb 01 '24 15:02 theherk

This PR has been automatically marked as stale because it has been open 30 days with no activity. Remove stale label or comment or this PR will be closed in 10 days

github-actions[bot] avatar Mar 03 '24 00:03 github-actions[bot]

My patch has been updated to resolve conflicts. That doesn't seem to remove the stale label, so I comment to notify.

theherk avatar Mar 03 '24 15:03 theherk

We are currently facing issues with this in a client environment. Hoping this gets merged soon. 🀞

jeff-french avatar Mar 04 '24 20:03 jeff-french

This PR has been automatically marked as stale because it has been open 30 days with no activity. Remove stale label or comment or this PR will be closed in 10 days

github-actions[bot] avatar Apr 04 '24 00:04 github-actions[bot]

Still awaiting review.

theherk avatar Apr 04 '24 05:04 theherk

This PR has been automatically marked as stale because it has been open 30 days with no activity. Remove stale label or comment or this PR will be closed in 10 days

github-actions[bot] avatar May 05 '24 00:05 github-actions[bot]

I'll be happy to resolve conflicts once any action happens here. Until then I'll just continue to keep this alive and maintain the fork.

theherk avatar May 10 '24 11:05 theherk

This PR has been automatically marked as stale because it has been open 30 days with no activity. Remove stale label or comment or this PR will be closed in 10 days

github-actions[bot] avatar Jun 10 '24 00:06 github-actions[bot]

Still waiting.

theherk avatar Jun 10 '24 05:06 theherk

Why don't you rely on allow_major_version_upgrade instead of creating yet another variable?

You still gonna have issue with aws_rds_cluster_instance if auto_minor_version_upgrade is true

alisson276 avatar Jun 14 '24 14:06 alisson276

@alisson276 I'm not sure how to be more clear that the description and in the updates in this README file. This does still rely on allow_major_version_upgrade. That alone is simply not enough when using this module, because of the way the AWS API upgrades global cluster. Since the underlying engine version is upgraded at the cluster level, the engine version needs to be ignored. If you have a better way to do so than I have done here, that would be excellent. But if you launch a global cluster and try to upgrade major versions, I think you'll run into this issue, and see that the state gets into an unresolvable condition.


Oh maybe you're suggesting keying off of that already existing variable to trigger the alternate resource, not just that using that variable alone would solve the issue. πŸ’‘ That is an interesting idea. I'll give it some thought.

theherk avatar Jun 14 '24 16:06 theherk

Exactly, I'm suggesting using this variable to be your count, like count = local.create && var. allow_major_version_upgrade ? 1 : 0

The same should be done for the aws_rds_cluster_instance if auto_minor_version_upgrade is set to true. You'd end up with 4 resources, but only two created.

The problem with this approach I'm proposing is it would create a breaking change for the one current using it with this set to true. I don't think this is a problem, because I can use a moved block and everything is fine, but it should be pointed in the module (bumping the major version).

alisson276 avatar Jun 17 '24 06:06 alisson276

This PR has been automatically marked as stale because it has been open 30 days with no activity. Remove stale label or comment or this PR will be closed in 10 days

github-actions[bot] avatar Jul 18 '24 00:07 github-actions[bot]

This PR was automatically closed because of stale in 10 days

github-actions[bot] avatar Jul 28 '24 00:07 github-actions[bot]

I'm going to lock this pull request because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues. If you have found a problem that seems related to this change, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

github-actions[bot] avatar Aug 27 '24 02:08 github-actions[bot]