terraform-provider-aws
Keep LATEST aws_ecs_task_definition container_definition image revision
Community Note
- Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
- Please do not leave "+1" or other comments that do not add relevant new information or questions, they generate extra noise for issue followers and do not help prioritize the request
- If you are interested in working on this issue or have submitted a pull request, please leave a comment
Description
I'd like to keep a reference to the latest image in use for the task definition (revision) when changing other container definitions. My deployed images on ECR use a git commit SHA as their tag (like image-name:72937423940das44). A single image is deployed on a service for staging and, after approval, to the production ECS service.
The issue I'm facing is that, when I make changes to the infrastructure, it loses the reference to the current deployed image revision, so I have to change the infrastructure, then re-run the latest deployment on CI to update to the latest image revision.
I did not find any way to get the current image revision, keep it in the "container_definitions -> image" field, and apply the change only to the other fields.
If there was a datasource that could retrieve the latest revision from the current task definition I could manually check for it on the image field and use the default ECR url otherwise.
I've tried with the aws_ecs_task_definition datasource, but it only outputs the revision, and the aws_ecs_container_definition datasource requires the id of the task. I tried other workarounds to set the image to the one currently in use, but they end in a circular dependency.
New or Affected Resource(s)
- aws_ecs_task_definition
- aws_ecs_service
Potential Terraform Configuration
# Copy-paste your Terraform configurations here - for large Terraform configs,
# please use a service like Dropbox and share a link to the ZIP file. For
# security, you can also encrypt the files using our GPG public key.
resource "aws_ecs_task_definition" "ecs_task" {
family = "${var.application_name}-${var.environment}"
requires_compatibilities = ["FARGATE"]
network_mode = "awsvpc"
memory = var.ecs_task_memory
cpu = var.ecs_task_cpu
execution_role_arn = aws_iam_role.ecs_task_execution_role.arn
task_role_arn = var.task_role_arn
tags = local.tags
depends_on = [
aws_cloudwatch_log_group.log_group
]
container_definitions = jsonencode(
[
{
name = "${var.application_name}-${var.environment}"
image = data.aws_ecr_repository.ecr_repo.repository_url
essential = true,
portMappings = [
{
containerPort = var.ecs_task_alb_port
hostPort = var.ecs_task_host_port
}
]
environment = var.environment_variables
logConfiguration = {
logDriver = "awslogs"
options = {
awslogs-group : aws_cloudwatch_log_group.log_group.name
awslogs-region : var.region
awslogs-stream-prefix : "${var.application_name_short}"
}
}
}
])
}
References
- #0000
there's an old issue for this: https://github.com/hashicorp/terraform-provider-aws/issues/632
the solution could be:
# get image name from the current (previous) definition
data "aws_ecs_task_definition" "previous" {
count = var.first_run ? 0 : 1
task_definition = "${var.application_name}-${var.environment}"
}
data "aws_ecs_container_definition" "previous" {
count = var.first_run ? 0 : 1
task_definition = data.aws_ecs_task_definition.previous[0].family
container_name = "${var.application_name}-${var.environment}"
}
...snip..
container_definitions = [
{
image = var.first_run ? var.task_definition_image : data.aws_ecs_container_definition.previous[0].image
using first_run to prevent issues when the previous image does not exist
We're using an SSM parameter for this. The resource sets up an initial value (eg. latest) when created, but value changes to this parameter are then ignored. Our CICD pipeline updates the SSM parameter with the deployed image as part of the deployment process. Finally we have an additional data source on this same SSM parameter that is used during the generation of the container definitions to insert back the appropriate latest-deployed image tag.
This is a great solution. thank you.
I'm trying to set up an ECS deployment pipeline and running into similar issues. IIUC the workarounds so far, and from #632, with the exception of @WhyNotHugo's template idea, all would still create a diff in the task definition, because resource aws_ecs_task_definition tracks a specific revision. If an external tool (the CI/CD pipeline) creates an identical task definition with a new image, and terraform is able to fetch this new image using an external source (SSM, external tool, data source lookup), the image now differs on the task definition resource that is tracking an old revision.
The change from @GerardSoleCa in #30154 feels like a really great solution to this. It looks like you'd be able to do something like this:
locals {
task_definition_family = var.service_name
container_name = var.service_name
is_lookup = var.image_tag == null
}
// If a var.image_tag is passed in, use it for first-run
// If a var.image_tag is not passed in, look up the deployed container definition
data "aws_ecs_service" "service" {
count = local.is_lookup ? 1 : 0
cluster_arn = var.cluster_arn
service_name = var.service_name
}
data "aws_ecs_container_definition" "container" {
count = local.is_lookup ? 1 : 0
task_definition = data.aws_ecs_service.service[0].task_definition
container_name = local.container_name
}
locals {
deployed_image_tag = local.is_lookup ? try(split(":", data.aws_ecs_container_definition.container[0].image)[1], null) : null
wanted_image_tag = coalesce(var.image_tag, local.deployed_image_tag)
// Neither the resource nor the data source below uses count, so no [0] index is needed
max_task_def_revision = max(aws_ecs_task_definition.ecs_task.revision, data.aws_ecs_task_definition.ecs_task.revision)
}
// Get the existing task revision, need to depend on the task.
data "aws_ecs_task_definition" "ecs_task" {
task_definition = aws_ecs_task_definition.ecs_task.arn_without_revision
depends_on = [
aws_ecs_task_definition.ecs_task
]
}
resource "aws_ecs_task_definition" "ecs_task" {
family = "${var.application_name}-${var.environment}"
track_latest = true // new from the above PR
// ... omitted for brevity ...
container_definitions = jsonencode(
[
{
name = local.container_name
image = "${var.image_repo}:${local.wanted_image_tag}" // falls back to the deployed tag when var.image_tag is null
}
]
)
}
resource "aws_ecs_service" "ecs_service" {
name = var.service_name
cluster = var.cluster_arn
task_definition = "${local.task_definition_family}:${local.max_task_def_revision}"
}
This provides a path for creating the service on first run. Then on subsequent external CI changes to the image_tag, terraform would be able to pick up the image. Since the task definition is now tracking LATEST, instead of the original revision it created, terraform would not detect a difference. If changes were made to environment variables or other terraform code, terraform would be able to fetch the deployed image.
Without #30154, CI deploys would cause Terraform to detect a diff, and create a new task definition that matches the currently active one. It doesn't cause issues to the service but would trigger a deploy and produce a long diff.
Use the aws_ecs_task_definition data source to re-construct the task definition ARN like this
But you'll need to synchronize the changes made by the two parties (Terraform and whatever else is making task def revisions) - usually this is the image tag, which you can store in an SSM parameter. So if you deploy your service like this, Terraform creates a new task def at rev 1
Then, your CI process kicks off and creates a new image version, updates the SSM parameter, then creates a new task def revision and deploys the changes.
If you re-run Terraform, without task def changes - no changes are detected for the task def since it's pulling the latest task def arn from the data source and parsing trick
If you re-run Terraform, with changes to the task definition - it will create a new task def revision but USING the image tag set in the SSM parameter
It's not ideal, but it works. One other fault - you'll need to supply a placeholder image on initial deploy since your pipeline may not have created an image yet (i.e. - no value in the SSM parameter on first deployment)
Oh hey @bryantbiggs, I was actually just opening an issue in https://github.com/terraform-aws-modules/terraform-aws-ecs to ask you about this. I've read your design doc multiple times now 😅
I don't quite understand this part:
If you re-run Terraform, without task def changes - no changes are detected for the task def since it's pulling the latest task def arn from the data source and parsing trick
and similarly from your doc
As an alternative, this module does provide a workaround that would support an external party making changes to something like the image of the container definition. In a scenario where the service, task definition, and container definition are all managed by Terraform, the following configuration could be used to allow an external party to change the image of the container definition without conflicting with Terraform, provided that the external party is also updating the image tag in a shared location that can be retrieved by Terraform (here, we are using SSM Parameter Store)
I'm parsing the image by fetching the actively deployed task definition from data "aws_ecs_container_definition" "container" instead of SSM, but it should be effectively the same.
IIUC, with the SSM approach, the SSM parameter is updated on deploy. Since your container definition is constructed using this SSM parameter, re-running terraform would construct a new container definition with the updated SSM parameter. This would cause resource "aws_ecs_task_definition" "this" (https://github.com/terraform-aws-modules/terraform-aws-ecs/blob/dee59b733b805f9c16495bf65cc193260f537e47/modules/service/main.tf#LL608C1-L608C1) to have changes that need to be applied. It doesn't matter that the reconstructed task def matches the active one. Terraform still attempts to create a new task definition.
Or a hypothetical sequence of events:
- Initial bootstrap with a placeholder image, let's say image_tag is tag0, SSM parameter is set to tag0, and the task definition revision is 1
- CI system deploys tag1, sets the SSM parameter to tag1, task definition revision is 2
- Re-run Terraform. resource "aws_ecs_task_definition" "this" is still pointing to revision 1 with tag0, and terraform reconstructs the container definition using the image tag set in the SSM parameter. It will try to create a new task definition with image tag tag1 and rev 3.
Am I misunderstanding something here?
Hi all,
There is one important thing to state @bryantbiggs: the created resource always points to the fixed task definition revision that was created with that resource at creation time. If you don't remove the task definitions in your process, you might be fine, but you will always compare against the same old revision.
In case you remove old revisions (sometimes you don't want to keep old stuff there), the resource in Terraform points to a non-existing one, so it will always create a new one, because the task definition no longer exists.
A second issue I see with not using the latest one is that you check only against container changes, for example, but the revision that the resource 'considers/handles' is much older than the task definition you have running in production.
And finally, with the module you are showing, don't you have the chicken-and-egg issue when starting from 0? I see some kind of circular dependency between the data and the resource, but maybe I'm missing something.
I'll try to put a couple of examples:
A. Deleting old Task Definitions
- We create the task definition from Terraform. State points to revision 1
- We update (create a new) task definition with an updated docker image (creates revision 2) and we delete (set to inactive) revision 1.
- On Terraform plan/apply we will see that the state reference is no longer resolved and plans a new resource creation. Revision 3 would be created
B. Keeping Task Definitions
- We create the task definition from Terraform. State points to revision 1
- We update (create a new) task definition with an updated docker image (creates revision 2) and we keep revision 1
- On Terraform plan/apply no changes will be seen, but not because it checks the latest revision, changes are not seen because we are comparing against revision 1 that already exists.
- We update (create a new) task definition with an updated docker image (creates revision 3 from revision 2) and we keep revision 1 and 2
- On Terraform plan/apply we change one ENV variable, the new Task Definition will be based on revision 1, not revision 2 or 3
And the worst case scenario is that on step 5, we would be updating with an old docker image if we are not properly synchronising the docker image between step 4 and step 5.
So, in any case, the approach I'm trying to cover in this pull request would solve both issues. We could always track the latest active Task Definition.
Cheers!
👋 @GerardSoleCa We're excited to see #30154 and appreciate you making the change! Is there anything preventing it from being merged?
Still waiting for someone from Hashicorp to jump in and review the code, and maybe also to help or guide me through writing the unit tests for that part.
If you can upvote the PR, I'd appreciate it!!
PR has just been approved ! lol
This functionality has been released in v5.37.0 of the Terraform AWS Provider.
Hi @GerardSoleCa,
Have you tried using the 5.37 AWS provider version to track outside changes?
It looks like the behavior does not fully match what was expected, namely tracking the latest revision and not triggering a re-deploy by Terraform when, for example, the application CI/CD updates the Docker image tag.
Could you please take a look at the example https://github.com/terraform-aws-modules/terraform-aws-ecs/pull/171#issuecomment-1954488652? Maybe you will help with advice on how to use the introduced changes in https://github.com/hashicorp/terraform-provider-aws/pull/30154 correctly. Thanks in advance!
This introduced a bug in the state mv command. I was on provider 5.36, and attempted to execute a resource rename. Here is what happened:
% terraform state mv module.atlas_cloud_staging.aws_lb.default 'module.atlas_cloud_staging.module.atlas_api_alb.aws_lb.this[0]'
Acquiring state lock. This may take a few moments...
Move "module.atlas_cloud_staging.aws_lb.default" to "module.atlas_cloud_staging.module.atlas_api_alb.aws_lb.this[0]"
Error saving the state: unsupported attribute "track_latest"
The state was not saved. No items were removed from the persisted
state. No backup was created since no modification occurred. Please
resolve the issue above and try again.
Seems like since I was renaming an ALB that it should not have stopped the move, especially since I was not on provider 5.37.
Is it possible that someone else with access to your state is using 5.37 and updated it? Do you commit your .terraform.lock.hcl file?
% grep hashicorp/aws -A1 .terraform.lock.hcl
provider "registry.terraform.io/hashicorp/aws" {
version = "5.37.0"
% tf init
Initializing modules...
Initializing the backend...
Initializing provider plugins...
- terraform.io/builtin/terraform is built in to Terraform
- Reusing previous version of hashicorp/aws from the dependency lock file
- Using previously-installed hashicorp/aws v5.37.0
nope, I was the only one operating on that repo at that time.
In my opinion, a separate Terraform resource like aws_ecs_container_definition, with the possibility to ignore changes to only the image, would be an option for tracking a container definition that is updated outside of Terraform by CI/CD while ignoring changes to, for example, the Docker image tag. Tolerating such changes in the aws_ecs_task_definition Terraform resource is not possible, because the container definition is a single JSON attribute.
An appropriate issue - https://github.com/hashicorp/terraform-provider-aws/issues/17988
Or use a hack with the aws_ecs_task_definition Terraform data source, as in https://github.com/terraform-aws-modules/terraform-aws-ecs/blob/45f532c06488d84f140af36241d164facb5e05f5/modules/service/main.tf#L593-L609
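The limitation described above can be illustrated with a minimal sketch. The names here (`example`, `app`, `var.image`) are hypothetical and not taken from the thread. Because `container_definitions` is one JSON-encoded string, `ignore_changes` can only skip it as a whole, not the `image` key inside it:

```hcl
variable "image" {
  type        = string
  description = "Hypothetical full image reference (repository URL plus tag)"
}

resource "aws_ecs_task_definition" "example" {
  family = "example" # hypothetical family name

  # The whole list of container definitions is a single string attribute.
  container_definitions = jsonencode([
    {
      name      = "app"
      image     = var.image
      essential = true
      memory    = 256
    }
  ])

  lifecycle {
    # ignore_changes operates on attributes, so this ignores every change to
    # the JSON (ports, environment, memory, ...), not only the image.
    ignore_changes = [container_definitions]
  }
}
```

With this in place, Terraform would also stop applying legitimate changes to the other container settings, which is why it does not work as a way to ignore only the image.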
I think the recently merged #30154 does solve the problem. Here's how it works for me:
track_latest = true
image = data.aws_ecs_container_definition.this.image
Terraform is now smart enough to compare the resource to the latest task revision, even if it was deployed outside terraform, i.e. even if it is not tracked by terraform state yet.
So, I think this issue can be closed. 🤷♀️ @dtiziani - did you have a chance to try it out? 🙏
Btw, for curiosity, another workaround I was employing for this was to run:
terraform apply -refresh-only
Terraform was smart enough to only update the task definition revision in its state, and so a subsequent terraform apply displayed zero diff. 👌
To make it work on our side we needed to change 2 things:
resource "aws_ecs_task_definition" "task_definition" {
track_latest = true
}
and also to use the aws_ecr_repository data source to select the latest tag pushed:
data "aws_ecr_repository" "ecr" {
name = "ecr-name"
registry_id = "registry-id-if-needed"
}
locals {
image_tag = coalesce(setsubtract(data.aws_ecr_repository.ecr.most_recent_image_tags, ["latest"])...)
}
and use the image_tag in the task definition content. This way, Terraform can verify that the latest task definition has the same tag as the latest one pushed to the registry.
Finally I had some time to add this to my configs. It's working, although in some cases you might still need to synchronise the image tag, or whatever else changes outside Terraform. There are people using a data source for that (it does not work for me, because of the chicken-and-egg issue). I use a script to sync the image between the local config and the one deployed remotely.
But that is all I need; the rest is working properly in my case. No more recreations, we only see the updates in place. Also, diffs are somewhat better.
pls show working code example for aws_ecs_service and aws_ecs_task_definition[container_definitions] with track_latest = true. Thanks
@alexgoddity here is an example based on the sample aws_ecs_task_definition from the provider docs:
locals {
registry_id = "1234567890"
# Most recent image is pushed with 2 tags: `latest` and the `git-sha1` value, and we want to use the `git-sha1` to be explicit
image = "${data.aws_ecr_repository.ecr.repository_url}:${coalesce(setsubtract(data.aws_ecr_repository.ecr.most_recent_image_tags, ["latest"])...)}"
# If there's only a single tag pushed, it can be simpler
# image = "${data.aws_ecr_repository.ecr.repository_url}:${one(data.aws_ecr_repository.ecr.most_recent_image_tags)}"
}
data "aws_ecr_repository" "ecr" {
name = "ecr-name"
registry_id = local.registry_id
}
resource "aws_ecs_task_definition" "service" {
family = "service"
track_latest = true
container_definitions = jsonencode([
{
name = "first"
image = local.image
cpu = 10
memory = 512
essential = true
portMappings = [
{
containerPort = 80
hostPort = 80
}
]
},
{
name = "second"
image = "service-second"
cpu = 10
memory = 256
essential = true
portMappings = [
{
containerPort = 443
hostPort = 443
}
]
}
])
volume {
name = "service-storage"
host_path = "/ecs/service-storage"
}
placement_constraints {
type = "memberOf"
expression = "attribute:ecs.availability-zone in [us-west-2a, us-west-2b]"
}
}
Thanks. I am also looking for a solution to manage tasks from multiple sources, with Terraform and a CI pipeline (aws cli) each creating new task definitions based on the latest one, not replacing the existing one.
We are using https://github.com/silinternational/ecs-deploy in the CI which creates a new task on CI iterations based on the last one.
I managed to solve the same problem with the help of track_latest = true and an SSM parameter (hashicorp/aws v5.37.0+):
resource "aws_ssm_parameter" "image_tag" {
name = "image-tag-name"
type = "String"
value = "latest"
lifecycle {
ignore_changes = [value]
}
}
data "aws_ssm_parameter" "image_tag" {
name = "image-tag-name"
depends_on = [
aws_ssm_parameter.image_tag
]
}
data "aws_ecr_repository" "ecr" {
name = "ecr-name"
}
resource "aws_ecs_task_definition" "app_task" {
family = "task-family"
track_latest = true
container_definitions = <<DEFINITION
[
{
"name": "app-name",
"image": "${data.aws_ecr_repository.ecr.repository_url}:${data.aws_ssm_parameter.image_tag.value}"
}
]
DEFINITION
...
}
To deploy the application, I leverage GitHub Actions to update both the task definition (image) and the SSM parameter (image_tag).
I don't understand how track_latest is helping.
I deployed via terraform, and then altered my image name outside of terraform. I assumed track_latest = true would hide the diff and terraform would not want to replace my image with the one that terraform has in its code - but it did show a diff and did want to replace it.
So it made no difference. Am I misunderstanding how this is supposed to work?
What track_latest does is fetch, at plan time, the latest task definition found in AWS, instead of using the revision recorded in the tfstate.
Terraform then compares that latest revision in AWS with what you have as code in your TF.
So it won't auto-update the image for you; that is one thing you need to do yourself. Some of us fetch the image with a data source or a script, others fetch the Docker image tag from SSM parameters.
What we get with track_latest is that a new task definition is no longer created every time, only when there are updates. But it's up to you to sync the Docker image tag.
Cheers!
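As a rough sketch of the two halves described above (track_latest for the comparison, plus your own image sync), something along these lines could work. The variable names, the family `example` and the container name `app` are hypothetical, and the lookup assumes the task definition family already exists in AWS, so on a first run `var.image_tag` has to be passed explicitly:

```hcl
variable "image_repo" {
  type        = string
  description = "Hypothetical repository URL, without a tag"
}

variable "image_tag" {
  type        = string
  default     = null
  description = "Set on first run; leave null to reuse the deployed image"
}

locals {
  family         = "example" # hypothetical
  container_name = "app"     # hypothetical
}

# A family name without a revision resolves to the latest ACTIVE revision,
# including revisions registered outside Terraform (for example by CI).
data "aws_ecs_container_definition" "current" {
  count           = var.image_tag == null ? 1 : 0
  task_definition = local.family
  container_name  = local.container_name
}

locals {
  # Use the explicit tag if given, otherwise the currently deployed image.
  image = var.image_tag != null ? "${var.image_repo}:${var.image_tag}" : one(data.aws_ecs_container_definition.current[*].image)
}

resource "aws_ecs_task_definition" "example" {
  family       = local.family
  track_latest = true # available from hashicorp/aws v5.37.0

  container_definitions = jsonencode([
    {
      name      = local.container_name
      image     = local.image
      essential = true
      memory    = 256
    }
  ])
}
```

When nothing but the image has changed outside Terraform, the configuration then matches the latest revision and the plan should be empty. When you change something else, such as an environment variable, the new revision is built with the currently deployed image.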
With track_latest = false I can create a new version of the task definition outside of terraform, and there is no diff on the aws_ecs_task_definition resource.
However, I do get a diff on aws_ecs_service, where I use:
task_definition = aws_ecs_task_definition.ecs_task_definition[each.key].arn_without_revision
but that doesn't actually matter as such. It is annoying but it doesn't alter anything in AWS when I apply.
If I instead specify task_definition = aws_ecs_task_definition.ecs_task_definition[each.key].arn then it sets my service back to using the task that was last defined in terraform (i.e. not the task last defined outside of terraform).
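The two ways of pointing aws_ecs_service at a task definition can be sketched side by side. This is only an illustration with hypothetical names (`example`, `floating`, `pinned`, a placeholder `nginx:latest` image), and it assumes the second option in the comment above means the full `arn` attribute:

```hcl
variable "cluster_arn" {
  type        = string
  description = "Hypothetical ARN of an existing ECS cluster"
}

resource "aws_ecs_task_definition" "example" {
  family = "example"

  container_definitions = jsonencode([
    {
      name      = "app"
      image     = "nginx:latest" # placeholder image for illustration
      essential = true
      memory    = 256
    }
  ])
}

# No revision in the reference: ECS resolves the family to its latest ACTIVE
# revision, so a revision registered by CI is not rolled back. The comment
# above reports that this form shows a harmless diff on each plan.
resource "aws_ecs_service" "floating" {
  name            = "example-floating"
  cluster         = var.cluster_arn
  desired_count   = 1
  task_definition = aws_ecs_task_definition.example.arn_without_revision
}

# Full ARN with revision: the service is pinned to the revision Terraform
# last registered, so applying moves it back from a newer CI revision.
resource "aws_ecs_service" "pinned" {
  name            = "example-pinned"
  cluster         = var.cluster_arn
  desired_count   = 1
  task_definition = aws_ecs_task_definition.example.arn
}
```

In practice you would pick one of the two forms for a given service. Both are shown here only to contrast the behaviour.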