pulumi-aws
pulumi-aws copied to clipboard
Failed to create and update `controltower.LandingZone`
What happened?
When trying to create an AWS ControlTower LandingZone resource, Pulumi appears to be confused during the resource lifecycle.
On the 1st apply, if the properties manifestJson.centralizedLogging.configurations.accessLogginBucket.retentionDays
and manifestJson.centralizedLogging.configurations.logginBucket.retentionDays
are set as string
, the resource creation will fail and the AWS api returns an error ❌. However, if those properties are set at int
, then the resource is correctly created ✅.
On the 2nd and following updates, if those properties remain as int
(no changes, from the previous successful step) in the code, Pulumi indicates an update on the resource is required and the values should be updated to string
(see screenshot below) ❌.
If the values are switched to string
in the code, then Pulumi doesn't show any changes ❌ because it seems it has recorded the values as string in its state (see state below).
If the values are changed to string
in the code and other modifications are made to the resource, then Pulumi sees changes ✅ but the resource fails to update like it does during the 1st apply ❌ with an API error from AWS.
It appears there's some implicit conversions happening while processing the manifestJson
and how it is stored in the stack state.
Example
source code
import json
import pulumi
import pulumi_aws as aws
root_ou_id = 'something'
log_archive_account = aws.organizations.Account(
'log-archive',
name = 'Audit',
email = '[email protected]',
parent_id = root_ou_id
)
audit_account = aws.organizations.Account(
'audit',
name = 'Audit',
email = '[email protected]',
parent_id = root_ou_id
)
kms_key = aws.kms.Key(
'landing-zone-key',
description = 'A key for a landing zone',
enable_key_rotation = True,
deletion_window_in_days = 30
)
key_policy = pulumi.Output.all(
region=aws.get_region(),
key_id=kms_key.id,
account=aws.get_caller_identity().account_id,
).apply(lambda args: json.dumps(
{
"Version": "2012-10-17",
"Id": "control-tower-landing-zone",
"Statement": [
{
"Sid": "Enable IAM User Permissions",
"Effect": "Allow",
"Principal": {
"AWS": f"arn:aws:iam::{args['account']}:root"
},
"Action": "kms:*",
"Resource": "*"
},
{
"Sid": "Allow Config to use KMS for encryption",
"Effect": "Allow",
"Principal": {
"Service": "config.amazonaws.com"
},
"Action": [
"kms:Decrypt",
"kms:GenerateDataKey"
],
"Resource": f"arn:aws:kms:{args['region']}:{args['account']}:key/{args['key_id']}"
},
{
"Sid": "Allow CloudTrail to use KMS for encryption",
"Effect": "Allow",
"Principal": {
"Service": "cloudtrail.amazonaws.com"
},
"Action": [
"kms:GenerateDataKey*",
"kms:Decrypt"
],
"Resource": f"arn:aws:kms:{args['region']}:{args['account']}:key/{args['key_id']}",
"Condition": {
"StringEquals": {
"aws:SourceArn": f"arn:aws:cloudtrail:{args['region']}:{args['account']}:trail/aws-controltower-BaselineCloudTrail"
},
"StringLike": {
"kms:EncryptionContext:aws:cloudtrail:arn": f"arn:aws:cloudtrail:*:{args['account']}:trail/*"
}
}
}
]
}
))
aws.kms.KeyPolicy(
'landing-zone-key-policy',
key_id = kms_key.id,
policy = key_policy,
opts = pulumi.ResourceOptions(
parent = kms_key
)
)
manifest_json = pulumi.Output.all(
kms_key= kms_key.arn,
log_archive_account=log_archive_account.id,
audit_account=audit_account.id,
region=aws.get_region()
).apply(lambda args: json.dumps(
{
"governedRegions": [f"{args['region']}"],
"organizationStructure": {
"security": {
"name": "Security"
}
},
"centralizedLogging": {
"accountId": f"{args['log_archive_account']}",
"configurations": {
"accessLoggingBucket": {
"retentionDays": 3650
},
"kmsKeyArn": f"{args['kms_key']}",
"loggingBucket": {
"retentionDays": 365
}
},
"enabled": True
},
"securityRoles": {
"accountId": f"{args['audit_account']}"
},
"accessManagement": {
"enabled": True
}
}
))
aws.controltower.LandingZone(
'landing_zone',
manifest_json = manifest_json,
version = '3.3',
opts = pulumi.ResourceOptions(
depends_on = [key_policy, log_archive_account, audit_account]
)
)
stack state
pulumi diff
Output of pulumi about
pulumi/aws
6.24.0
Additional context
Note, creating such resource is difficult in nature as the requirements are
-
root
access to the AWS account - multiple AWS accounts are required
- properly setup AWS Organizations
Contributing
Vote on this issue by adding a 👍 reaction. To contribute a fix for this issue, leave a comment (and link to your pull request, if you've opened one already).
Possibly related to upstream issue https://github.com/hashicorp/terraform-provider-aws/issues/35763
Unfortunately I could not run a full repro since our sandbox account does not support aws.organizations.Account but I
could run a partial repro based on the provided statefile and program, trying out pulumi diff
.
The provider does not diff or modify retentionDays. If the program and state match (either strings, or numbers), there is no diff.
This is because both providers model manifest as strings.
There is some normalization going on in the provider but it simply asserts that JSON whitespace is irrelevant:
https://github.com/hashicorp/terraform-provider-aws/blob/master/internal/service/controltower/landing_zone.go#L78
What then happens is that this data is passed down to the AWS SDK for Go which uses the Smithy IDL and client generator to talk to the service. Successfully created resources are normalized via read and pulling the manifest from the cloud:
https://github.com/hashicorp/terraform-provider-aws/blob/master/internal/service/controltower/landing_zone.go#L132
It appears that either the Smithy IDL code or the service itself normalizes these values.
I could not locate the Smithy definition for Control Tower.
The recommended way to specify the retentionDays property is using numbers:
https://docs.aws.amazon.com/controltower/latest/userguide/lz-api-launch.html
I believe that if you specify retention days as numbers, Pulumi will work correctly and no further action is required.
We could attempt some Pulumi-level normalization or validation here but it would be preferable if this was done upstream at either the Terraform provider or even better at the AWS SDK for Go layer since they have access and are managing the original Smithy manifest for the data here.
Checking with @aureq this is actually worse for the customer as they're sending numbers in and it still converts to string. We definitely should fix this. Will look into getting creds for an e2e repro.