[AWS::Glue::TableOptimizer] - [Docs] - Properties are completely incorrect
Name of the resource
AWS::Glue::TableOptimizer
Resource name
No response
Reference Link
https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-resource-glue-tableoptimizer.html
Details
- Type:
The type of table optimizer. Currently, the only valid value is compaction.
Required: Yes
Type: String
That is wrong. Three values are valid:
- compaction
- retention
- orphan_file_deletion
CLI Reference: https://docs.aws.amazon.com/cli/latest/reference/glue/create-table-optimizer.html
I've used all three values successfully and got the expected result each time.
- TableOptimizerConfiguration.RetentionConfiguration.IcebergConfiguration:
Location: String
OrphanFileRetentionPeriodInDays: Integer
That is not correct. The actual structure is:
SnapshotRetentionPeriodInDays: Integer
NumberOfSnapshotsToRetain: Integer
CleanExpiredFiles: Boolean
This is also based on the CLI reference, and I have tested it as well.
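As a rough illustration, the corrected structure above can be written down as a TypeScript type. This is only a sketch: the field names and types come from the list above, while the interface name `IcebergRetentionConfiguration`, the helper `makeIcebergRetentionConfiguration`, and the choice to treat all three fields as required are assumptions made for the example and are not part of any AWS library.

```typescript
// Field names and types as listed above.
// Treating all three fields as required is an assumption of this sketch.
interface IcebergRetentionConfiguration {
  SnapshotRetentionPeriodInDays: number; // Integer
  NumberOfSnapshotsToRetain: number; // Integer
  CleanExpiredFiles: boolean;
}

// Hypothetical helper: builds the configuration object and rejects
// non-integer values, since both numeric fields are listed as Integer.
function makeIcebergRetentionConfiguration(
  snapshotRetentionPeriodInDays: number,
  numberOfSnapshotsToRetain: number,
  cleanExpiredFiles: boolean,
): IcebergRetentionConfiguration {
  if (!Number.isInteger(snapshotRetentionPeriodInDays)) {
    throw new Error("SnapshotRetentionPeriodInDays must be an integer");
  }
  if (!Number.isInteger(numberOfSnapshotsToRetain)) {
    throw new Error("NumberOfSnapshotsToRetain must be an integer");
  }
  return {
    SnapshotRetentionPeriodInDays: snapshotRetentionPeriodInDays,
    NumberOfSnapshotsToRetain: numberOfSnapshotsToRetain,
    CleanExpiredFiles: cleanExpiredFiles,
  };
}

const exampleRetentionConfig = makeIcebergRetentionConfiguration(1, 1, true);
console.log(JSON.stringify(exampleRetentionConfig, null, 2));
```

The keys are kept in PascalCase so the resulting object can be used as-is wherever the raw CloudFormation property names are expected.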
- TableOptimizerConfiguration based on Type
  - If Type is compaction, you cannot specify either OrphanFileDeletionConfiguration or RetentionConfiguration
  - If Type is retention, you have to provide RetentionConfiguration
  - If Type is orphan_file_deletion, you have to provide OrphanFileDeletionConfiguration
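The three rules above could be checked before deployment with something like the following. It is a minimal sketch that encodes only what is stated in this issue; the names `OptimizerType`, `OptimizerConfigurationSketch`, and `validateOptimizerConfiguration` are made up for the example, and the service may enforce further constraints that are not covered here.

```typescript
// The three Type values reported as valid in this issue.
type OptimizerType = "compaction" | "retention" | "orphan_file_deletion";

// Hypothetical, simplified view of TableOptimizerConfiguration.
interface OptimizerConfigurationSketch {
  enabled: boolean;
  roleArn: string;
  retentionConfiguration?: object;
  orphanFileDeletionConfiguration?: object;
}

// Returns a list of rule violations; an empty list means the three
// rules described above are satisfied.
function validateOptimizerConfiguration(
  type: OptimizerType,
  config: OptimizerConfigurationSketch,
): string[] {
  const errors: string[] = [];
  const hasRetention = config.retentionConfiguration !== undefined;
  const hasOrphan = config.orphanFileDeletionConfiguration !== undefined;

  switch (type) {
    case "compaction":
      if (hasRetention) {
        errors.push("compaction: RetentionConfiguration must not be specified");
      }
      if (hasOrphan) {
        errors.push("compaction: OrphanFileDeletionConfiguration must not be specified");
      }
      break;
    case "retention":
      if (!hasRetention) {
        errors.push("retention: RetentionConfiguration is required");
      }
      break;
    case "orphan_file_deletion":
      if (!hasOrphan) {
        errors.push("orphan_file_deletion: OrphanFileDeletionConfiguration is required");
      }
      break;
  }
  return errors;
}

// Example: a retention optimizer without RetentionConfiguration is flagged.
console.log(
  validateOptimizerConfiguration("retention", {
    enabled: true,
    roleArn: "arn:aws:iam::123456789012:role/example-role", // placeholder ARN
  }),
);
```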
The CDK implementation of CFN resources is also poor due to incorrect properties, but I have a workaround for folks who want to automate TableOptimizer:
import { CfnTableOptimizer } from 'aws-cdk-lib/aws-glue';

const cfnTableOptimizerCompaction = new CfnTableOptimizer(testStack, 'TableOptimizerCompaction', {
  catalogId: catalogId, // essentially the AWS account ID
  databaseName: dbName,
  tableName: tableName,
  type: 'compaction',
  tableOptimizerConfiguration: {
    enabled: true,
    // correct permissions here: https://docs.aws.amazon.com/glue/latest/dg/optimization-prerequisites.html
    roleArn: compactionRoleArn
  }
});
const cfnTableOptimizerOrphanFileDeletion = new CfnTableOptimizer(testStack, 'TableOptimizerOrphanFileDeletion', {
  catalogId: catalogId,
  databaseName: dbName,
  tableName: tableName,
  type: 'orphan_file_deletion',
  tableOptimizerConfiguration: {
    orphanFileDeletionConfiguration: {
      icebergConfiguration: {
        // ex: 's3://some-bucket/table', can be found in Table Details in Glue Console
        location: tableLocation,
        orphanFileRetentionPeriodInDays: 1
      }
    },
    enabled: true,
    roleArn: orphanFileDeletionRoleArn
  }
});
const cfnTableOptimizerRetention = new CfnTableOptimizer(testStack, 'TableOptimizerRetention', {
  catalogId: catalogId,
  databaseName: dbName,
  tableName: tableName,
  type: 'retention',
  tableOptimizerConfiguration: {
    enabled: true,
    roleArn: retentionRoleArn
  }
});
// The typed props carry the incorrect retention properties, so set the correct ones with a raw override
cfnTableOptimizerRetention.addOverride('Properties.TableOptimizerConfiguration.RetentionConfiguration.IcebergConfiguration', {
  SnapshotRetentionPeriodInDays: 1,
  NumberOfSnapshotsToRetain: 1,
  CleanExpiredFiles: true
});
Both compaction and orphan_file_deletion are working, but when I try retention, the stack errors out with Internal Failure.
After synth, I see this in the CFN template:
"table1712362891retention": {
  "Type": "AWS::Glue::TableOptimizer",
  "Properties": {
    "CatalogId": "3xxxxxxxxxxx",
    "DatabaseName": "sagemaker_featurestore",
    "TableName": "table1712362891",
    "TableOptimizerConfiguration": {
      "Enabled": true,
      "RetentionConfiguration": {
        "IcebergConfiguration": {
          "SnapshotRetentionPeriodInDays": 2,
          "NumberOfSnapshotsToRetain": 1,
          "CleanExpiredFiles": true
        }
      },
      "RoleArn": "xxxxxxx",
    },
    "Type": "retention"
  },
  "Metadata": {
    "aws:cdk:path": "xxxxxxxxxx"
  }
},