(glue-alpha): cannot create 2 partitionIndexes simultaneously
Describe the bug
When passing 2 indexes to partitionIndexes of glue.Table, table creation fails.
Expected Behavior
Glue table and indexes are created.
Current Behavior
Creation of the table's partition indexes fails:
Index index2 is in CREATING state. Only 1 index can be created or deleted simultaneously per table.
Reproduction Steps
Create a glue table with 2 indexes.
const bucket = new s3.Bucket(stack, 'DataBucket');
const database = new glue.Database(stack, 'MyDatabase', {
databaseName: 'database',
});
const csvTable = new glue.Table(stack, 'CSVTable', {
database,
bucket,
tableName: 'csv_table',
columns: [
{ name: 'col1', type: glue.Schema.STRING },
{ name: 'col2', type: glue.Schema.STRING },
{ name: 'col3', type: glue.Schema.STRING },
],
partitionKeys: [
{ name: 'year', type: glue.Schema.SMALL_INT },
{ name: 'month', type: glue.Schema.BIG_INT },
],
partitionIndexes: [
{ indexName: 'index1', keyNames: ['month'] },
{ indexName: 'index2', keyNames: ['month', 'year'] },
],
dataFormat: glue.DataFormat.CSV,
});
It sometimes fails even if only one index is passed to partitionIndexes and the rest are added using table.addPartitionIndex.
const csvTable = new glue.Table(stack, 'CSVTable', {
database,
bucket,
tableName: 'csv_table',
columns: [
{ name: 'col1', type: glue.Schema.STRING },
{ name: 'col2', type: glue.Schema.STRING },
{ name: 'col3', type: glue.Schema.STRING },
],
partitionKeys: [
{ name: 'year', type: glue.Schema.SMALL_INT },
{ name: 'month', type: glue.Schema.BIG_INT },
],
partitionIndexes: [{ indexName: 'index1', keyNames: ['month'] }],
dataFormat: glue.DataFormat.CSV,
});
csvTable.addPartitionIndex({ indexName: 'index2', keyNames: ['month', 'year'] })
Possible Solution
I think this is a restriction of the Glue service.
Additional Information/Context
No response
CDK CLI Version
2.70.0
Framework Version
No response
Node.js Version
18
OS
macOS Ventura
Language
TypeScript
Language Version
No response
Other information
No response
Hi @clueleaf, thanks for reaching out.
It's stated in the available documentation that you can have a maximum of 3 partition indexes in the table. But it's also stated here:
`Partition indexes must be created one at a time. To avoid race conditions, we store the resource and add dependencies each time a new partition index is created.`
I am also getting the error while creating 2 indexes at the same time, but it succeeds when I add the partition index later on. Since there is a workaround, I am marking this as P2 for now, which means our team won't be able to work on it immediately. However, if you would like to contribute to resolving this bug, that would be great. Here is a contributing guide to get started.
We also use +1s to help prioritize our work, and are happy to re-evaluate this issue based on community feedback. You can reach out to the cdk.dev community on Slack to solicit support for re-prioritization.
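For reference, the one-at-a-time restriction can be illustrated with a toy model. This is only a sketch and not the Glue API: `ToyTable`, `startCreate`, `finishCreate`, `createOverlapping`, and `createSerialized` are hypothetical names made up for illustration. The only part taken from this issue is the wording of the error message.

```typescript
// Toy model (NOT the Glue API): a table that allows only one index to be
// in the CREATING state at any time.
type IndexState = 'CREATING' | 'ACTIVE';

class ToyTable {
  private readonly indexes: { [name: string]: IndexState } = {};

  // Begin creating an index; rejects if another index is still CREATING.
  startCreate(indexName: string): void {
    const names = Object.keys(this.indexes);
    for (let i = 0; i < names.length; i++) {
      if (this.indexes[names[i]] === 'CREATING') {
        throw new Error(
          'Index ' + names[i] + ' is in CREATING state. ' +
          'Only 1 index can be created or deleted simultaneously per table.',
        );
      }
    }
    this.indexes[indexName] = 'CREATING';
  }

  // Mark a previously started index as finished.
  finishCreate(indexName: string): void {
    if (this.indexes[indexName] !== 'CREATING') {
      throw new Error('Index ' + indexName + ' is not being created.');
    }
    this.indexes[indexName] = 'ACTIVE';
  }

  stateOf(indexName: string): IndexState | undefined {
    return this.indexes[indexName];
  }
}

// Overlapping creation: the second request arrives before the first finishes.
// Returns the error message, or undefined if nothing failed.
function createOverlapping(table: ToyTable, names: string[]): string | undefined {
  try {
    names.forEach((n) => table.startCreate(n));
    return undefined;
  } catch (e) {
    return (e as Error).message;
  }
}

// Serialized creation: each index finishes before the next one starts.
function createSerialized(table: ToyTable, names: string[]): void {
  names.forEach((n) => {
    table.startCreate(n);
    table.finishCreate(n);
  });
}
```

In this model, `createOverlapping(new ToyTable(), ['index1', 'index2'])` returns the error text, while `createSerialized` leaves both indexes `ACTIVE`. If nothing forces the second index to wait for the first, the two requests can overlap, which might explain why the deployment only fails some of the time.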
@khushail Thank you for your investigation.
One weird thing is that even if I use addPartitionIndex to add the index later on, it fails in the same way.
It's hard to tell why it succeeds sometimes but not always.
const bucket = new s3.Bucket(stack, 'DataBucket');
const database = new glue.Database(stack, 'MyDatabase', {
databaseName: 'database',
});
const csvTable = new glue.Table(stack, 'CSVTable', {
database,
bucket,
tableName: 'csv_table',
columns: [
{ name: 'col1', type: glue.Schema.STRING },
{ name: 'col2', type: glue.Schema.STRING },
{ name: 'col3', type: glue.Schema.STRING },
],
partitionKeys: [
{ name: 'year', type: glue.Schema.SMALL_INT },
{ name: 'month', type: glue.Schema.BIG_INT },
],
partitionIndexes: [{ indexName: 'index1', keyNames: ['month'] }],
dataFormat: glue.DataFormat.CSV,
});
csvTable.addPartitionIndex({ indexName: 'index2', keyNames: ['month', 'year'] })
@clueleaf, could you please share the error that you see when it fails? As I am not able to repro this error, it would be helpful for reference while creating a PR.
Sure.
**:**:** ** | CREATE_FAILED | Custom::AWS | CSVTablepartitionindexindex16247ABF6
Received response status [FAILED] from custom resource. Message returned: Index index2 is in CREATING state. Only 1 index can be created or deleted simultaneously per table. (RequestId: 9a709d0e-4e9d-49e3-8202-fd781b73266b)
❌ MyStack (MyStack) failed: Error: The stack named MyStack failed creation, it may need to be manually deleted from the AWS console: ROLLBACK_COMPLETE: Received response status [FAILED] from custom resource. Message returned: Index index2 is in CREATING state. Only 1 index can be created or deleted simultaneously per table. (RequestId: 9a709d0e-4e9d-49e3-8202-fd781b73266b)
at FullCloudFormationDeployment.monitorDeployment (/Users/***/node_modules/aws-cdk/lib/index.js:380:10236)
at processTicksAndRejections (node:internal/process/task_queues:96:5)
at async deployStack2 (/Users/***/node_modules/aws-cdk/lib/index.js:383:145458)
at async /Users/***/node_modules/aws-cdk/lib/index.js:383:128776
at async run (/Users/***/node_modules/aws-cdk/lib/index.js:383:126782)
❌ Deployment failed: Error: Stack Deployments Failed: Error: The stack named MyStack failed creation, it may need to be manually deleted from the AWS console: ROLLBACK_COMPLETE: Received response status [FAILED] from custom resource. Message returned: Index index2 is in CREATING state. Only 1 index can be created or deleted simultaneously per table. (RequestId: 9a709d0e-4e9d-49e3-8202-fd781b73266b)
at deployStacks (/Users/***/node_modules/aws-cdk/lib/index.js:383:129083)
at processTicksAndRejections (node:internal/process/task_queues:96:5)
at async CdkToolkit.deploy (/Users/***/node_modules/aws-cdk/lib/index.js:383:147507)
at async exec4 (/Users/***/node_modules/aws-cdk/lib/index.js:438:51799)
Stack Deployments Failed: Error: The stack named MyStack failed creation, it may need to be manually deleted from the AWS console: ROLLBACK_COMPLETE: Received response status [FAILED] from custom resource. Message returned: Index index2 is in CREATING state. Only 1 index can be created or deleted simultaneously per table. (RequestId: 9a709d0e-4e9d-49e3-8202-fd781b73266b)
Thanks, @clueleaf.
I have the same issue; it worked previously.
IMO, the best thing is to avoid returning nothing in the addPartitionIndex function and instead return the object, so then we could chain dependencies between the two indexes.
Something like this (currently doesn't work because it returns void):
const table = new S3Table(this, 'Something', {
.
.
.
});
const pI1 = table.addPartitionIndex({
indexName: 'year_month_day',
keyNames: ['year', 'month', 'day']
});
const pI2 = table.addPartitionIndex({
indexName: 'country_site',
keyNames: ['country', 'site']
});
pI1.addDependency(pI2); // Doesn't work because pI1 and pI2 are void
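
To make the chaining idea concrete, here is a minimal, self-contained sketch. It assumes `addPartitionIndex` (or some other lookup) would hand back objects exposing `node.addDependency`, which is how CDK constructs declare dependencies. `DependableLike`, `chainSequentially`, and `FakeIndexResource` are hypothetical names used only for illustration; none of them is part of glue-alpha.

```typescript
// Minimal structural type so the sketch compiles on its own. A CDK construct
// should have the same shape through `construct.node.addDependency(...)`.
interface DependableLike {
  node: { addDependency(...deps: DependableLike[]): void };
}

// Make each element depend on the one before it, so an engine that honors
// dependencies creates the resources strictly one after another.
function chainSequentially<T extends DependableLike>(resources: T[]): T[] {
  for (let i = 1; i < resources.length; i++) {
    resources[i].node.addDependency(resources[i - 1]);
  }
  return resources;
}

// Hypothetical stand-in for a partition-index resource (illustration only).
// It just records the names of the resources it was told to depend on.
class FakeIndexResource implements DependableLike {
  readonly dependsOn: string[] = [];
  readonly node = {
    addDependency: (...deps: DependableLike[]): void => {
      deps.forEach((d) => this.dependsOn.push((d as FakeIndexResource).name));
    },
  };
  constructor(readonly name: string) {}
}

// Usage: the second index waits for the first, the third for the second.
const chained = chainSequentially([
  new FakeIndexResource('year_month_day'),
  new FakeIndexResource('country_site'),
  new FakeIndexResource('third_index'),
]);
```

With a real table, the same helper would need the underlying index resources as input, which is exactly what a non-void return value from `addPartitionIndex` would provide.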