S3: Table Buckets support

Open Smotrov opened this issue 11 months ago • 10 comments

Describe the feature

Enable CDK to facilitate the creation of newly announced table buckets.

Use Case

Table buckets are a recently introduced Amazon S3 feature for storing data in a columnar format, with maintenance handled automatically by AWS. Being able to create them from CDK would make the feature much easier to adopt.

Proposed Solution

No response

Other Information

No response

Acknowledgements

  • [ ] I may be able to implement this feature request
  • [ ] This feature might incur a breaking change

CDK version used

Latest

Environment details (OS name and version, etc.)

MacOS

Smotrov avatar Jan 22 '25 10:01 Smotrov

CFN doc: https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/AWS_S3Tables.html

We also have L1 modules now: https://github.com/aws/aws-cdk/tree/64694124e37113eaeed50635e5d1fb8db9badc89/packages/aws-cdk-lib/aws-s3tables

I assume this FR is for L2 support, so I am making this a p2 FR. Please help us prioritize by adding a 👍 to the issue (not this comment).

pahud avatar Jan 22 '25 16:01 pahud

Hi @pahud! L2 support would be really amazing for those of us who have been waiting for this feature since re:Invent 2024!

Edit: After digging into the CloudFormation link, I found support only for TableBucket creation. I was under the impression that creating tables was also supported, which isn't the case.

kdiri avatar Jan 27 '25 10:01 kdiri

Meanwhile, it would be just great to be able to create a table namespace using CDK as well.

Smotrov avatar Mar 10 '25 12:03 Smotrov

The lack of ability to manage namespaces (and tables) using CDK is a major blocker for us adopting S3 table buckets. Is this being actively worked on?

dominicrathbone avatar Apr 04 '25 15:04 dominicrathbone

PR #33599 adds the aws-s3tables-alpha module with L2 construct support for TableBucket and TableBucketPolicy.

> The lack of ability to manage namespaces (and tables) using CDK is a major blocker for us adopting S3 table buckets. Is this being actively worked on?

Yes, CDK support for Tables & Namespaces is in the works.

xuxey avatar May 20 '25 19:05 xuxey

Any update on tables & namespaces? This would be amazing. Even CFN constructs would unblock us.

anthonysgro avatar Jun 15 '25 03:06 anthonysgro

A temporary workaround is to use custom resources against the S3 Tables SDK or the Iceberg REST catalog API. We use the S3 Tables SDK for namespace management and the Iceberg REST catalog API for managing tables, because it lets you set more complex schemas, identity fields, and partition specs, whereas the S3 Tables SDK/API is fairly limited. The caveat is that mapping an update event to the catalog API is fairly complex.
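A minimal sketch of the namespace half of such a custom resource, in Python. It is an illustration under stated assumptions, not a tested implementation: `client` is assumed to be a boto3 `s3tables` client, the property names `TableBucketArn` and `Namespace` and the `arn|namespace` physical ID format are hypothetical, and the return shape assumes the handler runs behind the CDK custom resource Provider framework.

```python
from typing import Any, Dict


def handle_namespace_event(event: Dict[str, Any], client: Any) -> Dict[str, Any]:
    """Map a CloudFormation custom resource event onto S3 Tables namespace calls.

    `client` is assumed to be a boto3 "s3tables" client (or a test double).
    """
    props = event["ResourceProperties"]
    bucket_arn = props["TableBucketArn"]  # hypothetical property name
    namespace = props["Namespace"]        # hypothetical property name
    # Hypothetical physical ID format: "<table bucket ARN>|<namespace>".
    new_id = f"{bucket_arn}|{namespace}"
    request_type = event["RequestType"]

    if request_type == "Create":
        client.create_namespace(tableBucketARN=bucket_arn, namespace=[namespace])
        return {"PhysicalResourceId": new_id}

    if request_type == "Update":
        # This sketch treats a changed bucket or name as a replacement:
        # create the new namespace and return a new physical ID, after which
        # CloudFormation sends a Delete event for the old physical ID.
        if event["PhysicalResourceId"] != new_id:
            client.create_namespace(tableBucketARN=bucket_arn, namespace=[namespace])
        return {"PhysicalResourceId": new_id}

    if request_type == "Delete":
        # Delete whatever the physical ID points at, not the current properties.
        old_arn, old_namespace = event["PhysicalResourceId"].rsplit("|", 1)
        client.delete_namespace(tableBucketARN=old_arn, namespace=old_namespace)
        return {"PhysicalResourceId": event["PhysicalResourceId"]}

    raise ValueError(f"Unexpected RequestType: {request_type}")
```

Passing the client in as an argument keeps the event mapping testable without AWS credentials. Table updates are harder than this, since a schema or partition change has to be translated into catalog operations rather than a simple replace.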

dominicrathbone avatar Jun 15 '25 08:06 dominicrathbone

This is interesting. Yeah, I agree that the S3 Tables SDK is pretty limited. For example, I can't find any input property that lets me specify partitions: https://docs.aws.amazon.com/AWSJavaScriptSDK/v3/latest/client/s3tables/command/CreateTableCommand. That alone makes the S3 Tables SDK a deal breaker for my use case. Looks like I'll try the Iceberg REST API for now.
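For comparison, here is a rough sketch of what a create-table request body with a partition spec looks like in the Iceberg REST catalog format. The field names follow the Iceberg REST OpenAPI spec as I understand it; the table and column names are made up for illustration, and whether the S3 Tables endpoint accepts every field (the `write-order` in particular) is an assumption that nobody in this thread has confirmed.

```python
import json
from typing import Any, Dict


def build_create_table_request(name: str) -> Dict[str, Any]:
    """Build an Iceberg REST CreateTableRequest body with a partition spec.

    Column names (event_id, event_time) are hypothetical examples.
    """
    return {
        "name": name,
        "schema": {
            "type": "struct",
            "schema-id": 0,
            "fields": [
                {"id": 1, "name": "event_id", "required": True, "type": "string"},
                {"id": 2, "name": "event_time", "required": True, "type": "timestamp"},
            ],
            # Identifier fields refer to field IDs and must be required fields.
            "identifier-field-ids": [1],
        },
        "partition-spec": {
            "spec-id": 0,
            "fields": [
                # Partition by day, derived from the event_time column (id 2).
                {"source-id": 2, "field-id": 1000, "name": "event_day", "transform": "day"},
            ],
        },
        # Assumption: sort order support on the S3 Tables endpoint is unverified.
        "write-order": {
            "order-id": 1,
            "fields": [
                {
                    "source-id": 2,
                    "transform": "identity",
                    "direction": "asc",
                    "null-order": "nulls-last",
                },
            ],
        },
        "properties": {},
    }


if __name__ == "__main__":
    print(json.dumps(build_create_table_request("events"), indent=2))
```

The body would be sent as a POST to the catalog's `/v1/{prefix}/namespaces/{namespace}/tables` route with SigV4 signing. None of these partition or sort fields appear in the `CreateTableCommand` input, which is the limitation described above.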

anthonysgro avatar Jun 16 '25 16:06 anthonysgro

@dominicrathbone how do you use the REST API to specify that the Iceberg table belongs to the S3 table bucket / namespace and not just the Glue catalog? I am assuming that if you specify the table bucket / namespace ARN as the "location" in the CreateTableRequest, it will use that?

There's not really any AWS documentation on this outside of this link, and it doesn't include these details.

Edit: Never mind, I see that https://docs.aws.amazon.com/AmazonS3/latest/userguide/s3-tables-integrating-open-source.html exists.

anthonysgro avatar Jun 16 '25 18:06 anthonysgro

> @dominicrathbone how do you use the REST API to specify the iceberg table belongs to the S3 table bucket / namespace and not just the glue catalog? I am assuming if you specify the "location" of the table bucket / namespace arn in the CreateTableRequest it will use that?
>
> There's not really any AWS documentation on this outside of this link but it doesn't include these details
>
> Edit: Nevermind, I see https://docs.aws.amazon.com/AmazonS3/latest/userguide/s3-tables-integrating-open-source.html this exists

You are looking at the Glue Iceberg REST catalog API. I used the S3 Tables catalog API: https://docs.aws.amazon.com/AmazonS3/latest/userguide/s3-tables-integrating-open-source.html
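To make the difference concrete, here is a small sketch of the REST catalog properties for the two endpoints, as I understand them from the linked guide. Treat the endpoint URLs, warehouse formats, and signing names as assumptions to check against the current docs; the helper function names are hypothetical.

```python
from typing import Dict


def s3tables_rest_catalog_properties(region: str, table_bucket_arn: str) -> Dict[str, str]:
    """Properties for the S3 Tables Iceberg REST endpoint (assumed values)."""
    return {
        "type": "rest",
        "uri": f"https://s3tables.{region}.amazonaws.com/iceberg",
        # The warehouse is the table bucket ARN itself.
        "warehouse": table_bucket_arn,
        "rest.sigv4-enabled": "true",
        "rest.signing-name": "s3tables",
        "rest.signing-region": region,
    }


def glue_rest_catalog_properties(region: str, account_id: str, table_bucket_name: str) -> Dict[str, str]:
    """Properties for the Glue Iceberg REST endpoint (assumed values)."""
    return {
        "type": "rest",
        "uri": f"https://glue.{region}.amazonaws.com/iceberg",
        # Glue addresses the table bucket through the s3tablescatalog catalog.
        "warehouse": f"{account_id}:s3tablescatalog/{table_bucket_name}",
        "rest.sigv4-enabled": "true",
        "rest.signing-name": "glue",
        "rest.signing-region": region,
    }
```

With pyiceberg installed, either dictionary can be passed as keyword arguments to `pyiceberg.catalog.load_catalog`. The table bucket is selected through the `warehouse` property rather than through a "location" field in the create-table request.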

dominicrathbone avatar Jun 16 '25 18:06 dominicrathbone

Hello, when will this feature be merged into the cdk?

mark-hayward avatar Aug 27 '25 13:08 mark-hayward

CDK support for Table Buckets, Namespaces, and Tables is now live: https://aws.amazon.com/about-aws/whats-new/2025/08/amazon-s3-tables-cloudformation-cdk/

xuxey avatar Aug 28 '25 17:08 xuxey

How can partitioning and sort order be configured when creating a Table? I couldn't find any example of that, and I'm not sure whether it's supported right now. Can we provide partitioning and sort order configuration inside iceberg_metadata? https://repost.aws/questions/QUJUwgQHZQTc-DzZYY-q4i3Q/cdk-s3-table-bucket-apache-iceberg-table-creation-with-partitioning-and-other-advanced-configuration

Update: Due to these limitations, I had to go with a custom pyiceberg CDK resource to provision my S3 table correctly.

armujahid avatar Sep 23 '25 14:09 armujahid

This issue has received a significant amount of attention so we are automatically upgrading its priority. A member of the community will see the re-prioritization and provide an update on the issue.

github-actions[bot] avatar Sep 28 '25 00:09 github-actions[bot]

@xuxey Is partition key configuration for S3 tables supported through CDK now? If not, can you let me know whether it's on the roadmap?

kowshikk2 avatar Nov 05 '25 16:11 kowshikk2