cloudformation-coverage-roadmap

Glue Iceberg Table: Table is broken after any update

Open padaszewski opened this issue 1 year ago • 22 comments

Name of the resource

AWS::Glue::Table

Resource Name

No response

Issue Description

Hi there! When I update anything on my Iceberg table, the update breaks the table and its table format disappears. In effect, it is no longer an Iceberg table and no operations on it are possible.

Expected Behavior

When I update the table, the update should not remove the Iceberg table properties, and I should be able to keep working with the table as an Iceberg table.

Observed Behavior

Before the update (after initial deployment): [screenshot]

After any update: [screenshot]

Notice that the table format property is missing. The table management property is gone as well.

Athena before the update: [screenshot: Screenshot 2024-02-2 at 14 23 06]

Athena after the update: [two screenshots]

Test Cases

Simple CDK Stack to reproduce this behavior (uncomment one column to update, or do any other update):

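import * as cdk from 'aws-cdk-lib';
import { Aws, RemovalPolicy } from 'aws-cdk-lib';
import { CfnDatabase, CfnTable } from 'aws-cdk-lib/aws-glue';
import { Bucket } from 'aws-cdk-lib/aws-s3';
import { Construct } from 'constructs';

export class CdkTestingStack extends cdk.Stack {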
  constructor(scope: Construct, id: string, props?: cdk.StackProps) {
    super(scope, id, props);

    const myTestDatabase = new CfnDatabase(this, 'myTestDatabase', {
      catalogId: Aws.ACCOUNT_ID,
      databaseInput: {
        name: 'mytestdatabase'
      }
    })

    const myLocationBucket = new Bucket(this, 'myLocationBucket', {
      removalPolicy: RemovalPolicy.DESTROY,
      autoDeleteObjects: true
    })

    const myTestTable = new CfnTable(this, 'myTestTable', {
      databaseName: 'mytestdatabase',
      catalogId: Aws.ACCOUNT_ID,
      tableInput: {
        name: 'mytesttable',
        storageDescriptor: {
          columns: [
            {
              name: 'name',
              type: 'string'
            },
            // Uncomment this column and redeploy to trigger an update:
            // {
            //   name: 'ts',
            //   type: 'timestamp'
            // }
          ],
          location: `s3://${myLocationBucket.bucketName}/mytesttable/`,
        },
        tableType: 'EXTERNAL_TABLE',
      },
      openTableFormatInput: {
        icebergInput: {
          metadataOperation: 'CREATE',
          version: '2'
        }
      }
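    })

    // The table refers to the database only by its literal name, so declare the
    // dependency explicitly to make sure the database is created first.
    myTestTable.addDependency(myTestDatabase)
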
  }
}

Other Details

No response

padaszewski avatar Feb 02 '24 13:02 padaszewski

@sfgarcia @oleksiiburov @dmschauer Tagging you, as you were active on other Iceberg issues. Hope you don't mind. Maybe you have a workaround other than creating the table with an Athena query.

padaszewski avatar Feb 02 '24 14:02 padaszewski

@padaszewski My workaround would indeed be to use a custom resource with the Athena API (issuing queries via awswrangler in a Lambda function). A custom implementation for creating and deleting the table is straightforward, and I have already implemented such a custom resource. Covering schema changes to the existing table via this custom resource could also be implemented, but it's more complex: it would work by comparing the existing columns and types to the newly supplied columns and types and issuing the corresponding ALTER TABLE statements. But I see you're looking for a solution that avoids Athena, so I think that won't help here.
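
The column comparison described above could look roughly like the following. This is a minimal sketch, not dmschauer's actual implementation: the `Column` type and the `diffColumns` helper are hypothetical names, and the generated statements assume Athena's `ALTER TABLE ADD COLUMNS`, `CHANGE COLUMN` and `DROP COLUMN` DDL for Iceberg tables. Executing the statements (for example from a Lambda-backed custom resource) is left out.

```typescript
// Hypothetical type and helper names; not part of any AWS SDK.
interface Column {
  name: string;
  type: string;
}

// Compare the currently deployed columns with the desired ones and return
// the ALTER TABLE statements that would bring the table in line.
function diffColumns(table: string, existing: Column[], desired: Column[]): string[] {
  const existingByName = new Map<string, Column>();
  for (const col of existing) {
    existingByName.set(col.name, col);
  }
  const desiredNames = new Set(desired.map((col) => col.name));
  const statements: string[] = [];

  // New columns are added; columns whose type differs are changed in place.
  for (const col of desired) {
    const current = existingByName.get(col.name);
    if (current === undefined) {
      statements.push(`ALTER TABLE ${table} ADD COLUMNS (${col.name} ${col.type})`);
    } else if (current.type !== col.type) {
      statements.push(`ALTER TABLE ${table} CHANGE COLUMN ${col.name} ${col.name} ${col.type}`);
    }
  }

  // Columns that are no longer declared are dropped.
  for (const col of existing) {
    if (!desiredNames.has(col.name)) {
      statements.push(`ALTER TABLE ${table} DROP COLUMN ${col.name}`);
    }
  }
  return statements;
}
```

For the reproduction stack in this issue, uncommenting the `ts` column would correspond to calling `diffColumns('mytestdatabase.mytesttable', [{ name: 'name', type: 'string' }], [{ name: 'name', type: 'string' }, { name: 'ts', type: 'timestamp' }])`, which yields a single `ADD COLUMNS` statement. Iceberg only permits certain type changes, so a real implementation would need to validate type changes before issuing `CHANGE COLUMN`.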

dmschauer avatar Feb 02 '24 15:02 dmschauer

Thanks @dmschauer for the reply. If AWS doesn't ship this along with the Iceberg table partitioning feature request, then there is currently no other way to achieve this than using Athena with a custom resource (CR) on deployment. Iceberg tables are critical for our use case, and it's sad that such a great thing is not well supported via IaC.

padaszewski avatar Feb 05 '24 08:02 padaszewski

Hi @padaszewski. I would also like AWS to fully support managing Iceberg tables (create/update) through IaC. On my team we don't manage our Iceberg tables as IaC (we create and update them with Athena queries) because of this limitation.

sfgarcia avatar Feb 19 '24 15:02 sfgarcia

Hi @sfgarcia, thanks for the reply. We decided to do the same, but with custom resources as IaC.
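
As an illustration of the custom resource approach mentioned in this thread, the sketch below maps a CloudFormation custom resource request type (`Create`, `Update`, `Delete`) to the Athena statement a handler might run. It is an assumption-laden example, not the implementation used by anyone here: `IcebergTableProps` and `statementFor` are hypothetical names, and the `CREATE TABLE` form assumes Athena's Iceberg DDL with `TBLPROPERTIES ('table_type'='ICEBERG')`. Running the statement through Athena and responding to CloudFormation are omitted.

```typescript
// Hypothetical names for illustration; not part of the CDK or any AWS SDK.
type RequestType = 'Create' | 'Update' | 'Delete';

interface IcebergTableProps {
  database: string;
  table: string;
  columns: { name: string; type: string }[];
  location: string; // e.g. an s3:// URI for the table data
}

// Return the Athena DDL statement for a custom resource lifecycle event,
// or undefined when the event needs separate handling (schema updates).
function statementFor(requestType: RequestType, props: IcebergTableProps): string | undefined {
  const qualifiedName = `${props.database}.${props.table}`;
  switch (requestType) {
    case 'Create': {
      const columnList = props.columns.map((col) => `${col.name} ${col.type}`).join(', ');
      return (
        `CREATE TABLE ${qualifiedName} (${columnList}) ` +
        `LOCATION '${props.location}' ` +
        `TBLPROPERTIES ('table_type'='ICEBERG')`
      );
    }
    case 'Delete':
      return `DROP TABLE ${qualifiedName}`;
    case 'Update':
      // Updates require comparing old and new columns and issuing
      // ALTER TABLE statements, which is out of scope for this sketch.
      return undefined;
  }
}
```

Keeping statement generation in a pure function like this makes it possible to unit test the DDL without calling Athena.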

padaszewski avatar Feb 19 '24 17:02 padaszewski

Just a +1 here, this is still an issue. In addition, when creating a resource with a reference to a schema version, the columns do not appear to be loaded into the metadata file.

svdgraaf avatar Apr 15 '24 12:04 svdgraaf

hey! +1 👀 👀 👀

jhosmanfriasbravo avatar Apr 16 '24 16:04 jhosmanfriasbravo

Same here, would love to be able to create/update partitioned Iceberg tables using the CDK.

blaxx avatar Apr 23 '24 13:04 blaxx

I would love to be able to create/update partitioned Iceberg tables using CloudFormation/CDK too.

cyberst avatar Apr 26 '24 18:04 cyberst

+1

mehdimld avatar May 03 '24 10:05 mehdimld

+1 big concern for Cepsa's team...

ijtarano avatar May 28 '24 08:05 ijtarano

+1

emiliogarcia-cps avatar May 28 '24 08:05 emiliogarcia-cps

+1

jmartinez-cps avatar May 28 '24 08:05 jmartinez-cps

+1

armaseg avatar May 28 '24 13:05 armaseg

+1

FAGUILERAM2022 avatar Jun 03 '24 08:06 FAGUILERAM2022

+1

aitormagan avatar Jun 03 '24 08:06 aitormagan

+1

JesusAndres2 avatar Jun 03 '24 08:06 JesusAndres2

+1

etjess avatar Jun 05 '24 03:06 etjess

+1

romancepsa avatar Jul 03 '24 12:07 romancepsa

+1

Rizxcviii avatar Jul 03 '24 14:07 Rizxcviii

+1

raycomh avatar Jul 08 '24 10:07 raycomh

+1

Smotrov avatar Aug 07 '24 12:08 Smotrov