[SUPPORT] AWSGlueCatalogSyncClient partition AlreadyExistsException when syncing large table from scratch

Open parisni opened this issue 3 years ago • 3 comments

Hudi 0.11.1, Spark 3.2.1

I have several Hudi tables with more than 35k partitions. When running Hive sync for the first time (that is, adding 35k partitions to Glue from scratch), I randomly get the error below, which says the partition already exists. This is odd because the table did not exist yet and the partition is not duplicated in the list.

As a workaround I caught the error in AWSGlueCatalogSyncClient, but before proposing a PR, I'd like to know whether this is expected.

482394 [Driver] INFO  org.apache.hudi.hive.AWSGlueCatalogSyncClient  - Created table <db>.<table_name> : {}
482394 [Driver] INFO  org.apache.hudi.hive.HiveSyncTool  - Schema sync complete. Syncing partitions for <table_name>
482394 [Driver] INFO  org.apache.hudi.hive.HiveSyncTool  - Last commit time synced was found to be null
482394 [Driver] INFO  org.apache.hudi.sync.common.AbstractSyncHoodieClient  - Last commit time synced is not known, listing all partitions in ...
1041177 [Driver] INFO  org.apache.hudi.hive.HiveSyncTool  - Storage partitions scan complete. Found 35337
1042524 [Driver] INFO  org.apache.hudi.hive.AWSGlueCatalogSyncClient  - Adding 35337 partition(s) in table <db>.<table_name>
2547338 [Driver] INFO  org.apache.spark.deploy.yarn.ApplicationMaster  - Final app status: FAILED, exitCode: 15, (reason: User class threw exception: java.lang.RuntimeException: org.apache.hudi.exception.HoodieException: Got runtime exception when hive syncing <table_name>
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:737)
Caused by: org.apache.hudi.hive.HoodieHiveSyncException: Failed to sync partitions for table <table_name>
        at org.apache.hudi.hive.HiveSyncTool.syncPartitions(HiveSyncTool.java:497)
        at org.apache.hudi.hive.HiveSyncTool.syncHoodieTable(HiveSyncTool.java:264)
        at org.apache.hudi.hive.HiveSyncTool.doSync(HiveSyncTool.java:172)
        at org.apache.hudi.hive.HiveSyncTool.syncHoodieTable(HiveSyncTool.java:159)
        ... 15 more
Caused by: org.apache.hudi.aws.sync.HoodieGlueSyncException: Fail to add partitions to <db>.<table_name>
        at org.apache.hudi.aws.sync.AWSGlueCatalogSyncClient.addPartitionsToTable(AWSGlueCatalogSyncClient.java:145)
        at org.apache.hudi.hive.HiveSyncTool.syncPartitions(HiveSyncTool.java:479)
        ... 18 more
Caused by: org.apache.hudi.aws.sync.HoodieGlueSyncException: Fail to add partitions to<db>.<table_name>
with error(s): [{PartitionValues: [7, 2021-07-14, 13],ErrorDetail: {ErrorCode: AlreadyExistsException,ErrorMessage: Partition already exists.}}]
        at org.apache.hudi.aws.sync.AWSGlueCatalogSyncClient.addPartitionsToTable(AWSGlueCatalogSyncClient.java:140)
        ... 19 more
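
A minimal sketch of the kind of workaround described above, assuming the fix amounts to ignoring `AlreadyExistsException` entries in the per-partition error list and failing only on the rest. The `PartitionError` and `GlueErrorFilter` classes here are hypothetical stand-ins written for illustration; they are not the AWS SDK types and not the actual patch to `AWSGlueCatalogSyncClient`.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for one entry of the error list seen in the log
// (partition values plus an error code). Not the AWS SDK type.
final class PartitionError {
    final List<String> partitionValues;
    final String errorCode;

    PartitionError(List<String> partitionValues, String errorCode) {
        this.partitionValues = partitionValues;
        this.errorCode = errorCode;
    }
}

// Hypothetical helper: decides which batch errors should still fail the sync.
final class GlueErrorFilter {
    static final String ALREADY_EXISTS = "AlreadyExistsException";

    // Drops "already exists" errors, since the partition is present in the
    // catalog either way, and returns everything else unchanged.
    static List<PartitionError> fatalErrors(List<PartitionError> errors) {
        List<PartitionError> fatal = new ArrayList<>();
        for (PartitionError e : errors) {
            if (!ALREADY_EXISTS.equals(e.errorCode)) {
                fatal.add(e);
            }
        }
        return fatal;
    }

    public static void main(String[] args) {
        List<PartitionError> errors = Arrays.asList(
            new PartitionError(Arrays.asList("7", "2021-07-14", "13"), ALREADY_EXISTS),
            new PartitionError(Arrays.asList("7", "2021-07-15", "00"), "InternalServiceException"));

        List<PartitionError> fatal = fatalErrors(errors);
        if (!fatal.isEmpty()) {
            // A real sync client would throw here instead of printing.
            System.out.println("Would fail sync on " + fatal.size() + " error(s)");
        }
    }
}
```

Filtering on the error code rather than swallowing every error keeps genuine failures visible while making the partition add idempotent. Whether that is the right behavior for Hudi is exactly the question raised in this issue.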

parisni avatar Jun 24 '22 10:06 parisni

@parisni the exception of Partition already exists shouldn't happen. Could you provide the HiveSyncTool command with the arguments you use for reproducing the issue?

yihua avatar Jun 28 '22 17:06 yihua

@parisni Any update on this issue? If it's still happening, can you please start from scratch syncing to a different database, and provide the sync tool command that you ran?

codope avatar Aug 01 '22 12:08 codope

I am running the sync programmatically. The Glue error does happen from time to time. The Glue backend does not look that stable, so I guess the sync process should handle those cases better; otherwise it fails and leaves the Glue metastore corrupted.

parisni avatar Aug 11 '22 17:08 parisni

thanks!

nsivabalan avatar Aug 28 '22 00:08 nsivabalan