[SUPPORT] AWSGlueCatalogSyncClient partition AlreadyExistsException when syncing large table from scratch
Hudi 0.11.1, Spark 3.2.1
I have several Hudi tables with more than 35k partitions. When running hive sync for the first time (meaning adding 35k partitions from scratch into Glue), I randomly get the error below, which says the partition already exists. This is weird because the table didn't exist yet and the partition is not duplicated in the list.
As a workaround I caught the error in the AWSGlueCatalogSyncClient, but before proposing a PR, I'd like to know if this is expected.
482394 [Driver] INFO org.apache.hudi.hive.AWSGlueCatalogSyncClient - Created table <db>.<table_name> : {}
482394 [Driver] INFO org.apache.hudi.hive.HiveSyncTool - Schema sync complete. Syncing partitions for <table_name>
482394 [Driver] INFO org.apache.hudi.hive.HiveSyncTool - Last commit time synced was found to be null
482394 [Driver] INFO org.apache.hudi.sync.common.AbstractSyncHoodieClient - Last commit time synced is not known, listing all partitions in ...
1041177 [Driver] INFO org.apache.hudi.hive.HiveSyncTool - Storage partitions scan complete. Found 35337
1042524 [Driver] INFO org.apache.hudi.hive.AWSGlueCatalogSyncClient - Adding 35337 partition(s) in table <db>.<table_name>
2547338 [Driver] INFO org.apache.spark.deploy.yarn.ApplicationMaster - Final app status: FAILED, exitCode: 15, (reason: User class threw exception: java.lang.RuntimeException: org.apache.hudi.exception.HoodieException: Got runtime exception when hive syncing <table_name>
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:737)
Caused by: org.apache.hudi.hive.HoodieHiveSyncException: Failed to sync partitions for table <table_name>
at org.apache.hudi.hive.HiveSyncTool.syncPartitions(HiveSyncTool.java:497)
at org.apache.hudi.hive.HiveSyncTool.syncHoodieTable(HiveSyncTool.java:264)
at org.apache.hudi.hive.HiveSyncTool.doSync(HiveSyncTool.java:172)
at org.apache.hudi.hive.HiveSyncTool.syncHoodieTable(HiveSyncTool.java:159)
... 15 more
Caused by: org.apache.hudi.aws.sync.HoodieGlueSyncException: Fail to add partitions to <db>.<table_name>
at org.apache.hudi.aws.sync.AWSGlueCatalogSyncClient.addPartitionsToTable(AWSGlueCatalogSyncClient.java:145)
at org.apache.hudi.hive.HiveSyncTool.syncPartitions(HiveSyncTool.java:479)
... 18 more
Caused by: org.apache.hudi.aws.sync.HoodieGlueSyncException: Fail to add partitions to<db>.<table_name>
with error(s): [{PartitionValues: [7, 2021-07-14, 13],ErrorDetail: {ErrorCode: AlreadyExistsException,ErrorMessage: Partition already exists.}}]
at org.apache.hudi.aws.sync.AWSGlueCatalogSyncClient.addPartitionsToTable(AWSGlueCatalogSyncClient.java:140)
... 19 more
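For illustration, the workaround mentioned above (tolerating the error inside the sync client) could be sketched roughly as follows. This is not the actual Hudi or AWS SDK code: `GluePartitionErrorFilter`, `PartitionError`, `fatalErrors` and `failIfFatal` are hypothetical names, and `PartitionError` is only a stand-in for one entry of the error list shown in the log. The idea is to treat `AlreadyExistsException` entries as harmless and fail only on the remaining errors.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Hudi or the AWS SDK.
class GluePartitionErrorFilter {

    // Error code taken from the log above.
    static final String ALREADY_EXISTS = "AlreadyExistsException";

    // Stand-in for one entry of the batch error list
    // ({PartitionValues: [...], ErrorDetail: {ErrorCode: ...}}).
    static final class PartitionError {
        final List<String> partitionValues;
        final String errorCode;

        PartitionError(List<String> partitionValues, String errorCode) {
            this.partitionValues = partitionValues;
            this.errorCode = errorCode;
        }

        @Override
        public String toString() {
            return "{PartitionValues: " + partitionValues + ", ErrorCode: " + errorCode + "}";
        }
    }

    // Keeps only the errors that should still fail the sync:
    // a partition that already exists is already in the desired state.
    static List<PartitionError> fatalErrors(List<PartitionError> errors) {
        List<PartitionError> fatal = new ArrayList<>();
        for (PartitionError error : errors) {
            if (!ALREADY_EXISTS.equals(error.errorCode)) {
                fatal.add(error);
            }
        }
        return fatal;
    }

    // Throws only if something other than "already exists" went wrong.
    static void failIfFatal(String tableName, List<PartitionError> errors) {
        List<PartitionError> fatal = fatalErrors(errors);
        if (!fatal.isEmpty()) {
            throw new RuntimeException(
                "Fail to add partitions to " + tableName + " with error(s): " + fatal);
        }
    }
}
```

With this sketch, the single error reported in the log (`[7, 2021-07-14, 13]` with `AlreadyExistsException`) would be filtered out and the sync would continue, while any other error code would still abort it. Whether silently ignoring the error is the right behavior is exactly the open question of this issue.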
@parisni The "Partition already exists" exception shouldn't happen. Could you provide the HiveSyncTool command with the arguments you use for reproducing the issue?
@parisni Any update on this issue? If it's still happening, can you please start from scratch syncing to a different database, and provide the sync tool command that you ran?
I am using the sync command programmatically. Indeed, the Glue error happens from time to time. The Glue backend does not look that stable, so I guess the sync process should handle those cases better; otherwise it fails and leaves the Glue metastore corrupted.
thanks!