[Bug]: Mixed Hive Table can't Sync Hive data properly

Open nicochen opened this issue 2 years ago • 2 comments

What happened?

[image]

As you can see, ArcticTableFlag is set to true once any data has been written to a Hive partition through Amoro. But if I delete the data from this partition, write it again with Hive, and try to synchronize it to the Mixed Hive table, the files cannot be added to the Mixed Hive table under this "if" logic. There is no data in this partition of the Mixed Hive table, so filesMap.get(partitionData) == null. At the same time, ArcticTableFlag still exists, because the Hive partition itself was never deleted and data has been written to it. So I think there is a problem with this logic.
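A minimal sketch of the condition described above, to make the failing case concrete. This is not the actual Amoro code: the class `FlagCheckSketch`, the method `wouldSync`, the string partition keys, and the sample partition name are hypothetical. Only `filesMap` and the idea of an Arctic flag on the Hive partition come from the description.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of the check described above; not the real Amoro classes.
class FlagCheckSketch {

    // Returns true if the Hive partition's files would be added to the Mixed Hive table.
    // filesMap: files the Mixed Hive table currently tracks, keyed by partition (assumed shape).
    // hasArcticFlag: whether the Hive partition still carries the Arctic flag.
    static boolean wouldSync(Map<String, List<String>> filesMap,
                             String partition,
                             boolean hasArcticFlag) {
        boolean unknownToTable = filesMap.get(partition) == null;
        // Only partitions unknown to the table AND without the flag are treated as new.
        return unknownToTable && !hasArcticFlag;
    }

    public static void main(String[] args) {
        // The table's data for the partition was deleted, so the map has no entry for it.
        Map<String, List<String>> filesMap = new HashMap<>();

        // Partition never written through Amoro: picked up by the sync.
        System.out.println(wouldSync(filesMap, "dt=2023-08-23", false)); // true

        // Partition once written through Amoro, then rewritten with Hive: skipped.
        System.out.println(wouldSync(filesMap, "dt=2023-08-23", true)); // false
    }
}
```

The second call is the case in this report: the partition is missing from the table and holds fresh Hive data, but the leftover flag keeps it from being synced.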

Affects Versions

master

What engines are you seeing the problem on?

Core

How to reproduce

  1. Create a partitioned Mixed Hive table
  2. Write some data to it with insert overwrite
  3. Delete the data written in step 2
  4. Insert the same data again with Hive
  5. Use HiveDataSync to sync the data from step 4 to the Mixed Hive table

Relevant log output

No response

Anything else

No response

Code of Conduct

  • [X] I agree to follow this project's Code of Conduct

nicochen avatar Aug 23 '23 08:08 nicochen

@nicochen Thanks for reporting this issue.

AFAIK, the reason the Hive data sync checks whether a newly detected Hive partition has an Arctic flag is:

  • If dropping a partition on a Mixed-Hive table is committed to Iceberg successfully but the commit to Hive fails, AMS will detect this and delete the corresponding data in Hive.
  • The "if" logic here is meant to distinguish that scenario from a partition that was newly created through Hive.

Based on this, when data under a Hive partition is deleted, we may also need to remove the Arctic flag from that partition's metadata in HMS.
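A rough sketch of that idea, under stated assumptions. It does not call HMS or Amoro: the partition metadata is modelled as a plain map, and the class `FlagCleanupSketch`, its methods, and the key `arctic-flag-placeholder` are all hypothetical stand-ins (the real property name is not given in this thread).

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of the suggestion above; it does not talk to HMS or Amoro.
class FlagCleanupSketch {

    // Placeholder key: the real property name is not stated in this thread.
    static final String ARCTIC_FLAG_KEY = "arctic-flag-placeholder";

    // Stand-in for the parameter map of a Hive partition that Amoro has written to.
    static Map<String, String> partitionWrittenByAmoro() {
        Map<String, String> params = new HashMap<>();
        params.put(ARCTIC_FLAG_KEY, "true");
        return params;
    }

    // Deleting the partition's data also drops the flag, so a later Hive write looks new.
    static void onPartitionDataDeleted(Map<String, String> params) {
        params.remove(ARCTIC_FLAG_KEY);
    }

    // A partition missing from the table is synced only if it carries no flag.
    static boolean wouldSync(boolean trackedByTable, Map<String, String> params) {
        // parseBoolean returns false for a missing (null) value.
        boolean hasFlag = Boolean.parseBoolean(params.get(ARCTIC_FLAG_KEY));
        return !trackedByTable && !hasFlag;
    }

    public static void main(String[] args) {
        Map<String, String> params = partitionWrittenByAmoro();
        System.out.println(wouldSync(false, params)); // false: flag still present
        onPartitionDataDeleted(params);
        System.out.println(wouldSync(false, params)); // true: flag removed
    }
}
```

A partition whose drop failed on the Hive side never goes through the cleanup step in this model, so it keeps its flag and can still be told apart from a partition rewritten with Hive.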

zhoujinsong avatar Aug 28 '23 02:08 zhoujinsong

@nicochen Do you still have this problem with the new version?

czy006 avatar Apr 15 '24 10:04 czy006