[spark] Support Batch Insert/Delete/Update/MergeInto

Open baiyangtx opened this issue 2 years ago • 0 comments

Support Insert/Delete/Update/MergeInto SQL in Spark
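For orientation, the four statement kinds in scope could look roughly like the sketch below. This is only an illustration using standard Spark SQL statement shapes. The table names, columns, and the `run_all` helper are hypothetical and do not come from this issue, and whether each statement runs depends on the table format supporting it.

```python
# Hypothetical table names, used only for illustration.
TARGET = "catalog.db.orders"
SOURCE = "catalog.db.orders_updates"

# One example statement per batch command named in this issue.
# The columns (id, amount) are assumed, not taken from any real schema.
BATCH_STATEMENTS = {
    "insert": f"INSERT INTO {TARGET} SELECT id, amount FROM {SOURCE}",
    "delete": f"DELETE FROM {TARGET} WHERE amount < 0",
    "update": f"UPDATE {TARGET} SET amount = 0 WHERE amount IS NULL",
    "merge_into": (
        f"MERGE INTO {TARGET} AS t USING {SOURCE} AS s ON t.id = s.id "
        "WHEN MATCHED THEN UPDATE SET t.amount = s.amount "
        "WHEN NOT MATCHED THEN INSERT (id, amount) VALUES (s.id, s.amount)"
    ),
}


def run_all(spark):
    """Run each statement through an existing SparkSession via spark.sql."""
    for sql in BATCH_STATEMENTS.values():
        spark.sql(sql)
```

Keeping the statements as plain strings makes it easy to reuse the same set against each table type listed in the subtasks.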

Task list (link each sub-issue to its subtask):

  • Design Docs #174

Some questions should be settled in the design docs:

  i. Pos-delete or eq-delete for unkeyed tables?
  ii. Should batch updates write to the log store?
  iii. Should the insert command write to the base store or the change store?
  iv. Should we keep the Hive-compatible visible order for written snapshots (with Flink writes)?

Implement the insert command first, as the top priority (in the next milestone).

  • Support insert for iceberg base store keyed table
  • Support insert for hive base store keyed table
  • Support insert for hive base store unkeyed table (high priority)

Support more batch update commands in the next few milestones.

  • Support delete for iceberg base store keyed table

  • Support delete for hive base store keyed table

  • Support delete for hive base store unkeyed table

  • Support update for iceberg base store keyed table

  • Support update for hive base store keyed table

  • Support update for hive base store unkeyed table

  • Support merge-into for iceberg base store keyed table

  • Support merge-into for hive base store keyed table

  • Support merge-into for hive base store unkeyed table

Finally, rewrite the code for the iceberg base store unkeyed table to keep all table types logically consistent.

  • Rewrite code for iceberg base store unkeyed table.

baiyangtx, Aug 16 '22 02:08