[Feature][Flink] Introducing the INSERT OVERWRITE statement for mixed-streaming format tables.

Open YesOrNo828 opened this issue 3 years ago • 2 comments

Search before asking

  • [x] I have searched in the issues and found no similar issues.

What would you like to be improved?

Currently, the INSERT OVERWRITE statement is supported only for mixed-streaming format tables that have no primary key. To make the Flink engine's batch processing capability available on keyed tables, the statement should also be supported for mixed-streaming format tables with a primary key.

Mixed-streaming format tables should include mixed-iceberg and mixed-hive format tables.

INSERT OVERWRITE [catalog_name.][db_name.]table_name [column_list] select_statement

column_list:
  (col_name1 [, column_name2, ...])

OVERWRITE

INSERT OVERWRITE overwrites any existing data in the table or partition. Without OVERWRITE (a plain INSERT INTO), new data is appended.
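
To make the difference between the two modes concrete, here is a minimal in-memory sketch. It is only a model of the semantics described above, not Amoro or Flink code: `ToyTable`, its methods, and the partition names are hypothetical, and it assumes the write targets one explicitly named partition.

```python
from collections import defaultdict


class ToyTable:
    """In-memory stand-in for a partitioned table (hypothetical, not a real API)."""

    def __init__(self):
        # Maps a partition name to the list of rows stored in it.
        self.partitions = defaultdict(list)

    def insert(self, partition, rows, overwrite=False):
        if overwrite:
            # INSERT OVERWRITE: discard whatever the target partition held.
            self.partitions[partition] = list(rows)
        else:
            # INSERT INTO: keep the existing rows and append the new ones.
            self.partitions[partition].extend(rows)

    def scan(self, partition):
        # Return a copy so callers cannot mutate the stored rows.
        return list(self.partitions[partition])


table = ToyTable()
table.insert("dt=2022-07-18", [(1, "a")])
table.insert("dt=2022-07-19", [(2, "b")])

table.insert("dt=2022-07-19", [(3, "c")])                  # append
table.insert("dt=2022-07-18", [(9, "z")], overwrite=True)  # overwrite
```

After these calls the `dt=2022-07-19` partition holds both rows, while `dt=2022-07-18` holds only the row written by the overwrite. Partitions that the overwrite does not target are left untouched in this model.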

COLUMN LIST

Given a table T(a INT, b INT, c INT), Flink supports INSERT INTO T(c, b) SELECT x, y FROM S. The expectation is that ‘x’ is written to column ‘c’ and ‘y’ is written to column ‘b’ and ‘a’ is set to NULL (assuming column ‘a’ is nullable).
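
The column-list expectation can be sketched as a small mapping function. This is an illustrative model only: `map_columns` and its parameters are hypothetical names, and the NOT NULL check reflects the "assuming column 'a' is nullable" condition rather than any specific Flink error.

```python
def map_columns(schema, column_list, values, nullable):
    """Expand a row written through a column list into a full-width row."""
    if len(column_list) != len(values):
        raise ValueError("column list and SELECT output differ in width")
    unknown = [col for col in column_list if col not in schema]
    if unknown:
        raise ValueError(f"unknown columns in column list: {unknown}")

    # Values are matched to the column list by position, not by name.
    supplied = dict(zip(column_list, values))

    row = {}
    for col in schema:
        if col in supplied:
            row[col] = supplied[col]
        elif col in nullable:
            # Columns left out of the column list are filled with NULL.
            row[col] = None
        else:
            raise ValueError(f"column {col!r} is NOT NULL and has no value")
    return row


# T(a, b, c) with INSERT INTO T(c, b) SELECT x, y FROM S
row = map_columns(
    schema=["a", "b", "c"],
    column_list=["c", "b"],
    values=("x_val", "y_val"),
    nullable={"a", "b", "c"},
)
```

Here `row` ends up with `x_val` in column `c`, `y_val` in column `b`, and `None` in column `a`, matching the expectation stated above.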

How should we improve?

The Flink table sink for mixed-streaming format tables should implement Flink's SupportsOverwrite interface.

This feature works only in Flink batch runtime mode.
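
As a rough model of that contract: Flink's Java interface `SupportsOverwrite` declares a single method, `applyOverwrite(boolean overwrite)`, through which the planner tells the sink whether the statement was INSERT OVERWRITE. The Python sketch below only mirrors that shape. `ToyOverwriteSink`, `write`, and the `bounded` flag are hypothetical, and rejecting overwrite at write time for unbounded input is an assumption about how the batch-only restriction might be enforced, not a description of the actual implementation.

```python
class ToyOverwriteSink:
    """Hypothetical sink model; only the apply_overwrite shape mirrors Flink."""

    def __init__(self):
        # Sinks default to append behaviour until told otherwise.
        self.overwrite = False

    def apply_overwrite(self, overwrite):
        # Mirrors SupportsOverwrite#applyOverwrite(boolean): record the flag.
        self.overwrite = overwrite

    def write(self, existing_rows, new_rows, bounded):
        # Assumption: overwrite is rejected when the input is unbounded,
        # i.e. when the job is not running in batch mode.
        if self.overwrite and not bounded:
            raise RuntimeError("INSERT OVERWRITE requires batch (bounded) execution")
        if self.overwrite:
            return list(new_rows)
        return list(existing_rows) + list(new_rows)


sink = ToyOverwriteSink()
sink.apply_overwrite(True)
batch_result = sink.write(existing_rows=[1, 2], new_rows=[3], bounded=True)
```

In this model a batch write with the flag set replaces the existing rows, a write without the flag appends, and an overwrite against unbounded input raises an error.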

Affected Flink versions: 1.12, 1.14, and 1.15.

Are you willing to submit PR?

  • [ ] Yes I am willing to submit a PR!

Subtasks

No response

Code of Conduct

  • [x] I agree to follow this project's [Code of Conduct](https://www.apache.org/foundation/policies/conduct)

YesOrNo828 · Jul 19 '22 03:07

@xujiangfeng001 I wonder if the work is still moving forward?

czy006 · Mar 20 '24 12:03

> @xujiangfeng001 I wonder if the work is still moving forward?

Hi @czy006, I'm very sorry, I haven't had time to keep working on this issue recently. Could you help me push it forward?

xujiangfeng001 · Mar 25 '24 10:03