amoro
Apache Amoro (incubating) is a Lakehouse management system built on open data lake formats.
To support native writes and alters to the Hive table, we need a meta synchronizer that syncs metadata changes and data changes to the Iceberg table in the base store. Subtask of #38.
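A minimal sketch of what such a synchronizer's contract could look like. The names (`HiveMetaSynchronizer`, `syncMetaChanges`, `syncDataChanges`) are hypothetical illustrations, not Amoro's actual API:

```scala
// Hypothetical sketch, not Amoro's actual API: a synchronizer that replays
// Hive-side changes onto the Iceberg table in the base store.
trait HiveMetaSynchronizer {
  /** Sync schema and partition-spec changes from the Hive table to the base-store Iceberg table. */
  def syncMetaChanges(): Unit

  /** Sync data files written natively to the Hive table into the base-store Iceberg table. */
  def syncDataChanges(): Unit
}
```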
Scenario: create an Iceberg table that has a timestamp-type column with precision 0 via Flink SQL. Use the DDL below to create a table with a watermark definition, like...
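A minimal sketch of such a DDL, with hypothetical table, column, and connector settings (the Iceberg Flink connector options shown are illustrative); the `TIMESTAMP(0)` column is what exercises this scenario:

```scala
import org.apache.flink.table.api.{EnvironmentSettings, TableEnvironment}

val tEnv = TableEnvironment.create(EnvironmentSettings.inStreamingMode())
tEnv.executeSql(
  """CREATE TABLE events (
    |  id       BIGINT,
    |  created  TIMESTAMP(0),  -- timestamp column with precision 0
    |  event_ts TIMESTAMP(3),
    |  WATERMARK FOR event_ts AS event_ts - INTERVAL '5' SECOND
    |) WITH (
    |  'connector'    = 'iceberg',
    |  'catalog-name' = 'hive_catalog',
    |  'catalog-type' = 'hive',
    |  'uri'          = 'thrift://xxxx:9083',
    |  'warehouse'    = 'hdfs://xxxx/warehouse'
    |)""".stripMargin)
```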
Add Spark conf:

```scala
conf.set("arctic.catalog.url", "thrift://xxxx/catalog_name")
conf.set("arctic.resolve-all-identifier", "true")
conf.set("spark.sql.extensions", "com.netease.arctic.spark.ArcticSparkSessionExtensions")
```

Note: new Spark conf:
* `arctic.catalog.url`: the default Arctic catalog URL used to look up table metadata.
* `arctic.resolve-all-identifier`...
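For context, a sketch of wiring the same settings in at session-build time (the catalog URL is a placeholder, and the app name is illustrative):

```scala
import org.apache.spark.sql.SparkSession

// Sketch: the conf above applied via the SparkSession builder.
val spark = SparkSession.builder()
  .appName("arctic-example")
  .config("arctic.catalog.url", "thrift://xxxx/catalog_name")
  .config("arctic.resolve-all-identifier", "true")
  .config("spark.sql.extensions", "com.netease.arctic.spark.ArcticSparkSessionExtensions")
  .getOrCreate()
```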
```scala
df.write.format("arctic").save("xxxx")
df.write.format("arctic").mode("overwrite").save("xxx")
val df = spark.read.format("arctic").load("xxx")
```

Keep the same behavior as with Spark 3.
Support the v1 DataFrame API:

```scala
df.write.format("arctic").save("xxxx")
df.write.format("arctic").mode("overwrite").save("xxx")
val df = spark.read.format("arctic").load("xxx")
```

Support the v2 DataFrame API:

```scala
df.writeTo("xxx").append()
df.writeTo("xxx").overwritePartitions()
val df = spark.read.table("xxx")
```
Support INSERT INTO SQL and DataFrame append for keyed tables. Both the INSERT INTO and the append action on a keyed table append rows to the keyed table's **change store**.
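A minimal sketch of the two append paths, assuming a hypothetical keyed table `db.keyed_table` with columns `id` and `data`:

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().getOrCreate()

// SQL path: inserted rows land in the keyed table's change store.
spark.sql("INSERT INTO db.keyed_table VALUES (1, 'a')")

// DataFrame path: append mode also writes to the change store.
val df = spark.createDataFrame(Seq((2, "b"))).toDF("id", "data")
df.write.format("arctic").mode("append").save("db.keyed_table")
```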
FIX #459

## Why are the changes needed?

Projection pushdown may drop the primary key and partition key columns. ShuffleHelper then loses the required info and produces an NPE. This...
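A generic sketch of the defensive pattern (not Amoro's actual code): union the requested columns with the key columns before pruning, so shuffle logic never dereferences a missing field:

```scala
// Generic sketch, not Amoro's actual code: ensure primary and partition key
// columns survive projection pushdown alongside the requested columns.
def pruneColumns(requested: Seq[String],
                 primaryKeys: Seq[String],
                 partitionKeys: Seq[String]): Seq[String] =
  (requested ++ primaryKeys ++ partitionKeys).distinct
```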