[spark] Spark 3.3 throws write exceptions during partial-update engine operations
Purpose
Linked issue: https://github.com/apache/paimon/issues/5932
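Fixes a bug where Spark 3.3 throws a write exception when inserting into a table that uses the partial-update merge engine. The following table definition and partial-column insert reproduce it: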
CREATE TABLE T(
f1 int,
f2 string,
f3 string,
f4 string)
TBLPROPERTIES (
'bucket'='1',
'primary-key'='f1',
'write-only'='true',
'merge-engine'='partial-update',
'file.format'='parquet',
'file.compression' = 'zstd'
)
INSERT INTO T(f1, f4) VALUES(1, 'test')
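As a rough illustration of the result this insert is expected to produce, the sketch below models partial-update merging in plain Scala. It assumes that, for rows sharing a primary key, a non-null incoming field replaces the stored one and a null incoming field leaves the stored value untouched. `TRow`, `PartialUpdateSketch`, `merge`, and `applyAll` are hypothetical names used only for illustration; this is not Paimon's implementation and does not reproduce the Spark 3.3 write path where the exception occurs.

```scala
// Hypothetical in-memory model of partial-update merging; not Paimon code.
// Each row mirrors table T: primary key f1 plus three nullable string fields,
// with None standing in for SQL NULL.
final case class TRow(f1: Int, f2: Option[String], f3: Option[String], f4: Option[String])

object PartialUpdateSketch {
  // Merge an incoming row into the stored row with the same key:
  // a field supplied by the incoming row wins, otherwise the stored value is kept.
  def merge(existing: TRow, incoming: TRow): TRow = {
    require(existing.f1 == incoming.f1, "rows must share the primary key")
    TRow(
      existing.f1,
      incoming.f2.orElse(existing.f2),
      incoming.f3.orElse(existing.f3),
      incoming.f4.orElse(existing.f4))
  }

  // Apply a sequence of writes in arrival order, keyed by f1.
  def applyAll(writes: Seq[TRow]): Map[Int, TRow] =
    writes.foldLeft(Map.empty[Int, TRow]) { (state, row) =>
      val merged = state.get(row.f1).map(merge(_, row)).getOrElse(row)
      state.updated(row.f1, merged)
    }
}
```

Under these assumptions, a single write of `TRow(1, None, None, Some("test"))` leaves f2 and f3 unset, which corresponds to the `Row(1, null, null, "test")` that the test below checks for. A later write of `TRow(1, Some("a"), None, None)` would fill in f2 while keeping f4.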
Tests
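// Create a partial-update table with the same properties as in the reproduction above.
spark.sql(
  "CREATE TABLE T( f1 int, f2 string, f3 string, f4 string)\n" +
    "TBLPROPERTIES (\n" +
    " 'bucket'='1',\n" +
    " 'primary-key'='f1',\n" +
    " 'write-only'='true',\n" +
    " 'merge-engine'='partial-update',\n" +
    " 'file.format'='parquet',\n" +
    " 'file.compression' = 'zstd'\n" +
    ")")
// Write only the primary key and f4; f2 and f3 are omitted from the column list.
spark.sql("INSERT INTO T(f1, f4) VALUES(1, 'test')")
// The omitted columns should read back as null.
checkAnswer(
  spark.sql("SELECT * FROM T"),
  Row(1, null, null, "test")
)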
API and Format
Documentation
Hi @Aiden-Dong , I ran this test in current master, works fine. Can you just add this test?
Hi, to confirm - this partial column update test works fine for you on master/spark-3.3? I encountered issues with it on master recently, so just double-checking before I add the test.