doris-spark-connector
Spark 3.3.0 support
Proposed changes
- Support Spark 3.3.0: removed log4j 1.x in favor of Spark's Logging trait, which uses log4j 2.x in Spark 3.3.0. This change does not break compatibility with older Spark versions. The code changes are in ScalaValueReader.scala.
- Close BufferedReader in DorisStreamLoad: the BufferedReader used to read the Doris BE REST API response should be closed in the loadBatch function of DorisStreamLoad.
- Change spark.minor.version to spark.major.version: in pom.xml, the property spark.minor.version actually holds the Spark major version.
- Include Scala code in the source jar: changes to the scala-maven-plugin configuration in pom.xml.
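The BufferedReader fix listed above can be illustrated with a minimal sketch. It is not the actual loadBatch code: the class name ResponseReader, the method readResponse, and the sample response body are hypothetical, and the sketch only assumes that the response arrives as an InputStream. A try-with-resources block closes the reader (and the stream it wraps) even if reading throws.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of the connector: shows one way to make sure
// the reader over an HTTP response body is always closed.
class ResponseReader {

    // Reads the whole stream into a String, dropping line terminators.
    // try-with-resources closes the BufferedReader, which in turn closes
    // the InputStreamReader and the underlying InputStream.
    static String readResponse(InputStream in) throws IOException {
        StringBuilder sb = new StringBuilder();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(in, StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                sb.append(line);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        // Made-up response body, used only to exercise the helper.
        InputStream body = new ByteArrayInputStream(
                "{\"Status\":\"Success\"}".getBytes(StandardCharsets.UTF_8));
        System.out.println(readResponse(body));
    }
}
```

Compared with calling close() by hand in a finally block, try-with-resources keeps the cleanup next to the allocation and still runs it when readLine throws.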
Issue Number: close #xxx
Problem Summary:
This PR upgrades the code to support Spark 3.3.0 and makes other minor changes.
Checklist(Required)
- Does it affect the original behavior: (Yes/No/I Don't know) No
- Has unit tests been added: (Yes/No/No Need) No unit tests were added, but the change was tested manually in the spark-sql CLI.
- Has document been added or modified: (Yes/No/No Need)
- Does it need to update dependencies: (Yes/No)
- Are there any changes that cannot be rolled back: (Yes/No)
Further comments
If this is a relatively large or complex change, kick off the discussion at [email protected] by explaining why you chose the solution you did and what alternatives you considered, etc...
Test results:
Versions:
- spark-3.3.0-bin-hadoop3
- JDK 1.8
- Truncate the Doris table from the CLI
- Create a Spark view and insert data into the Doris table. Start the spark-sql CLI in local mode and execute:
CREATE TEMPORARY VIEW spark_doris
USING doris
OPTIONS(
"table.identifier"="zjc_1.table_hash",
"fenodes"="localhost:8030",
"user"="zjc",
"password"="******"
);
insert into spark_doris select 5,15.0;
- Check the data in Doris
- Select the data in spark-sql
select * from spark_doris;
How to build spark-doris-connector for Spark 3.3.0
Run the command:
sh build.sh --spark 3.3.0 --scala 2.12
Please modify the spark.minor.version name in the build.sh script
Hello, thank you for your contribution. Can you resolve the conflict?
Hi, any further plans or progress on this PR?
Also, it seems that not all of the listed changes are about introducing Spark 3.3 support; they would be better split into several smaller PRs.