kyuubi
[TASK][EASY] Fix compatibility with Apache Spark master branch
What's the level of this task?
EASY
Code of Conduct
- [X] I agree to follow this project's Code of Conduct
Search before creating
- [X] I have searched in the task list and found no similar tasks.
Mentor
- [X] I have sufficient expertise on this task, and I volunteer to be a mentor of this task to guide contributors through the task.
Skill requirements
- Knowledge on GitHub Actions, Maven, Spark
Background and Goals
Kyuubi has daily testing against the Spark master branch, but unfortunately it has been failing for a while.
Implementation steps
Check and fix the daily test with the Apache Spark master branch https://github.com/apache/kyuubi/actions/workflows/nightly.yml
Additional context
Introduction of 2024H1 Kyuubi Code Contribution Program
Please assign this task to me; I will try to solve it.
@ShockleysxX thanks, assigned
Revoked, as you have already been assigned another task.
Based on the log displayed in CI, I ran the same suite locally:
./build/mvn clean install -Pscala-2.13 -Pspark-master -pl extensions/spark/kyuubi-spark-lineage -am -Dmaven.javadoc.skip=true -V -Dtest=none -DwildcardSuites=org.apache.kyuubi.plugin.lineage.helper.SparkSQLLineageParserHelperSuite
and got the same result:
Discovery starting.
Discovery completed in 810 milliseconds.
Run starting. Expected test count is: 33
SparkSQLLineageParserHelperSuite:
*** RUN ABORTED ***
java.lang.NoClassDefFoundError: jakarta/servlet/Servlet
at org.apache.spark.metrics.sink.MetricsServlet.getHandlers(MetricsServlet.scala:50)
at org.apache.spark.metrics.MetricsSystem.$anonfun$getServletHandlers$2(MetricsSystem.scala:91)
at scala.Option.map(Option.scala:242)
at org.apache.spark.metrics.MetricsSystem.getServletHandlers(MetricsSystem.scala:91)
at org.apache.spark.SparkContext.<init>(SparkContext.scala:686)
at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2937)
at org.apache.spark.sql.SparkSession$Builder.$anonfun$getOrCreate$2(SparkSession.scala:1117)
at scala.Option.getOrElse(Option.scala:201)
at org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:1111)
at org.apache.spark.sql.SparkListenerExtensionTest.spark(SparkListenerExtensionTest.scala:42)
...
Cause: java.lang.ClassNotFoundException: jakarta.servlet.Servlet
at java.base/jdk.internal.loader.BuiltinClassLoader.loadClass(BuiltinClassLoader.java:641)
at java.base/jdk.internal.loader.ClassLoaders$AppClassLoader.loadClass(ClassLoaders.java:188)
at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:520)
at org.apache.spark.metrics.sink.MetricsServlet.getHandlers(MetricsServlet.scala:50)
at org.apache.spark.metrics.MetricsSystem.$anonfun$getServletHandlers$2(MetricsSystem.scala:91)
at scala.Option.map(Option.scala:242)
at org.apache.spark.metrics.MetricsSystem.getServletHandlers(MetricsSystem.scala:91)
at org.apache.spark.SparkContext.<init>(SparkContext.scala:686)
at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2937)
at org.apache.spark.sql.SparkSession$Builder.$anonfun$getOrCreate$2(SparkSession.scala:1117)
...
[INFO] ------------------------------------------------------------------------
[INFO] Reactor Summary for Kyuubi Project Parent 1.10.0-SNAPSHOT:
[INFO]
[INFO] Kyuubi Project Parent .............................. SUCCESS [ 5.595 s]
[INFO] Kyuubi Project Util ................................ SUCCESS [ 3.653 s]
[INFO] Kyuubi Project Util Scala .......................... SUCCESS [ 21.428 s]
[INFO] Kyuubi Project Common .............................. SUCCESS [ 50.013 s]
[INFO] Kyuubi Project Events .............................. SUCCESS [ 13.887 s]
[INFO] Kyuubi Dev Spark Lineage Extension ................. FAILURE [ 19.960 s]
[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
I added <jakarta.servlet-api.version>5.0.0</jakarta.servlet-api.version> to the spark-master profile, matching the value pinned by Spark (https://github.com/apache/spark/blob/b299b2bc06a91db630ab39b9c35663342931bb56/pom.xml#L147).
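Concretely, the property addition could look roughly like the sketch below in the spark-master profile of Kyuubi's root pom.xml; the exact profile layout in Kyuubi's pom may differ, so treat this as illustrative only:

```xml
<profile>
  <id>spark-master</id>
  <properties>
    <!-- Match the jakarta servlet API line used by the Spark master branch,
         which has migrated off the javax.servlet namespace -->
    <jakarta.servlet-api.version>5.0.0</jakarta.servlet-api.version>
  </properties>
</profile>
```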
This surfaced a new error:
Discovery starting.
Discovery completed in 883 milliseconds.
Run starting. Expected test count is: 33
SparkSQLLineageParserHelperSuite:
ANTLR Tool version 4.13.1 used for code generation does not match the current runtime version 4.9.3
ANTLR Runtime version 4.13.1 used for parser compilation does not match the current runtime version 4.9.3
*** RUN ABORTED ***
java.lang.ExceptionInInitializerError:
at org.apache.spark.sql.catalyst.parser.AbstractParser.parse(parsers.scala:58)
at org.apache.spark.sql.execution.SparkSqlParser.parse(SparkSqlParser.scala:55)
at org.apache.spark.sql.catalyst.parser.AbstractSqlParser.parsePlan(AbstractSqlParser.scala:68)
at org.apache.spark.sql.SparkSession.$anonfun$sql$5(SparkSession.scala:701)
at org.apache.spark.sql.catalyst.QueryPlanningTracker.measurePhase(QueryPlanningTracker.scala:138)
at org.apache.spark.sql.SparkSession.$anonfun$sql$4(SparkSession.scala:700)
at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:918)
at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:699)
at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:730)
at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:761)
...
Cause: java.lang.UnsupportedOperationException: java.io.InvalidClassException: org.antlr.v4.runtime.atn.ATN; Could not deserialize ATN with version 4 (expected 3).
at org.antlr.v4.runtime.atn.ATNDeserializer.deserialize(ATNDeserializer.java:187)
at org.apache.spark.sql.catalyst.parser.SqlBaseLexer.<clinit>(SqlBaseLexer.java:2949)
at org.apache.spark.sql.catalyst.parser.AbstractParser.parse(parsers.scala:58)
at org.apache.spark.sql.execution.SparkSqlParser.parse(SparkSqlParser.scala:55)
at org.apache.spark.sql.catalyst.parser.AbstractSqlParser.parsePlan(AbstractSqlParser.scala:68)
at org.apache.spark.sql.SparkSession.$anonfun$sql$5(SparkSession.scala:701)
at org.apache.spark.sql.catalyst.QueryPlanningTracker.measurePhase(QueryPlanningTracker.scala:138)
at org.apache.spark.sql.SparkSession.$anonfun$sql$4(SparkSession.scala:700)
at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:918)
at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:699)
...
Cause: java.io.InvalidClassException: org.antlr.v4.runtime.atn.ATN; Could not deserialize ATN with version 4 (expected 3).
at org.antlr.v4.runtime.atn.ATNDeserializer.deserialize(ATNDeserializer.java:187)
at org.apache.spark.sql.catalyst.parser.SqlBaseLexer.<clinit>(SqlBaseLexer.java:2949)
at org.apache.spark.sql.catalyst.parser.AbstractParser.parse(parsers.scala:58)
at org.apache.spark.sql.execution.SparkSqlParser.parse(SparkSqlParser.scala:55)
at org.apache.spark.sql.catalyst.parser.AbstractSqlParser.parsePlan(AbstractSqlParser.scala:68)
at org.apache.spark.sql.SparkSession.$anonfun$sql$5(SparkSession.scala:701)
at org.apache.spark.sql.catalyst.QueryPlanningTracker.measurePhase(QueryPlanningTracker.scala:138)
at org.apache.spark.sql.SparkSession.$anonfun$sql$4(SparkSession.scala:700)
at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:918)
at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:699)
...
[INFO] ------------------------------------------------------------------------
[INFO] Reactor Summary for Kyuubi Project Parent 1.10.0-SNAPSHOT:
[INFO]
[INFO] Kyuubi Project Parent .............................. SUCCESS [ 5.822 s]
[INFO] Kyuubi Project Util ................................ SUCCESS [ 7.390 s]
[INFO] Kyuubi Project Util Scala .......................... SUCCESS [ 24.927 s]
[INFO] Kyuubi Project Common .............................. SUCCESS [01:30 min]
[INFO] Kyuubi Project Events .............................. SUCCESS [ 32.613 s]
[INFO] Kyuubi Dev Spark Lineage Extension ................. FAILURE [ 18.767 s]
[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
Continuing, I added <antlr4.version>4.13.1</antlr4.version> to the spark-master profile as well, matching Spark's value (https://github.com/apache/spark/blob/b299b2bc06a91db630ab39b9c35663342931bb56/pom.xml#L210).
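The ANTLR override could be sketched the same way; again, the profile structure shown is illustrative, not necessarily Kyuubi's actual pom layout:

```xml
<profile>
  <id>spark-master</id>
  <properties>
    <!-- Align the ANTLR runtime with the 4.13.1 tool version Spark master
         used for code generation, replacing the older 4.9.3 runtime that
         cannot deserialize the newer ATN format -->
    <antlr4.version>4.13.1</antlr4.version>
  </properties>
</profile>
```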
After that, the build and tests pass in this module:
[INFO] Reactor Summary for Kyuubi Project Parent 1.10.0-SNAPSHOT:
[INFO]
[INFO] Kyuubi Project Parent .............................. SUCCESS [ 5.369 s]
[INFO] Kyuubi Project Util ................................ SUCCESS [ 3.648 s]
[INFO] Kyuubi Project Util Scala .......................... SUCCESS [ 14.716 s]
[INFO] Kyuubi Project Common .............................. SUCCESS [ 51.174 s]
[INFO] Kyuubi Project Events .............................. SUCCESS [ 14.505 s]
[INFO] Kyuubi Dev Spark Lineage Extension ................. SUCCESS [ 44.672 s]
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
However, running
./build/mvn clean install -Pscala-2.13 -Pspark-master -pl externals/kyuubi-spark-sql-engine -am -Dmaven.javadoc.skip=true -V -Dtest=none -DskipTests
fails to build:
[INFO] compiling 59 Scala sources to /home/liuxiao/Code/JavaProjects/kyuubi/externals/kyuubi-spark-sql-engine/target/scala-2.13/classes ...
[ERROR] [Error] /home/liuxiao/Code/JavaProjects/kyuubi/externals/kyuubi-spark-sql-engine/src/main/scala/org/apache/kyuubi/engine/spark/operation/ExecutePython.scala:27: object ws is not a member of package javax
[ERROR] [Error] /home/liuxiao/Code/JavaProjects/kyuubi/externals/kyuubi-spark-sql-engine/src/main/scala/org/apache/kyuubi/engine/spark/operation/ExecutePython.scala:307: not found: value UriBuilder
[ERROR] [Error] /home/liuxiao/Code/JavaProjects/kyuubi/externals/kyuubi-spark-sql-engine/src/main/scala/org/apache/kyuubi/engine/spark/operation/ExecutePython.scala:320: not found: value UriBuilder
[ERROR] [Error] /home/liuxiao/Code/JavaProjects/kyuubi/externals/kyuubi-spark-sql-engine/src/main/scala/org/apache/spark/ui/EnginePage.scala:80: type mismatch;
found : HttpServletRequest (in javax.servlet.http)
required: HttpServletRequest (in jakarta.servlet.http)
[ERROR] [Error] /home/liuxiao/Code/JavaProjects/kyuubi/externals/kyuubi-spark-sql-engine/src/main/scala/org/apache/spark/ui/EnginePage.scala:135: type mismatch;
found : HttpServletRequest (in javax.servlet.http)
required: HttpServletRequest (in jakarta.servlet.http)
[ERROR] [Error] /home/liuxiao/Code/JavaProjects/kyuubi/externals/kyuubi-spark-sql-engine/src/main/scala/org/apache/spark/ui/EnginePage.scala:258: type mismatch;
found : HttpServletRequest (in javax.servlet.http)
required: HttpServletRequest (in jakarta.servlet.http)
[ERROR] [Error] /home/liuxiao/Code/JavaProjects/kyuubi/externals/kyuubi-spark-sql-engine/src/main/scala/org/apache/spark/ui/EnginePage.scala:341: type mismatch;
found : HttpServletRequest (in javax.servlet.http)
required: HttpServletRequest (in jakarta.servlet.http)
[ERROR] [Error] /home/liuxiao/Code/JavaProjects/kyuubi/externals/kyuubi-spark-sql-engine/src/main/scala/org/apache/spark/ui/EnginePage.scala:421: type mismatch;
found : HttpServletRequest (in javax.servlet.http)
required: HttpServletRequest (in jakarta.servlet.http)
[ERROR] [Error] /home/liuxiao/Code/JavaProjects/kyuubi/externals/kyuubi-spark-sql-engine/src/main/scala/org/apache/spark/ui/EnginePage.scala:510: type mismatch;
found : HttpServletRequest (in javax.servlet.http)
required: HttpServletRequest (in jakarta.servlet.http)
[ERROR] [Error] /home/liuxiao/Code/JavaProjects/kyuubi/externals/kyuubi-spark-sql-engine/src/main/scala/org/apache/spark/ui/EnginePage.scala:547: type mismatch;
found : HttpServletRequest (in javax.servlet.http)
required: HttpServletRequest (in jakarta.servlet.http)
[ERROR] [Error] /home/liuxiao/Code/JavaProjects/kyuubi/externals/kyuubi-spark-sql-engine/src/main/scala/org/apache/spark/ui/EngineSessionPage.scala:101: type mismatch;
found : HttpServletRequest (in javax.servlet.http)
required: HttpServletRequest (in jakarta.servlet.http)
[ERROR] [Error] /home/liuxiao/Code/JavaProjects/kyuubi/externals/kyuubi-spark-sql-engine/src/main/scala/org/apache/spark/ui/EngineSessionPage.scala:234: type mismatch;
found : HttpServletRequest (in javax.servlet.http)
required: HttpServletRequest (in jakarta.servlet.http)
[ERROR] 12 errors found
[INFO] ------------------------------------------------------------------------
[INFO] Reactor Summary for Kyuubi Project Parent 1.10.0-SNAPSHOT:
[INFO]
[INFO] Kyuubi Project Parent .............................. SUCCESS [ 4.661 s]
[INFO] Kyuubi Project Util ................................ SUCCESS [ 3.557 s]
[INFO] Kyuubi Project Util Scala .......................... SUCCESS [ 21.494 s]
[INFO] Kyuubi Project Common .............................. SUCCESS [ 52.340 s]
[INFO] Kyuubi Project Embedded Zookeeper .................. SUCCESS [ 9.851 s]
[INFO] Kyuubi Project High Availability ................... SUCCESS [ 25.424 s]
[INFO] Kyuubi Project Events .............................. SUCCESS [ 14.723 s]
[INFO] Kyuubi Dev Spark Lineage Extension ................. SUCCESS [ 27.378 s]
[INFO] Kyuubi Project Hive JDBC Client .................... SUCCESS [ 13.132 s]
[INFO] Kyuubi Project Engine Spark SQL .................... FAILURE [ 8.426 s]
[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
It seems the javax and jakarta namespaces now conflict between Spark 4.x and the Kyuubi codebase? @pan3793 any suggestions?
You may find a solution in https://github.com/apache/spark/pull/45154
Hi @pan3793, I have looked through what was done in https://github.com/apache/spark/pull/45154, which mainly replaces the javax namespace with jakarta in Spark master, and that is indeed the likely root cause of this CI error.
I made some changes locally and found that the remaining compilation issues are mainly in the web-related code of the Kyuubi Project Engine Spark SQL module.
In several places the module uses classes such as javax.servlet.http.HttpServletRequest, which no longer compiles against the latest Spark, whose web-related classes now live under the jakarta namespace.
But after switching those usages to jakarta, the CI jobs for other Spark versions fail instead.
It seems hard to satisfy both javax and jakarta in the current setup and pass all CI, because Kyuubi uses classes such as WebUIPage/UIUtils from the Spark WebUI, which themselves import classes from the corresponding namespace.
We did a similar thing for Flink previously.
Suppose there are incompatible classes in different versions:

package old
class Old {
  OldR method()
}

package new
class New {
  NewR method()
}

Then we can introduce a Shim class to handle it:

package shim
class Shim {
  R method() {
    // dynamically bind the runtime class using reflection
  }
}
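The shim idea above can be sketched in plain Java. The names below (ShimDemo, firstAvailable) are illustrative, not actual Kyuubi code, and JDK classes stand in for the jakarta/javax servlet classes so the sketch runs anywhere:

```java
import java.lang.reflect.Method;

public class ShimDemo {
    // Resolve whichever candidate class is on the classpath at runtime,
    // so one binary can work against either the jakarta or the javax build.
    static Class<?> firstAvailable(String... names) throws ClassNotFoundException {
        for (String name : names) {
            try {
                return Class.forName(name);
            } catch (ClassNotFoundException ignored) {
                // try the next candidate
            }
        }
        throw new ClassNotFoundException(String.join(", ", names));
    }

    public static void main(String[] args) throws Exception {
        // A real shim would try "jakarta.servlet.http.HttpServletRequest"
        // first and fall back to "javax.servlet.http.HttpServletRequest".
        Class<?> cls = firstAvailable("no.such.Class", "java.lang.StringBuilder");

        // Invoke a method reflectively, as the Shim.method() sketch suggests.
        Object sb = cls.getConstructor().newInstance();
        Method append = cls.getMethod("append", String.class);
        append.invoke(sb, "ok");
        System.out.println(cls.getName() + ": " + sb);
    }
}
```

The cost of this pattern is losing compile-time type safety on the shimmed calls, which is why it is usually confined to a small shim layer.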
Thanks for your help, I'll take a look later~
@liuxiaocs7 do you have progress on this task? 4.0.0 preview version is on the way, I hope we can fix the compatibility before that.
Sorry for the late reply; yes, it's in progress.
Revoked as there has been no progress for a while. Spark 4.0.0-preview1 is on the way, so I'm working on this now.
Escalated the task level to medium. I have almost fixed it via the following PRs:
- #6392
- #6397
- #6398
- #6399
- #6404
- #6405
- #6413
- #6415
- #6416
- #6417
- #6424
- #6425
The daily test with the Apache Spark master branch is green, so closing as completed: https://github.com/apache/kyuubi/actions/workflows/nightly.yml