[TASK][EASY] Fix compatibility with Apache Spark master branch

Open pan3793 opened this issue 10 months ago • 10 comments

What's the level of this task?

EASY

Code of Conduct

Search before creating

  • [X] I have searched in the task list and found no similar tasks.

Mentor

  • [X] I have sufficient expertise on this task, and I volunteer to be a mentor of this task to guide contributors through the task.

Skill requirements

  • Knowledge of GitHub Actions, Maven, Spark

Background and Goals

Kyuubi runs a daily test against the Spark master branch, but unfortunately it has been failing for a while.

Implementation steps

Check and fix the daily test with the Apache Spark master branch https://github.com/apache/kyuubi/actions/workflows/nightly.yml

Additional context

Introduction of 2024H1 Kyuubi Code Contribution Program

pan3793 avatar Apr 03 '24 11:04 pan3793

Please assign this task to me, I will try to solve it.

ShockleysxX avatar Apr 03 '24 15:04 ShockleysxX

@ShockleysxX thanks, assigned

pan3793 avatar Apr 03 '24 15:04 pan3793

Revoked, as you have already been assigned another task.

pan3793 avatar Apr 07 '24 08:04 pan3793

According to the log displayed in CI:

image

I ran it locally:

./build/mvn clean install -Pscala-2.13 -Pspark-master -pl extensions/spark/kyuubi-spark-lineage -am -Dmaven.javadoc.skip=true -V -Dtest=none -DwildcardSuites=org.apache.kyuubi.plugin.lineage.helper.SparkSQLLineageParserHelperSuite

and got the same result:

Discovery starting.
Discovery completed in 810 milliseconds.
Run starting. Expected test count is: 33
SparkSQLLineageParserHelperSuite:
*** RUN ABORTED ***
  java.lang.NoClassDefFoundError: jakarta/servlet/Servlet
  at org.apache.spark.metrics.sink.MetricsServlet.getHandlers(MetricsServlet.scala:50)
  at org.apache.spark.metrics.MetricsSystem.$anonfun$getServletHandlers$2(MetricsSystem.scala:91)
  at scala.Option.map(Option.scala:242)
  at org.apache.spark.metrics.MetricsSystem.getServletHandlers(MetricsSystem.scala:91)
  at org.apache.spark.SparkContext.<init>(SparkContext.scala:686)
  at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2937)
  at org.apache.spark.sql.SparkSession$Builder.$anonfun$getOrCreate$2(SparkSession.scala:1117)
  at scala.Option.getOrElse(Option.scala:201)
  at org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:1111)
  at org.apache.spark.sql.SparkListenerExtensionTest.spark(SparkListenerExtensionTest.scala:42)
  ...
  Cause: java.lang.ClassNotFoundException: jakarta.servlet.Servlet
  at java.base/jdk.internal.loader.BuiltinClassLoader.loadClass(BuiltinClassLoader.java:641)
  at java.base/jdk.internal.loader.ClassLoaders$AppClassLoader.loadClass(ClassLoaders.java:188)
  at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:520)
  at org.apache.spark.metrics.sink.MetricsServlet.getHandlers(MetricsServlet.scala:50)
  at org.apache.spark.metrics.MetricsSystem.$anonfun$getServletHandlers$2(MetricsSystem.scala:91)
  at scala.Option.map(Option.scala:242)
  at org.apache.spark.metrics.MetricsSystem.getServletHandlers(MetricsSystem.scala:91)
  at org.apache.spark.SparkContext.<init>(SparkContext.scala:686)
  at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2937)
  at org.apache.spark.sql.SparkSession$Builder.$anonfun$getOrCreate$2(SparkSession.scala:1117)
  ...
[INFO] ------------------------------------------------------------------------
[INFO] Reactor Summary for Kyuubi Project Parent 1.10.0-SNAPSHOT:
[INFO]
[INFO] Kyuubi Project Parent .............................. SUCCESS [  5.595 s]
[INFO] Kyuubi Project Util ................................ SUCCESS [  3.653 s]
[INFO] Kyuubi Project Util Scala .......................... SUCCESS [ 21.428 s]
[INFO] Kyuubi Project Common .............................. SUCCESS [ 50.013 s]
[INFO] Kyuubi Project Events .............................. SUCCESS [ 13.887 s]
[INFO] Kyuubi Dev Spark Lineage Extension ................. FAILURE [ 19.960 s]
[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------

After adding <jakarta.servlet-api.version>5.0.0</jakarta.servlet-api.version> to the spark-master profile (https://github.com/apache/spark/blob/b299b2bc06a91db630ab39b9c35663342931bb56/pom.xml#L147), a new error appears:

Discovery starting.
Discovery completed in 883 milliseconds.
Run starting. Expected test count is: 33
SparkSQLLineageParserHelperSuite:
ANTLR Tool version 4.13.1 used for code generation does not match the current runtime version 4.9.3
ANTLR Runtime version 4.13.1 used for parser compilation does not match the current runtime version 4.9.3
*** RUN ABORTED ***
  java.lang.ExceptionInInitializerError:
  at org.apache.spark.sql.catalyst.parser.AbstractParser.parse(parsers.scala:58)
  at org.apache.spark.sql.execution.SparkSqlParser.parse(SparkSqlParser.scala:55)
  at org.apache.spark.sql.catalyst.parser.AbstractSqlParser.parsePlan(AbstractSqlParser.scala:68)
  at org.apache.spark.sql.SparkSession.$anonfun$sql$5(SparkSession.scala:701)
  at org.apache.spark.sql.catalyst.QueryPlanningTracker.measurePhase(QueryPlanningTracker.scala:138)
  at org.apache.spark.sql.SparkSession.$anonfun$sql$4(SparkSession.scala:700)
  at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:918)
  at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:699)
  at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:730)
  at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:761)
  ...
  Cause: java.lang.UnsupportedOperationException: java.io.InvalidClassException: org.antlr.v4.runtime.atn.ATN; Could not deserialize ATN with version 4 (expected 3).
  at org.antlr.v4.runtime.atn.ATNDeserializer.deserialize(ATNDeserializer.java:187)
  at org.apache.spark.sql.catalyst.parser.SqlBaseLexer.<clinit>(SqlBaseLexer.java:2949)
  at org.apache.spark.sql.catalyst.parser.AbstractParser.parse(parsers.scala:58)
  at org.apache.spark.sql.execution.SparkSqlParser.parse(SparkSqlParser.scala:55)
  at org.apache.spark.sql.catalyst.parser.AbstractSqlParser.parsePlan(AbstractSqlParser.scala:68)
  at org.apache.spark.sql.SparkSession.$anonfun$sql$5(SparkSession.scala:701)
  at org.apache.spark.sql.catalyst.QueryPlanningTracker.measurePhase(QueryPlanningTracker.scala:138)
  at org.apache.spark.sql.SparkSession.$anonfun$sql$4(SparkSession.scala:700)
  at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:918)
  at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:699)
  ...
  Cause: java.io.InvalidClassException: org.antlr.v4.runtime.atn.ATN; Could not deserialize ATN with version 4 (expected 3).
  at org.antlr.v4.runtime.atn.ATNDeserializer.deserialize(ATNDeserializer.java:187)
  at org.apache.spark.sql.catalyst.parser.SqlBaseLexer.<clinit>(SqlBaseLexer.java:2949)
  at org.apache.spark.sql.catalyst.parser.AbstractParser.parse(parsers.scala:58)
  at org.apache.spark.sql.execution.SparkSqlParser.parse(SparkSqlParser.scala:55)
  at org.apache.spark.sql.catalyst.parser.AbstractSqlParser.parsePlan(AbstractSqlParser.scala:68)
  at org.apache.spark.sql.SparkSession.$anonfun$sql$5(SparkSession.scala:701)
  at org.apache.spark.sql.catalyst.QueryPlanningTracker.measurePhase(QueryPlanningTracker.scala:138)
  at org.apache.spark.sql.SparkSession.$anonfun$sql$4(SparkSession.scala:700)
  at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:918)
  at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:699)
  ...
[INFO] ------------------------------------------------------------------------
[INFO] Reactor Summary for Kyuubi Project Parent 1.10.0-SNAPSHOT:
[INFO]
[INFO] Kyuubi Project Parent .............................. SUCCESS [  5.822 s]
[INFO] Kyuubi Project Util ................................ SUCCESS [  7.390 s]
[INFO] Kyuubi Project Util Scala .......................... SUCCESS [ 24.927 s]
[INFO] Kyuubi Project Common .............................. SUCCESS [01:30 min]
[INFO] Kyuubi Project Events .............................. SUCCESS [ 32.613 s]
[INFO] Kyuubi Dev Spark Lineage Extension ................. FAILURE [ 18.767 s]
[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------

After also adding <antlr4.version>4.13.1</antlr4.version> to the spark-master profile (https://github.com/apache/spark/blob/b299b2bc06a91db630ab39b9c35663342931bb56/pom.xml#L210), the build and tests pass in this module:

[INFO] Reactor Summary for Kyuubi Project Parent 1.10.0-SNAPSHOT:
[INFO]
[INFO] Kyuubi Project Parent .............................. SUCCESS [  5.369 s]
[INFO] Kyuubi Project Util ................................ SUCCESS [  3.648 s]
[INFO] Kyuubi Project Util Scala .......................... SUCCESS [ 14.716 s]
[INFO] Kyuubi Project Common .............................. SUCCESS [ 51.174 s]
[INFO] Kyuubi Project Events .............................. SUCCESS [ 14.505 s]
[INFO] Kyuubi Dev Spark Lineage Extension ................. SUCCESS [ 44.672 s]
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------

However, the build fails when running ./build/mvn clean install -Pscala-2.13 -Pspark-master -pl externals/kyuubi-spark-sql-engine -am -Dmaven.javadoc.skip=true -V -Dtest=none -DskipTests:

[INFO] compiling 59 Scala sources to /home/liuxiao/Code/JavaProjects/kyuubi/externals/kyuubi-spark-sql-engine/target/scala-2.13/classes ...
[ERROR] [Error] /home/liuxiao/Code/JavaProjects/kyuubi/externals/kyuubi-spark-sql-engine/src/main/scala/org/apache/kyuubi/engine/spark/operation/ExecutePython.scala:27: object ws is not a member of package javax
[ERROR] [Error] /home/liuxiao/Code/JavaProjects/kyuubi/externals/kyuubi-spark-sql-engine/src/main/scala/org/apache/kyuubi/engine/spark/operation/ExecutePython.scala:307: not found: value UriBuilder
[ERROR] [Error] /home/liuxiao/Code/JavaProjects/kyuubi/externals/kyuubi-spark-sql-engine/src/main/scala/org/apache/kyuubi/engine/spark/operation/ExecutePython.scala:320: not found: value UriBuilder
[ERROR] [Error] /home/liuxiao/Code/JavaProjects/kyuubi/externals/kyuubi-spark-sql-engine/src/main/scala/org/apache/spark/ui/EnginePage.scala:80: type mismatch;
 found   : HttpServletRequest (in javax.servlet.http)
 required: HttpServletRequest (in jakarta.servlet.http)
[ERROR] [Error] /home/liuxiao/Code/JavaProjects/kyuubi/externals/kyuubi-spark-sql-engine/src/main/scala/org/apache/spark/ui/EnginePage.scala:135: type mismatch;
 found   : HttpServletRequest (in javax.servlet.http)
 required: HttpServletRequest (in jakarta.servlet.http)
[ERROR] [Error] /home/liuxiao/Code/JavaProjects/kyuubi/externals/kyuubi-spark-sql-engine/src/main/scala/org/apache/spark/ui/EnginePage.scala:258: type mismatch;
 found   : HttpServletRequest (in javax.servlet.http)
 required: HttpServletRequest (in jakarta.servlet.http)
[ERROR] [Error] /home/liuxiao/Code/JavaProjects/kyuubi/externals/kyuubi-spark-sql-engine/src/main/scala/org/apache/spark/ui/EnginePage.scala:341: type mismatch;
 found   : HttpServletRequest (in javax.servlet.http)
 required: HttpServletRequest (in jakarta.servlet.http)
[ERROR] [Error] /home/liuxiao/Code/JavaProjects/kyuubi/externals/kyuubi-spark-sql-engine/src/main/scala/org/apache/spark/ui/EnginePage.scala:421: type mismatch;
 found   : HttpServletRequest (in javax.servlet.http)
 required: HttpServletRequest (in jakarta.servlet.http)
[ERROR] [Error] /home/liuxiao/Code/JavaProjects/kyuubi/externals/kyuubi-spark-sql-engine/src/main/scala/org/apache/spark/ui/EnginePage.scala:510: type mismatch;
 found   : HttpServletRequest (in javax.servlet.http)
 required: HttpServletRequest (in jakarta.servlet.http)
[ERROR] [Error] /home/liuxiao/Code/JavaProjects/kyuubi/externals/kyuubi-spark-sql-engine/src/main/scala/org/apache/spark/ui/EnginePage.scala:547: type mismatch;
 found   : HttpServletRequest (in javax.servlet.http)
 required: HttpServletRequest (in jakarta.servlet.http)
[ERROR] [Error] /home/liuxiao/Code/JavaProjects/kyuubi/externals/kyuubi-spark-sql-engine/src/main/scala/org/apache/spark/ui/EngineSessionPage.scala:101: type mismatch;
 found   : HttpServletRequest (in javax.servlet.http)
 required: HttpServletRequest (in jakarta.servlet.http)
[ERROR] [Error] /home/liuxiao/Code/JavaProjects/kyuubi/externals/kyuubi-spark-sql-engine/src/main/scala/org/apache/spark/ui/EngineSessionPage.scala:234: type mismatch;
 found   : HttpServletRequest (in javax.servlet.http)
 required: HttpServletRequest (in jakarta.servlet.http)
[ERROR] 12 errors found
[INFO] ------------------------------------------------------------------------
[INFO] Reactor Summary for Kyuubi Project Parent 1.10.0-SNAPSHOT:
[INFO]
[INFO] Kyuubi Project Parent .............................. SUCCESS [  4.661 s]
[INFO] Kyuubi Project Util ................................ SUCCESS [  3.557 s]
[INFO] Kyuubi Project Util Scala .......................... SUCCESS [ 21.494 s]
[INFO] Kyuubi Project Common .............................. SUCCESS [ 52.340 s]
[INFO] Kyuubi Project Embedded Zookeeper .................. SUCCESS [  9.851 s]
[INFO] Kyuubi Project High Availability ................... SUCCESS [ 25.424 s]
[INFO] Kyuubi Project Events .............................. SUCCESS [ 14.723 s]
[INFO] Kyuubi Dev Spark Lineage Extension ................. SUCCESS [ 27.378 s]
[INFO] Kyuubi Project Hive JDBC Client .................... SUCCESS [ 13.132 s]
[INFO] Kyuubi Project Engine Spark SQL .................... FAILURE [  8.426 s]
[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE

It seems the javax and jakarta namespaces now conflict between Spark 4.x and the Kyuubi codebase? @pan3793 any suggestions?

liuxiaocs7 avatar Apr 08 '24 03:04 liuxiaocs7

You may find a solution in https://github.com/apache/spark/pull/45154

pan3793 avatar Apr 08 '24 03:04 pan3793

Hi @pan3793, I have browsed through what was done in https://github.com/apache/spark/pull/45154, which mainly replaced the javax namespace with the jakarta namespace in Spark master, and that could indeed be the main cause of this CI error.

I tried to make some changes locally and found that there are currently some compilation issues, mainly in the web-related code under the Kyuubi Project Engine Spark SQL module.

In some places in the module, Kyuubi uses, for example, the javax.servlet.http.HttpServletRequest class, which is incompatible when compiling against the latest Spark, which uses web-related classes under the jakarta namespace.

But after changing it to jakarta, the CI runs for the other Spark versions in Kyuubi reported errors again.

It seems that javax and jakarta are difficult to reconcile in the current scenario so that all CI runs pass, because Kyuubi uses classes such as WebUIPage/UIUtils from the Spark WebUI, which also import classes from the corresponding namespace?
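
As a rough illustration of the conflict described above (not taken from the Kyuubi codebase), the following sketch probes which servlet namespace can be loaded from the current classpath. The class ServletNamespaceProbe and its methods are hypothetical names; only the two HttpServletRequest class names come from the compiler errors earlier in this thread.

```java
// Hypothetical helper, for illustration only: reports which servlet
// namespace (jakarta or javax) is loadable from the current classpath.
final class ServletNamespaceProbe {

    private ServletNamespaceProbe() {}

    /** Returns true if the named class can be loaded, without initializing it. */
    static boolean isPresent(String className) {
        try {
            Class.forName(className, false, ServletNamespaceProbe.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    /** Returns "jakarta", "javax", or "none", checking jakarta first. */
    static String detect() {
        if (isPresent("jakarta.servlet.http.HttpServletRequest")) {
            return "jakarta";
        }
        if (isPresent("javax.servlet.http.HttpServletRequest")) {
            return "javax";
        }
        return "none";
    }
}
```

A check like this only tells which API is available at runtime. Code that names either HttpServletRequest type directly in a method signature still compiles against only one of the two, which is why a plain rename cannot satisfy every Spark version at once.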

liuxiaocs7 avatar Apr 09 '24 11:04 liuxiaocs7

We did a similar thing for Flink previously.

Suppose there are incompatible classes in different versions:

package old

class Old {
  OldR method()
}

package new

class New {
  NewR method()
}

Then we can introduce a Shim class to handle it.

package shim

class Shim {
  R method() {
     // dynamically bind the runtime class using reflection.
  }
}
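
A minimal Java sketch of the shim idea above, assuming the old and new classes expose a method with the same name and parameter types. ReflectiveShim, firstAvailable and call are hypothetical names chosen for illustration; this is not the code used for Flink or in any Kyuubi PR. The usage note relies on java.lang.CharSequence only so the example runs without Spark on the classpath.

```java
import java.lang.reflect.Method;

// Hypothetical shim, for illustration only: binds to whichever candidate
// class exists at runtime and invokes its methods through reflection.
final class ReflectiveShim {
    private final Class<?> api;   // the class or interface resolved at runtime
    private final Object target;  // an instance of that class or interface

    ReflectiveShim(Class<?> api, Object target) {
        if (!api.isInstance(target)) {
            throw new IllegalArgumentException(target + " is not an instance of " + api.getName());
        }
        this.api = api;
        this.target = target;
    }

    /** Returns the first candidate class that can be loaded, in the given order. */
    static Class<?> firstAvailable(String... candidates) {
        for (String name : candidates) {
            try {
                return Class.forName(name, false, ReflectiveShim.class.getClassLoader());
            } catch (ClassNotFoundException | LinkageError e) {
                // This candidate is not on the classpath; try the next one.
            }
        }
        throw new IllegalStateException("None of the candidate classes could be loaded");
    }

    /** Looks up a public method on the resolved API type and invokes it on the target. */
    Object call(String methodName, Class<?>[] paramTypes, Object... args) {
        try {
            Method method = api.getMethod(methodName, paramTypes);
            return method.invoke(target, args);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Failed to call " + methodName + " on " + api.getName(), e);
        }
    }
}
```

For example, firstAvailable("no.such.OldApi", "java.lang.CharSequence") skips the missing class and returns CharSequence, and new ReflectiveShim(api, "abc").call("length", new Class<?>[0]) then returns 3. Callers never name the old or new type at compile time, so the same shim source compiles against either version.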

pan3793 avatar Apr 09 '24 11:04 pan3793

Thanks for your help, I'll take a look later~

liuxiaocs7 avatar Apr 09 '24 11:04 liuxiaocs7

@liuxiaocs7 have you made progress on this task? The 4.0.0 preview version is on the way, and I hope we can fix the compatibility before that.

pan3793 avatar Apr 17 '24 09:04 pan3793

@liuxiaocs7 do you have progress on this task? 4.0.0 preview version is on the way, I hope we can fix the compatibility before that.

Sorry for the late reply. Yes, it is in progress.

liuxiaocs7 avatar Apr 18 '24 01:04 liuxiaocs7

Revoked, as there has been no progress for a while. Spark 4.0.0-preview1 is on the way, so I'm working on this.

pan3793 avatar May 20 '24 13:05 pan3793

Escalated the task level to medium. I have almost fixed it via the following PRs:

  • #6392
  • #6397
  • #6398
  • #6399
  • #6404
  • #6405
  • #6413
  • #6415
  • #6416
  • #6417
  • #6424
  • #6425

pan3793 avatar May 21 '24 14:05 pan3793

The daily test with the Apache Spark master branch is green; closing as completed. https://github.com/apache/kyuubi/actions/workflows/nightly.yml

image

pan3793 avatar May 30 '24 04:05 pan3793