[Bug] paimon-spark-common references classes that only exist in newer Spark versions, which can cause NoClassDefFoundError on older Spark (e.g. 3.3).
### Search before asking
- [x] I searched in the issues and found nothing similar.
### Paimon version
1.0.0
### Compute Engine
spark3.3
### Minimal reproduce step
Execute a `desc tableName` or `show create table tableName` command.
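For example, a minimal sketch of the failing statements (assumed setup, not from the original report: paimon-spark-3.3 and the Kyuubi authz plugin are both on the classpath and the authz extension is enabled; `tableName` is a placeholder for any existing Paimon table):

```scala
// Reproduce sketch. Assumed environment (adjust to your own setup):
//   spark.sql.extensions includes org.apache.kyuubi.plugin.spark.authz.ranger.RangerSparkExtension
// Either statement fails once the Kyuubi authz rule reflects over the resolved Paimon table.
spark.sql("DESC tableName").show(false)
spark.sql("SHOW CREATE TABLE tableName").show(false)
```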
### What doesn't meet your expectations?
```
scala> spark.sql("show create table tableName").show(false)
java.lang.NoClassDefFoundError: [Lorg/apache/spark/sql/connector/catalog/Column;
at java.lang.Class.getDeclaredMethods0(Native Method)
at java.lang.Class.privateGetDeclaredMethods(Class.java:2701)
at java.lang.Class.privateGetMethodRecursive(Class.java:3048)
at java.lang.Class.getMethod0(Class.java:3018)
at java.lang.Class.getMethod(Class.java:1784)
at org.apache.kyuubi.plugin.spark.authz.util.AuthZUtils$.invoke(AuthZUtils.scala:63)
at org.apache.kyuubi.plugin.spark.authz.util.AuthZUtils$.invokeAs(AuthZUtils.scala:77)
at org.apache.kyuubi.plugin.spark.authz.serde.TableExtractor$.getOwner(tableExtractors.scala:50)
at org.apache.kyuubi.plugin.spark.authz.serde.ResolvedTableTableExtractor.apply(tableExtractors.scala:103)
at org.apache.kyuubi.plugin.spark.authz.serde.ResolvedTableTableExtractor.apply(tableExtractors.scala:97)
at org.apache.kyuubi.plugin.spark.authz.serde.TableDesc.extract(Descriptor.scala:244)
at org.apache.kyuubi.plugin.spark.authz.PrivilegesBuilder$.getTablePriv$1(PrivilegesBuilder.scala:128)
at org.apache.kyuubi.plugin.spark.authz.PrivilegesBuilder$.$anonfun$buildCommand$7(PrivilegesBuilder.scala:174)
at scala.collection.immutable.List.foreach(List.scala:431)
at org.apache.kyuubi.plugin.spark.authz.PrivilegesBuilder$.buildCommand(PrivilegesBuilder.scala:172)
at org.apache.kyuubi.plugin.spark.authz.PrivilegesBuilder$.build(PrivilegesBuilder.scala:224)
at org.apache.kyuubi.plugin.spark.authz.ranger.RuleAuthorization$.checkPrivileges(RuleAuthorization.scala:50)
at org.apache.kyuubi.plugin.spark.authz.ranger.RuleAuthorization.apply(RuleAuthorization.scala:36)
at org.apache.kyuubi.plugin.spark.authz.ranger.RuleAuthorization.apply(RuleAuthorization.scala:33)
at org.apache.spark.sql.catalyst.rules.RuleExecutor.$anonfun$execute$2(RuleExecutor.scala:211)
at scala.collection.LinearSeqOptimized.foldLeft(LinearSeqOptimized.scala:126)
at scala.collection.LinearSeqOptimized.foldLeft$(LinearSeqOptimized.scala:122)
at scala.collection.immutable.List.foldLeft(List.scala:91)
at org.apache.spark.sql.catalyst.rules.RuleExecutor.$anonfun$execute$1(RuleExecutor.scala:208)
at org.apache.spark.sql.catalyst.rules.RuleExecutor.$anonfun$execute$1$adapted(RuleExecutor.scala:200)
at scala.collection.immutable.List.foreach(List.scala:431)
at org.apache.spark.sql.catalyst.rules.RuleExecutor.execute(RuleExecutor.scala:200)
at org.apache.spark.sql.catalyst.rules.RuleExecutor.$anonfun$executeAndTrack$1(RuleExecutor.scala:179)
at org.apache.spark.sql.catalyst.QueryPlanningTracker$.withTracker(QueryPlanningTracker.scala:88)
at org.apache.spark.sql.catalyst.rules.RuleExecutor.executeAndTrack(RuleExecutor.scala:179)
at org.apache.spark.sql.execution.QueryExecution.$anonfun$optimizedPlan$1(QueryExecution.scala:126)
at org.apache.spark.sql.catalyst.QueryPlanningTracker.measurePhase(QueryPlanningTracker.scala:111)
at org.apache.spark.sql.execution.QueryExecution.$anonfun$executePhase$2(QueryExecution.scala:185)
at org.apache.spark.sql.execution.QueryExecution$.withInternalError(QueryExecution.scala:510)
at org.apache.spark.sql.execution.QueryExecution.$anonfun$executePhase$1(QueryExecution.scala:185)
at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:779)
at org.apache.spark.sql.execution.QueryExecution.executePhase(QueryExecution.scala:184)
at org.apache.spark.sql.execution.QueryExecution.optimizedPlan$lzycompute(QueryExecution.scala:122)
at org.apache.spark.sql.execution.QueryExecution.optimizedPlan(QueryExecution.scala:118)
at org.apache.spark.sql.execution.QueryExecution.assertOptimized(QueryExecution.scala:136)
at org.apache.spark.sql.execution.QueryExecution.executedPlan$lzycompute(QueryExecution.scala:154)
at org.apache.spark.sql.execution.QueryExecution.executedPlan(QueryExecution.scala:151)
at org.apache.spark.sql.execution.QueryExecution.simpleString(QueryExecution.scala:204)
at org.apache.spark.sql.execution.QueryExecution.org$apache$spark$sql$execution$QueryExecution$$explainString(QueryExecution.scala:249)
at org.apache.spark.sql.execution.QueryExecution.explainString(QueryExecution.scala:218)
at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$6(SQLExecution.scala:103)
at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:169)
at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$1(SQLExecution.scala:95)
at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:779)
at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:64)
at org.apache.spark.sql.execution.QueryExecution$$anonfun$eagerlyExecuteCommands$1.applyOrElse(QueryExecution.scala:98)
at org.apache.spark.sql.execution.QueryExecution$$anonfun$eagerlyExecuteCommands$1.applyOrElse(QueryExecution.scala:94)
at org.apache.spark.sql.catalyst.trees.TreeNode.$anonfun$transformDownWithPruning$1(TreeNode.scala:584)
at org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(TreeNode.scala:176)
at org.apache.spark.sql.catalyst.trees.TreeNode.transformDownWithPruning(TreeNode.scala:584)
at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.org$apache$spark$sql$catalyst$plans$logical$AnalysisHelper$$super$transformDownWithPruning(LogicalPlan.scala:30)
at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.transformDownWithPruning(AnalysisHelper.scala:267)
at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.transformDownWithPruning$(AnalysisHelper.scala:263)
at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.transformDownWithPruning(LogicalPlan.scala:30)
at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.transformDownWithPruning(LogicalPlan.scala:30)
at org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:560)
at org.apache.spark.sql.execution.QueryExecution.eagerlyExecuteCommands(QueryExecution.scala:94)
at org.apache.spark.sql.execution.QueryExecution.commandExecuted$lzycompute(QueryExecution.scala:81)
at org.apache.spark.sql.execution.QueryExecution.commandExecuted(QueryExecution.scala:79)
at org.apache.spark.sql.Dataset.
```
### Anything else?
I'm using the Kyuubi Spark authorization (authz) plugin here. During permission checks, an exception occurs because Paimon's SparkTable class references classes from a newer Spark version. Comparing the implementations across versions: in Paimon 0.8 and earlier, SparkTable was a Java class; from 0.9 onwards it has been rewritten in Scala. As a result, the compiled SparkTable class now references classes that only exist in newer Spark versions.
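To make the failure mode concrete, here is a hypothetical illustration (not Paimon or Kyuubi source code) of why the error only surfaces when Kyuubi reflects over the table: `Class.getMethod` first materializes all declared methods, so every type mentioned in any method signature must be loadable, even if that method is never called.

```scala
// Hypothetical illustration only; the class and method names below are made up.
object ReflectionFailureDemo {

  // Stand-in for a class compiled against a newer Spark, where
  // org.apache.spark.sql.connector.catalog.Column exists (it does not in Spark 3.3).
  class SparkTableLike {
    def name(): String = "t"

    // The signature alone references Column; on a Spark 3.3 runtime the array type
    // [Lorg.apache.spark.sql.connector.catalog.Column; cannot be resolved.
    def columns(): Array[org.apache.spark.sql.connector.catalog.Column] = Array.empty
  }

  def main(args: Array[String]): Unit = {
    // Kyuubi's AuthZUtils does roughly this to read table metadata via reflection.
    // Asking for an unrelated method still calls getDeclaredMethods0, which must
    // resolve every signature type, hence the NoClassDefFoundError in the trace above.
    val method = classOf[SparkTableLike].getMethod("name")
    println(method)
  }
}
```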
### Are you willing to submit a PR?
- [ ] I'm willing to submit a PR!
We should keep using Java for the implementation here; that should avoid the reference issues introduced when the Scala code is compiled against the higher-version Spark. Alternatively, would it be possible to build separate common JARs against different Spark versions?
Attempts to change the POM to the corresponding Spark version resulted in compilation failures, since some features that only exist in a higher Spark version are used directly.
@JingsongLi Could you spare some time to take a look at this issue? It is currently blocking our upgrade to Paimon 1.0.
This problem was solved by adding an empty implementation of Column in paimon-spark3.3.
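For anyone else hitting this, a rough sketch of what such an empty placeholder could look like (the exact module placement and whether a trait or class is used are assumptions; the type only needs to be resolvable at reflection time and is never actually invoked on Spark 3.3):

```scala
// Sketch: placed in the paimon-spark3.3 module. The package must match the
// Spark 3.4+ class exactly so the classloader can resolve the reference that
// SparkTable's compiled method signatures carry.
package org.apache.spark.sql.connector.catalog

// Empty placeholder: it only has to exist so that reflection over SparkTable
// (e.g. Class.getMethod in Kyuubi's AuthZUtils) can resolve
// [Lorg.apache.spark.sql.connector.catalog.Column; it is never called on Spark 3.3.
trait Column
```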
@thomasg19930417 With an empty implementation of Column, does the Kyuubi authorization plugin work correctly?
@thomasg19930417 bro, did you figure this out? I’m facing the same issue and would really appreciate your solution.