[Bug] The inclusion of high-version Spark classes in paimon-spark-common may lead to certain exceptions.

Open thomasg19930417 opened this issue 11 months ago • 6 comments

Search before asking

  • [x] I searched in the issues and found nothing similar.

Paimon version

1.0.0

Compute Engine

Spark 3.3

Minimal reproduce step

Execute `DESC tableName` or `SHOW CREATE TABLE tableName` against a Paimon table in a session that has the Kyuubi AuthZ plugin enabled (a sketch of such a session is below).
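The failure only shows up when the Kyuubi AuthZ plugin is on the session. A rough sketch of a Spark 3.3 session that reproduces it follows; the catalog name, warehouse path, and table name are placeholders, and a working Ranger configuration for the AuthZ plugin is assumed to be available.

```scala
// Sketch only: placeholder warehouse path, catalog name, and table name.
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .appName("paimon-authz-repro")
  // Kyuubi AuthZ extension performing the permission checks.
  .config("spark.sql.extensions",
    "org.apache.kyuubi.plugin.spark.authz.ranger.RangerSparkExtension")
  // Paimon catalog backed by paimon-spark-common.
  .config("spark.sql.catalog.paimon", "org.apache.paimon.spark.SparkCatalog")
  .config("spark.sql.catalog.paimon.warehouse", "/path/to/warehouse")
  .getOrCreate()

// With paimon-spark-common compiled against a newer Spark on the classpath,
// this triggers the NoClassDefFoundError shown below when AuthZ reflects over
// Paimon's SparkTable.
spark.sql("SHOW CREATE TABLE paimon.default.tableName").show(false)
```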

What doesn't meet your expectations?

```
scala> spark.sql("show create table tableName").show(false)
java.lang.NoClassDefFoundError: [Lorg/apache/spark/sql/connector/catalog/Column;
  at java.lang.Class.getDeclaredMethods0(Native Method)
  at java.lang.Class.privateGetDeclaredMethods(Class.java:2701)
  at java.lang.Class.privateGetMethodRecursive(Class.java:3048)
  at java.lang.Class.getMethod0(Class.java:3018)
  at java.lang.Class.getMethod(Class.java:1784)
  at org.apache.kyuubi.plugin.spark.authz.util.AuthZUtils$.invoke(AuthZUtils.scala:63)
  at org.apache.kyuubi.plugin.spark.authz.util.AuthZUtils$.invokeAs(AuthZUtils.scala:77)
  at org.apache.kyuubi.plugin.spark.authz.serde.TableExtractor$.getOwner(tableExtractors.scala:50)
  at org.apache.kyuubi.plugin.spark.authz.serde.ResolvedTableTableExtractor.apply(tableExtractors.scala:103)
  at org.apache.kyuubi.plugin.spark.authz.serde.ResolvedTableTableExtractor.apply(tableExtractors.scala:97)
  at org.apache.kyuubi.plugin.spark.authz.serde.TableDesc.extract(Descriptor.scala:244)
  at org.apache.kyuubi.plugin.spark.authz.PrivilegesBuilder$.getTablePriv$1(PrivilegesBuilder.scala:128)
  at org.apache.kyuubi.plugin.spark.authz.PrivilegesBuilder$.$anonfun$buildCommand$7(PrivilegesBuilder.scala:174)
  at scala.collection.immutable.List.foreach(List.scala:431)
  at org.apache.kyuubi.plugin.spark.authz.PrivilegesBuilder$.buildCommand(PrivilegesBuilder.scala:172)
  at org.apache.kyuubi.plugin.spark.authz.PrivilegesBuilder$.build(PrivilegesBuilder.scala:224)
  at org.apache.kyuubi.plugin.spark.authz.ranger.RuleAuthorization$.checkPrivileges(RuleAuthorization.scala:50)
  at org.apache.kyuubi.plugin.spark.authz.ranger.RuleAuthorization.apply(RuleAuthorization.scala:36)
  at org.apache.kyuubi.plugin.spark.authz.ranger.RuleAuthorization.apply(RuleAuthorization.scala:33)
  at org.apache.spark.sql.catalyst.rules.RuleExecutor.$anonfun$execute$2(RuleExecutor.scala:211)
  at scala.collection.LinearSeqOptimized.foldLeft(LinearSeqOptimized.scala:126)
  at scala.collection.LinearSeqOptimized.foldLeft$(LinearSeqOptimized.scala:122)
  at scala.collection.immutable.List.foldLeft(List.scala:91)
  at org.apache.spark.sql.catalyst.rules.RuleExecutor.$anonfun$execute$1(RuleExecutor.scala:208)
  at org.apache.spark.sql.catalyst.rules.RuleExecutor.$anonfun$execute$1$adapted(RuleExecutor.scala:200)
  at scala.collection.immutable.List.foreach(List.scala:431)
  at org.apache.spark.sql.catalyst.rules.RuleExecutor.execute(RuleExecutor.scala:200)
  at org.apache.spark.sql.catalyst.rules.RuleExecutor.$anonfun$executeAndTrack$1(RuleExecutor.scala:179)
  at org.apache.spark.sql.catalyst.QueryPlanningTracker$.withTracker(QueryPlanningTracker.scala:88)
  at org.apache.spark.sql.catalyst.rules.RuleExecutor.executeAndTrack(RuleExecutor.scala:179)
  at org.apache.spark.sql.execution.QueryExecution.$anonfun$optimizedPlan$1(QueryExecution.scala:126)
  at org.apache.spark.sql.catalyst.QueryPlanningTracker.measurePhase(QueryPlanningTracker.scala:111)
  at org.apache.spark.sql.execution.QueryExecution.$anonfun$executePhase$2(QueryExecution.scala:185)
  at org.apache.spark.sql.execution.QueryExecution$.withInternalError(QueryExecution.scala:510)
  at org.apache.spark.sql.execution.QueryExecution.$anonfun$executePhase$1(QueryExecution.scala:185)
  at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:779)
  at org.apache.spark.sql.execution.QueryExecution.executePhase(QueryExecution.scala:184)
  at org.apache.spark.sql.execution.QueryExecution.optimizedPlan$lzycompute(QueryExecution.scala:122)
  at org.apache.spark.sql.execution.QueryExecution.optimizedPlan(QueryExecution.scala:118)
  at org.apache.spark.sql.execution.QueryExecution.assertOptimized(QueryExecution.scala:136)
  at org.apache.spark.sql.execution.QueryExecution.executedPlan$lzycompute(QueryExecution.scala:154)
  at org.apache.spark.sql.execution.QueryExecution.executedPlan(QueryExecution.scala:151)
  at org.apache.spark.sql.execution.QueryExecution.simpleString(QueryExecution.scala:204)
  at org.apache.spark.sql.execution.QueryExecution.org$apache$spark$sql$execution$QueryExecution$$explainString(QueryExecution.scala:249)
  at org.apache.spark.sql.execution.QueryExecution.explainString(QueryExecution.scala:218)
  at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$6(SQLExecution.scala:103)
  at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:169)
  at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$1(SQLExecution.scala:95)
  at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:779)
  at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:64)
  at org.apache.spark.sql.execution.QueryExecution$$anonfun$eagerlyExecuteCommands$1.applyOrElse(QueryExecution.scala:98)
  at org.apache.spark.sql.execution.QueryExecution$$anonfun$eagerlyExecuteCommands$1.applyOrElse(QueryExecution.scala:94)
  at org.apache.spark.sql.catalyst.trees.TreeNode.$anonfun$transformDownWithPruning$1(TreeNode.scala:584)
  at org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(TreeNode.scala:176)
  at org.apache.spark.sql.catalyst.trees.TreeNode.transformDownWithPruning(TreeNode.scala:584)
  at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.org$apache$spark$sql$catalyst$plans$logical$AnalysisHelper$$super$transformDownWithPruning(LogicalPlan.scala:30)
  at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.transformDownWithPruning(AnalysisHelper.scala:267)
  at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.transformDownWithPruning$(AnalysisHelper.scala:263)
  at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.transformDownWithPruning(LogicalPlan.scala:30)
  at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.transformDownWithPruning(LogicalPlan.scala:30)
  at org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:560)
  at org.apache.spark.sql.execution.QueryExecution.eagerlyExecuteCommands(QueryExecution.scala:94)
  at org.apache.spark.sql.execution.QueryExecution.commandExecuted$lzycompute(QueryExecution.scala:81)
  at org.apache.spark.sql.execution.QueryExecution.commandExecuted(QueryExecution.scala:79)
  at org.apache.spark.sql.Dataset.<init>(Dataset.scala:220)
  at org.apache.spark.sql.Dataset$.$anonfun$ofRows$2(Dataset.scala:100)
  at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:779)
  at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:97)
  at org.apache.spark.sql.SparkSession.$anonfun$sql$1(SparkSession.scala:622)
  at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:779)
  at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:617)
  ... 47 elided
Caused by: java.lang.ClassNotFoundException: org.apache.spark.sql.connector.catalog.Column
  at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
  at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
  at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:349)
  at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
  ... 118 more
```

Anything else?

I'm using Kyuubi's Spark authorization (AuthZ) plugin here. I found that the exception occurs during permission checks because Paimon's SparkTable class pulls in classes from higher Spark versions. Comparing the implementations across versions: in Paimon 0.8 and earlier, SparkTable was a Java class; since 0.9 it has been rewritten in Scala, and after compilation the class now imports classes that only exist in higher-version Spark.
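For context, the error follows from a general JVM rule: reflecting over a class forces the JVM to resolve every type mentioned in its declared method signatures, even for methods that are never called. A hypothetical, minimal illustration (not Paimon code; NewOnlyType stands in for the Spark 3.4+ Column class, HasNewApi for the compiled SparkTable):

```scala
// Compile all of this, then delete NewOnlyType.class from the runtime
// classpath and rerun: the reflective lookup below throws NoClassDefFoundError
// even though columns() is never invoked -- the same mechanism as the stack
// trace above, where Kyuubi AuthZ calls getMethod on Paimon's SparkTable.
class NewOnlyType // stands in for org.apache.spark.sql.connector.catalog.Column

class HasNewApi { // stands in for SparkTable compiled against a newer Spark
  def name(): String = "ok"
  def columns(): Array[NewOnlyType] = Array.empty // signature references NewOnlyType
}

object ReflectionFailureDemo {
  def main(args: Array[String]): Unit = {
    try {
      // getMethod forces the JVM to resolve every type appearing in the
      // signatures of HasNewApi's declared methods, not just "name".
      val m = classOf[HasNewApi].getMethod("name")
      println(s"resolved: $m")
    } catch {
      case e: NoClassDefFoundError => println(s"reflection failed: $e")
    }
  }
}
```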

Are you willing to submit a PR?

  • [ ] I'm willing to submit a PR!

thomasg19930417 avatar Feb 08 '25 01:02 thomasg19930417

We should keep the implementation in Java here; that should avoid the import issues that the Scala compilation introduces when it depends on higher-version Spark. Or is it possible to build separate common JARs against different Spark versions?

thomasg19930417 avatar Feb 10 '25 03:02 thomasg19930417

I tried changing the POM to the corresponding Spark version, but compilation failed because some features from higher-version Spark have been introduced separately.

thomasg19930417 avatar Feb 10 '25 08:02 thomasg19930417

@JingsongLi Could you spare some time to take a look at this issue? It is currently blocking our upgrade to Paimon 1.0.

thomasg19930417 avatar Feb 10 '25 09:02 thomasg19930417

This problem was solved by adding an empty implementation of Column in paimon-spark3.3.

thomasg19930417 avatar Feb 11 '25 06:02 thomasg19930417
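For anyone applying the same workaround, a minimal sketch of what such an empty placeholder might look like; this is hypothetical, the actual stub added to paimon-spark3.3 may declare more of the Spark 3.4 Column API, and it only helps as long as nothing actually invokes the placeholder at runtime.

```scala
// Hypothetical sketch: a placeholder type with the same fully-qualified name
// as Spark 3.4's org.apache.spark.sql.connector.catalog.Column. Shipping it
// in the Spark 3.3 module lets the JVM resolve compiled method signatures
// that mention Column, provided the type itself is never used at runtime.
package org.apache.spark.sql.connector.catalog

trait Column
```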

> This problem was solved by adding an empty implementation of Column in paimon-spark3.3.

@thomasg19930417 Does the Kyuubi authentication plugin work correctly with an empty implementation of Column?

dyp12 avatar Jul 14 '25 06:07 dyp12

@thomasg19930417 bro, did you figure this out? I’m facing the same issue and would really appreciate your solution.

ElancerBlack avatar Sep 25 '25 03:09 ElancerBlack