
SHOW TABLE EXTENDED not supported for v2 tables

Open meyergin opened this issue 2 years ago • 3 comments

Apache Iceberg version

0.14.0 (latest release)

Query engine

Spark

Please describe the bug 🐞

Spark + Hive Metastore. When I run `show table extended in SCHEMA_NAME like '*';` in Spark-SQL, it throws an error:

Error in query: SHOW TABLE EXTENDED is not supported for v2 tables.;
    ShowTableExtended *, [namespace#906, tableName#907, isTemporary#908, information#909]
    +- ResolvedNamespace org.apache.iceberg.spark.SparkCatalog@49ea646b, [SCHEMA_NAME]
    
    	at org.apache.spark.sql.errors.QueryCompilationErrors$.commandUnsupportedInV2TableError(QueryCompilationErrors.scala:1507)
    	at org.apache.spark.sql.catalyst.analysis.CheckAnalysis.$anonfun$checkAnalysis$1(CheckAnalysis.scala:162)
    	at org.apache.spark.sql.catalyst.analysis.CheckAnalysis.$anonfun$checkAnalysis$1$adapted(CheckAnalysis.scala:101)
    	at org.apache.spark.sql.catalyst.trees.TreeNode.foreachUp(TreeNode.scala:367)
    	at org.apache.spark.sql.catalyst.analysis.CheckAnalysis.checkAnalysis(CheckAnalysis.scala:101)
    	at org.apache.spark.sql.catalyst.analysis.CheckAnalysis.checkAnalysis$(CheckAnalysis.scala:96)
    	at org.apache.spark.sql.catalyst.analysis.Analyzer.checkAnalysis(Analyzer.scala:187)
    	at org.apache.spark.sql.catalyst.analysis.Analyzer.$anonfun$executeAndCheck$1(Analyzer.scala:210)
    	at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper$.markInAnalyzer(AnalysisHelper.scala:330)
    	at org.apache.spark.sql.catalyst.analysis.Analyzer.executeAndCheck(Analyzer.scala:207)
    	at org.apache.spark.sql.execution.QueryExecution.$anonfun$analyzed$1(QueryExecution.scala:76)
    	at org.apache.spark.sql.catalyst.QueryPlanningTracker.measurePhase(QueryPlanningTracker.scala:111)
    	at org.apache.spark.sql.execution.QueryExecution.$anonfun$executePhase$2(QueryExecution.scala:185)
    	at org.apache.spark.sql.execution.QueryExecution$.withInternalError(QueryExecution.scala:510)
    	at org.apache.spark.sql.execution.QueryExecution.$anonfun$executePhase$1(QueryExecution.scala:185)
    	at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:779)
    	at org.apache.spark.sql.execution.QueryExecution.executePhase(QueryExecution.scala:184)
    	at org.apache.spark.sql.execution.QueryExecution.analyzed$lzycompute(QueryExecution.scala:76)
    	at org.apache.spark.sql.execution.QueryExecution.analyzed(QueryExecution.scala:74)
    	at org.apache.spark.sql.execution.QueryExecution.assertAnalyzed(QueryExecution.scala:66)
    	at org.apache.spark.sql.Dataset$.$anonfun$ofRows$2(Dataset.scala:99)
    	at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:779)
    	at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:97)
    	at org.apache.spark.sql.SparkSession.$anonfun$sql$1(SparkSession.scala:622)
    	at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:779)
    	at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:617)
    	at org.apache.spark.sql.SQLContext.sql(SQLContext.scala:651)
    	at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.org$apache$spark$sql$hive$thriftserver$SparkExecuteStatementOperation$$execute(SparkExecuteStatementOperation.scala:291)
    	... 16 more

meyergin avatar Sep 17 '22 01:09 meyergin

Getting the same error here using spark-sql:

spark-sql> show table extended in ice.snapshots like '*';
Error in query: SHOW TABLE EXTENDED is not supported for v2 tables.;
ShowTableExtended *, [namespace#179, tableName#180, isTemporary#181, information#182]
+- ResolvedNamespace org.apache.iceberg.spark.SparkCatalog@272d0dd3, [snapshots]

Environment: Spark 3.3.0 with org.apache.iceberg:iceberg-aws:0.14.0, org.apache.iceberg:iceberg-spark-runtime-3.3_2.12:0.14.0, org.apache.hadoop:hadoop-aws:3.3.3, software.amazon.awssdk:bundle:2.17.131, software.amazon.awssdk:url-connection-client:2.17.131, software.amazon.awssdk:kms:2.17.131

Command:

spark-sql \
    --packages  org.apache.iceberg:iceberg-aws:0.14.0,org.apache.iceberg:iceberg-spark-runtime-3.3_2.12:0.14.0,org.apache.hadoop:hadoop-aws:3.3.3,software.amazon.awssdk:bundle:2.17.131,software.amazon.awssdk:url-connection-client:2.17.131,software.amazon.awssdk:kms:2.17.131 \
    --conf spark.sql.extensions=org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions \
    --conf spark.sql.catalog.ice=org.apache.iceberg.spark.SparkCatalog \
    --conf spark.sql.catalog.ice.catalog-impl=org.apache.iceberg.aws.glue.GlueCatalog \
    --conf spark.sql.catalog.ice.warehouse=$WAREHOUSE_BUCKET_LOC \
    --conf spark.sql.catalog.ice.io-impl=org.apache.iceberg.aws.s3.S3FileIO \
    --conf iceberg.engine.hive.enabled=true

ja-michel avatar Sep 21 '22 12:09 ja-michel

The error seems to be thrown here, so I wonder if the conversion (per the comment in the code) should be done somewhere in Iceberg. Maybe in the catalog?

ja-michel avatar Sep 21 '22 12:09 ja-michel

Looking at the full stack trace, is Iceberg even involved?

scala> lastException.printStackTrace
org.apache.spark.sql.AnalysisException: SHOW TABLE EXTENDED is not supported for v2 tables.;
ShowTableExtended *, [namespace#0, tableName#1, isTemporary#2, information#3]
+- ResolvedNamespace org.apache.iceberg.spark.SparkCatalog@d28a805, [snapshots]

        at org.apache.spark.sql.errors.QueryCompilationErrors$.commandUnsupportedInV2TableError(QueryCompilationErrors.scala:1507)
        at org.apache.spark.sql.catalyst.analysis.CheckAnalysis.$anonfun$checkAnalysis$1(CheckAnalysis.scala:162)
        at org.apache.spark.sql.catalyst.analysis.CheckAnalysis.$anonfun$checkAnalysis$1$adapted(CheckAnalysis.scala:101)
        at org.apache.spark.sql.catalyst.trees.TreeNode.foreachUp(TreeNode.scala:367)
        at org.apache.spark.sql.catalyst.analysis.CheckAnalysis.checkAnalysis(CheckAnalysis.scala:101)
        at org.apache.spark.sql.catalyst.analysis.CheckAnalysis.checkAnalysis$(CheckAnalysis.scala:96)
        at org.apache.spark.sql.catalyst.analysis.Analyzer.checkAnalysis(Analyzer.scala:187)
        at org.apache.spark.sql.catalyst.analysis.Analyzer.$anonfun$executeAndCheck$1(Analyzer.scala:210)
        at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper$.markInAnalyzer(AnalysisHelper.scala:330)
        at org.apache.spark.sql.catalyst.analysis.Analyzer.executeAndCheck(Analyzer.scala:207)
        at org.apache.spark.sql.execution.QueryExecution.$anonfun$analyzed$1(QueryExecution.scala:76)
        at org.apache.spark.sql.catalyst.QueryPlanningTracker.measurePhase(QueryPlanningTracker.scala:111)
        at org.apache.spark.sql.execution.QueryExecution.$anonfun$executePhase$2(QueryExecution.scala:185)
        at org.apache.spark.sql.execution.QueryExecution$.withInternalError(QueryExecution.scala:510)
        at org.apache.spark.sql.execution.QueryExecution.$anonfun$executePhase$1(QueryExecution.scala:185)
        at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:779)
        at org.apache.spark.sql.execution.QueryExecution.executePhase(QueryExecution.scala:184)
        at org.apache.spark.sql.execution.QueryExecution.analyzed$lzycompute(QueryExecution.scala:76)
        at org.apache.spark.sql.execution.QueryExecution.analyzed(QueryExecution.scala:74)
        at org.apache.spark.sql.execution.QueryExecution.assertAnalyzed(QueryExecution.scala:66)
        at org.apache.spark.sql.Dataset$.$anonfun$ofRows$2(Dataset.scala:99)
        at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:779)
        at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:97)
        at org.apache.spark.sql.SparkSession.$anonfun$sql$1(SparkSession.scala:622)
        at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:779)
        at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:617)
        at $line16.$read$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw.<init>(<console>:23)
        at $line16.$read$$iw$$iw$$iw$$iw$$iw$$iw$$iw.<init>(<console>:27)
        at $line16.$read$$iw$$iw$$iw$$iw$$iw$$iw.<init>(<console>:29)
        at $line16.$read$$iw$$iw$$iw$$iw$$iw.<init>(<console>:31)
        at $line16.$read$$iw$$iw$$iw$$iw.<init>(<console>:33)
        at $line16.$read$$iw$$iw$$iw.<init>(<console>:35)
        at $line16.$read$$iw$$iw.<init>(<console>:37)
        at $line16.$read$$iw.<init>(<console>:39)
        at $line16.$read.<init>(<console>:41)
        at $line16.$read$.<init>(<console>:45)
        at $line16.$read$.<clinit>(<console>)
        at $line16.$eval$.$print$lzycompute(<console>:7)
        at $line16.$eval$.$print(<console>:6)
        at $line16.$eval.$print(<console>)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at scala.tools.nsc.interpreter.IMain$ReadEvalPrint.call(IMain.scala:747)
        at scala.tools.nsc.interpreter.IMain$Request.loadAndRun(IMain.scala:1020)
        at scala.tools.nsc.interpreter.IMain.$anonfun$interpret$1(IMain.scala:568)
        at scala.reflect.internal.util.ScalaClassLoader.asContext(ScalaClassLoader.scala:36)
        at scala.reflect.internal.util.ScalaClassLoader.asContext$(ScalaClassLoader.scala:116)
        at scala.reflect.internal.util.AbstractFileClassLoader.asContext(AbstractFileClassLoader.scala:41)
        at scala.tools.nsc.interpreter.IMain.loadAndRunReq$1(IMain.scala:567)
        at scala.tools.nsc.interpreter.IMain.interpret(IMain.scala:594)
        at scala.tools.nsc.interpreter.IMain.interpret(IMain.scala:564)
        at scala.tools.nsc.interpreter.ILoop.interpretStartingWith(ILoop.scala:865)
        at scala.tools.nsc.interpreter.ILoop.command(ILoop.scala:733)
        at scala.tools.nsc.interpreter.ILoop.processLine(ILoop.scala:435)
        at scala.tools.nsc.interpreter.ILoop.loop(ILoop.scala:456)
        at org.apache.spark.repl.SparkILoop.process(SparkILoop.scala:239)
        at org.apache.spark.repl.Main$.doMain(Main.scala:78)
        at org.apache.spark.repl.Main$.main(Main.scala:58)
        at org.apache.spark.repl.Main.main(Main.scala)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
        at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:958)
        at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:180)
        at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:203)
        at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:90)
        at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1046)
        at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1055)
        at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)

ja-michel avatar Sep 21 '22 13:09 ja-michel

We are working on support for Iceberg in dbt-spark. Since Iceberg does not support `show tables extended`, we fall back to `show tables` plus many `describe table` calls to determine whether a given table is an Iceberg table.

Normally (for Hudi and Delta), dbt-spark uses `show tables extended` and parses the `information` column to determine whether it's dealing with a Hudi or Delta table.

Iterating over the tables and running `describe table` on each one can get quite slow when there are hundreds of tables in a schema.
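For illustration, the fallback described above could be sketched roughly like this (a minimal sketch, not dbt-spark's actual code; `run_sql` and the row/column shapes are hypothetical stand-ins for whatever executes SQL against Spark and returns rows as dicts):

```python
# Sketch of the fallback: without SHOW TABLE EXTENDED, list the tables
# and run DESCRIBE TABLE EXTENDED on each one to spot Iceberg tables.
# `run_sql` is a hypothetical helper, not a real dbt-spark API.
def find_iceberg_tables(run_sql, schema):
    iceberg_tables = []
    for row in run_sql(f"SHOW TABLES IN {schema}"):
        table = row["tableName"]
        # One extra round trip per table -- this is what gets slow when
        # a schema contains hundreds of tables.
        for desc_row in run_sql(f"DESCRIBE TABLE EXTENDED {schema}.{table}"):
            if desc_row.get("col_name") == "Provider":
                if desc_row.get("data_type", "").lower() == "iceberg":
                    iceberg_tables.append(table)
                break
    return iceberg_tables
```

With `show tables extended` this would instead be a single query whose `information` column is parsed per table.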

It would be much better if Iceberg also supported `show tables extended`.

Tagging @Fokko, who is also working on this.

cccs-jc avatar Nov 28 '22 14:11 cccs-jc

Thanks for the background @cccs-jc. To add to that, this is the original issue in Spark, and a PR is ready: https://github.com/apache/spark/pull/37588. It is not directly related to Iceberg.

Fokko avatar Nov 28 '22 14:11 Fokko

Ha I see. Thanks for looking into this

cccs-jc avatar Nov 28 '22 19:11 cccs-jc

This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in the next 14 days if no further activity occurs. To permanently prevent this issue from being considered stale, add the label 'not-stale', but commenting on the issue is preferred when possible.

github-actions[bot] avatar Jun 29 '23 00:06 github-actions[bot]

It looks like there is some activity again on the Spark side: https://github.com/apache/spark/pull/37588#issuecomment-1612349461

Fokko avatar Jun 29 '23 06:06 Fokko

@Fokko this is nice to hear. Thanks for letting me know.

cccs-jc avatar Jun 29 '23 12:06 cccs-jc

Not sure if the root cause is the same, but I'm seeing the same behaviour when using AWS Glue as the metastore.

lsabreu96 avatar Nov 27 '23 20:11 lsabreu96

Not sure if the root cause is the same, but I'm seeing the same behaviour when using AWS Glue as the metastore.

Yeah, I'm facing the same issue too. Does anyone know of a workaround?

tanweipeng avatar Dec 05 '23 03:12 tanweipeng

+1

Peeyush-Now avatar Dec 14 '23 09:12 Peeyush-Now