Do we need to provide more detailed documentation for Hive?
I wrote a few records into an Iceberg table in a Hive catalog using Apache Flink (following the Flink document), and then tried to query that table from Apache Hive (following the Hive document).
I found that the Hive document only has a section describing how to query Hadoop tables from Hive. So what is the way to query Hive catalog tables from Hive? I think we should provide documentation for querying tables in Hive catalogs.
@massdosage
In the Hive SQL client, I tried to create an external table for a table that had already been created in the Hive catalog by the Flink SQL client.
add jar iceberg-hive-runtime-apache-iceberg-0.9.0-rc4-238-g779dafd.dirty.jar;
set iceberg.mr.catalog=hive;
set hive.metastore.uris=thrift://....;
CREATE EXTERNAL TABLE iceberg_db_01.iceberg_001(
userid int,
f_random_str string
)
STORED BY 'org.apache.iceberg.mr.hive.HiveIcebergStorageHandler'
LOCATION 'hdfs://nameservice1/user/hive/warehouse/iceberg_db_01.db/iceberg_001/';
Here is the error output:
20/10/27 15:15:28 INFO ql.Driver: Compiling command(queryId=root_20201027151528_7ddc1a58-4c8a-44ef-9ac2-55ec510f2e7d): CREATE EXTERNAL TABLE iceberg_db_01.iceberg_001(
userid int,
f_random_str string
)
STORED BY 'org.apache.iceberg.mr.hive.HiveIcebergStorageHandler'
LOCATION 'hdfs://nameservice1/user/hive/warehouse/iceberg_db_01.db/iceberg_001/'
20/10/27 15:15:28 INFO parse.SemanticAnalyzer: Starting Semantic Analysis
20/10/27 15:15:28 INFO parse.SemanticAnalyzer: Creating table iceberg_db_01.iceberg_001 position=22
20/10/27 15:15:28 INFO ql.Driver: Semantic Analysis Completed
20/10/27 15:15:28 INFO ql.Driver: Returning Hive schema: Schema(fieldSchemas:null, properties:null)
20/10/27 15:15:28 INFO ql.Driver: Completed compiling command(queryId=root_20201027151528_7ddc1a58-4c8a-44ef-9ac2-55ec510f2e7d); Time taken: 0.076 seconds
20/10/27 15:15:28 INFO ql.Driver: Executing command(queryId=root_20201027151528_7ddc1a58-4c8a-44ef-9ac2-55ec510f2e7d): CREATE EXTERNAL TABLE iceberg_db_01.iceberg_001(
userid int,
f_random_str string
)
STORED BY 'org.apache.iceberg.mr.hive.HiveIcebergStorageHandler'
LOCATION 'hdfs://nameservice1/user/hive/warehouse/iceberg_db_01.db/iceberg_001/'
20/10/27 15:15:28 INFO ql.Driver: Starting task [Stage-0:DDL] in serial mode
20/10/27 15:15:28 INFO plan.CreateTableDesc: Use StorageHandler-supplied org.apache.iceberg.mr.hive.HiveIcebergSerDe for table iceberg_001
20/10/27 15:15:28 INFO exec.DDLTask: creating table iceberg_db_01.iceberg_001 on hdfs://nameservice1/user/hive/warehouse/iceberg_db_01.db/iceberg_001
20/10/27 15:15:28 INFO mr.Catalogs: Loaded Hive Metastore catalog HiveCatalog{name=hive, uri=thrift://...:9083}
20/10/27 15:15:28 INFO iceberg.BaseMetastoreTableOperations: Refreshing table metadata from new version: hdfs://nameservice1/user/hive/warehouse/iceberg_db_01.db/iceberg_001/metadata/00008-6e9a36e4-7dcb-4dfb-8b9c-74bcf5080e8a.metadata.json
20/10/27 15:15:28 INFO iceberg.BaseMetastoreCatalog: Table loaded by catalog: hive.iceberg_db_01.iceberg_001
20/10/27 15:15:28 INFO mr.Catalogs: Loaded Hive Metastore catalog HiveCatalog{name=hive, uri=thrift://...:9083}
20/10/27 15:15:28 INFO iceberg.BaseMetastoreTableOperations: Refreshing table metadata from new version: hdfs://nameservice1/user/hive/warehouse/iceberg_db_01.db/iceberg_001/metadata/00008-6e9a36e4-7dcb-4dfb-8b9c-74bcf5080e8a.metadata.json
20/10/27 15:15:28 INFO iceberg.BaseMetastoreCatalog: Table loaded by catalog: hive.iceberg_db_01.iceberg_001
20/10/27 15:15:28 ERROR exec.DDLTask: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.NullPointerException: Please provide a table schema
at org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:868)
at org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:873)
at org.apache.hadoop.hive.ql.exec.DDLTask.createTable(DDLTask.java:4291)
at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:343)
at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:199)
at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:97)
at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:2200)
at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1843)
at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1563)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1339)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1328)
at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:239)
at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:187)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:409)
at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:836)
at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:772)
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:699)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.util.RunJar.run(RunJar.java:313)
at org.apache.hadoop.util.RunJar.main(RunJar.java:227)
Caused by: java.lang.NullPointerException: Please provide a table schema
at org.apache.iceberg.relocated.com.google.common.base.Preconditions.checkNotNull(Preconditions.java:897)
at org.apache.iceberg.mr.hive.HiveIcebergMetaHook.preCreateTable(HiveIcebergMetaHook.java:93)
at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.createTable(HiveMetaStoreClient.java:822)
at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.createTable(HiveMetaStoreClient.java:813)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:154)
at com.sun.proxy.$Proxy34.createTable(Unknown Source)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.hive.metastore.HiveMetaStoreClient$SynchronizedHandler.invoke(HiveMetaStoreClient.java:2562)
at com.sun.proxy.$Proxy34.createTable(Unknown Source)
at org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:858)
... 22 more
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. java.lang.NullPointerException: Please provide a table schema
20/10/27 15:15:28 ERROR ql.Driver: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. java.lang.NullPointerException: Please provide a table schema
20/10/27 15:15:28 INFO ql.Driver: Completed executing command(queryId=root_20201027151528_7ddc1a58-4c8a-44ef-9ac2-55ec510f2e7d); Time taken: 0.142 seconds
20/10/27 15:15:28 INFO conf.HiveConf: Using the default value passed in for log id: d1298959-7aa5-4ecb-883e-fc302e1714c3
20/10/27 15:15:28 INFO session.SessionState: Resetting thread name to main
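For anyone reading the trace above: the exception is raised by a `Preconditions.checkNotNull` call inside `HiveIcebergMetaHook.preCreateTable`, so the NullPointerException looks like a deliberate argument check with a message rather than an accidental null dereference. The following is only a minimal sketch of that pattern, not the Iceberg implementation. The class name, the property key, and the idea that the schema is read from table properties are all assumptions made for illustration; the trace does not show where the real hook looks for the schema. `java.util.Objects.requireNonNull` stands in for the relocated Guava `Preconditions.checkNotNull`.

```java
import java.util.Objects;
import java.util.Properties;

// Hypothetical sketch: NOT the Iceberg HiveIcebergMetaHook code.
class SchemaPreconditionSketch {
    // Hypothetical property key, used only for this illustration.
    static final String SCHEMA_KEY = "example.table.schema";

    // Mimics a pre-create hook that refuses to proceed without a schema.
    static String requireSchema(Properties tableProperties) {
        String schema = tableProperties.getProperty(SCHEMA_KEY);
        // Like Guava's Preconditions.checkNotNull, this throws a
        // NullPointerException that carries the supplied message.
        return Objects.requireNonNull(schema, "Please provide a table schema");
    }

    // Returns "ok", or the failure in the "class: message" form seen in the log.
    static String describe(Properties tableProperties) {
        try {
            requireSchema(tableProperties);
            return "ok";
        } catch (NullPointerException e) {
            return e.toString();
        }
    }

    public static void main(String[] args) {
        // No schema property set: the check fails with the message from the log.
        System.out.println(describe(new Properties()));

        // Schema property present: the check passes.
        Properties withSchema = new Properties();
        withSchema.setProperty(SCHEMA_KEY, "{\"type\":\"struct\",\"fields\":[]}");
        System.out.println(describe(withSchema));
    }
}
```

If the real check works along these lines, the failure means the hook did not find a schema where it expected one at table creation time, even though the log shows the existing table was loaded from the catalog just before.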
Yes, we do need more documentation for Hive. I think this is something I'll work on after the 0.10.0 release candidate is out.
Yes, it's actually on my list to do this as the next addition to the Hive documentation. I'm writing a set of integration tests that create tables using each of the possible methods (Hadoop Tables, Hive Catalog, Custom Catalog etc.) and then query them from Hive. Once a complete path works, I document it, since I'm then sure it works end to end. So far I've only finished HadoopTables, but I've just started on the Hive Catalog path. I'll work on this when I get the time and raise a PR when appropriate, but of course if someone gets to this first I'm happy to review whatever they come up with.
This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occurs. To permanently prevent this issue from being considered stale, add the label 'not-stale', but commenting on the issue is preferred when possible.
This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale'