[SUPPORT] spark-sql can't create hudi table

Open txl2017 opened this issue 1 year ago • 4 comments

Tips before filing an issue

  • Have you gone through our FAQs?

  • Join the mailing list to engage in conversations and get faster support at [email protected].

  • If you have triaged this as a bug, then file an issue directly.

Describe the problem you faced

A clear and concise description of the problem.

To Reproduce

Steps to reproduce the behavior:

1. spark-sql --jars /home/hudi-spark3.1-bundle_2.12-0.11.1.jar --conf 'spark.serializer=org.apache.spark.serializer.KryoSerializer' --conf 'spark.sql.extensions=org.apache.spark.sql.hudi.HoodieSparkSessionExtension'

2. create table hudi_spark.hudi_cow_nonpcf_tb1 ( uuid int, name string, price double ) using hudi;

3. The statement returns with these warnings:

   22/07/18 11:32:53 WARN config.DFSPropertiesConfiguration: Cannot find HUDI_CONF_DIR, please set it as the dir of hudi-defaults.conf
   22/07/18 11:32:53 WARN config.DFSPropertiesConfiguration: Properties file file:/etc/hudi/conf/hudi-defaults.conf not found. Ignoring to load props file
   22/07/18 11:32:55 WARN command.CreateHoodieTableCommand: Failed to create catalog table in metastore: org.apache.hudi.hadoop.HoodieParquetInputFormat
   Time taken: 2.437 seconds

4. show tables; lists no table, so the table was not created. The folder /user/hive/warehouse/hudi_spark.db/hudi_cow_nonpcf_tb1 was created successfully.

Expected behavior

A clear and concise description of what you expected to happen.

Environment Description

  • Hudi version : 0.11.1
  • Spark version : 3.1.2
  • Hive version : 2.1.1
  • Hadoop version : 3.0.0
  • Storage (HDFS/S3/GCS..) : HDFS
  • Running on Docker? (yes/no) : no

Additional context

Add any other context about the problem here.

Stacktrace

Add the stacktrace of the error.

txl2017 avatar Jul 18 '22 03:07 txl2017

@txl2017 I suspect this has something to do with your Hive version. Can you try Hive 2.3.x?

xushiyan avatar Jul 19 '22 04:07 xushiyan

Yes, ensure you use Hive 2.3+ and that your hive sync configs are set properly.

nsivabalan avatar Aug 28 '22 00:08 nsivabalan

@nsivabalan we are also facing the same issue. Our use case is similar to the one in this ticket, i.e. a. create a hudi table (it should get registered in the hive metastore) b. insert data into the hudi table using the "insert into.." command.

I referred to the documentation below, but it doesn't help: https://hudi.apache.org/docs/next/syncing_metastore#spark-datasource-example

Please share the documentation that explains how to enable hive sync when a create table statement using hudi is executed in either spark-shell or spark-sql. Our component versions are as follows:

  • spark : 3.0.2
  • hive : 3.1.2
  • hudi : 0.10.1

ashokblend avatar Sep 20 '22 08:09 ashokblend

After adding hudi-hadoop-mr-bundle-0.10.1.jar to the hiveserver lib, it works.

ashokblend avatar Sep 21 '22 04:09 ashokblend

> after adding hudi-hadoop-mr-bundle-0.10.1.jar to hiveserver lib, it works.

Thanks for confirming. Yes, the hudi-hadoop-mr-bundle jar should be on hive.aux.jars.path.

xushiyan avatar Oct 30 '22 05:10 xushiyan
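
The resolution in this thread can be checked before restarting Hive. The following is a minimal sketch, assuming Python 3 is available on the Hive host: it lists which jars in a directory contain the org.apache.hudi.hadoop.HoodieParquetInputFormat class named in the warning. The helper name jars_containing and the directory passed to it are illustrative only and are not part of Hudi or Hive.

```python
import zipfile
from pathlib import Path

# Class named in the metastore warning, written as its path inside a jar.
INPUT_FORMAT_ENTRY = "org/apache/hudi/hadoop/HoodieParquetInputFormat.class"


def jars_containing(lib_dir, entry=INPUT_FORMAT_ENTRY):
    """Return the sorted file names of jars directly under lib_dir that contain entry.

    lib_dir is whatever directory you want to inspect (for example the Hive
    lib directory or a directory listed in hive.aux.jars.path); the actual
    location on your cluster is an assumption you must supply.
    """
    found = []
    for jar in sorted(Path(lib_dir).glob("*.jar")):
        try:
            with zipfile.ZipFile(jar) as zf:
                if entry in zf.namelist():
                    found.append(jar.name)
        except zipfile.BadZipFile:
            # Skip files that end in .jar but are not readable zip archives.
            continue
    return found
```

Calling jars_containing("/path/to/hive/lib") (path hypothetical) and getting an empty list would be consistent with the warning in the issue; after copying in the hudi-hadoop-mr-bundle jar, that jar's file name should appear in the result.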