
[SUPPORT] Caused by: org.apache.hudi.exception.HoodieMetadataException: Failed to retrieve list of partition from metadata

bigdata-spec opened this issue 2 years ago · 2 comments

Environment Description

* Hudi version : 0.11.0
* Flink version : 1.13.1
* Hive version : 2.1.1-cdh6.2.0
* Hadoop version : 3.0.0-cdh6.2.0
* Storage (HDFS/S3/GCS..) : HDFS
* Running on Docker? (yes/no) : no

When I execute this Spark SQL (zone_dw.dws_stat_day_di is a Hive table and zone_dw.dws_map_refresh_hi is a Hudi COW table):

    SET TBLPROPERTIES ("hoodie.metadata.enable"="false");
    INSERT OVERWRITE TABLE zone_dw.dws_stat_day_di PARTITION(dt)
    SELECT * FROM zone_dw.dws_map_refresh_hi WHERE dt BETWEEN '20220812' AND '20220819';

the error log is:

    22/08/20 02:56:58 ERROR client.RemoteDriver: Failed to run client job 39d720db-b15d-4823-b8b1-54398b143d6e
    org.apache.hudi.exception.HoodieException: Error fetching partition paths from metadata table
        at org.apache.hudi.common.fs.FSUtils.getAllPartitionPaths(FSUtils.java:315)
        at org.apache.hudi.BaseHoodieTableFileIndex.getAllQueryPartitionPaths(BaseHoodieTableFileIndex.java:176)
        at org.apache.hudi.BaseHoodieTableFileIndex.loadPartitionPathFiles(BaseHoodieTableFileIndex.java:219)
        at org.apache.hudi.BaseHoodieTableFileIndex.doRefresh(BaseHoodieTableFileIndex.java:264)
        at org.apache.hudi.BaseHoodieTableFileIndex.<init>(BaseHoodieTableFileIndex.java:139)
        at org.apache.hudi.hadoop.HiveHoodieTableFileIndex.<init>(HiveHoodieTableFileIndex.java:49)
        at org.apache.hudi.hadoop.HoodieCopyOnWriteTableInputFormat.listStatusForSnapshotMode(HoodieCopyOnWriteTableInputFormat.java:234)
        at org.apache.hudi.hadoop.HoodieCopyOnWriteTableInputFormat.listStatus(HoodieCopyOnWriteTableInputFormat.java:141)
        at org.apache.hudi.hadoop.HoodieParquetInputFormatBase.listStatus(HoodieParquetInputFormatBase.java:90)
        at org.apache.hudi.hadoop.hive.HoodieCombineHiveInputFormat$HoodieCombineFileInputFormatShim.listStatus(HoodieCombineHiveInputFormat.java:889)
        at org.apache.hadoop.mapreduce.lib.input.CombineFileInputFormat.getSplits(CombineFileInputFormat.java:217)
        at org.apache.hadoop.mapred.lib.CombineFileInputFormat.getSplits(CombineFileInputFormat.java:76)
        at org.apache.hudi.hadoop.hive.HoodieCombineHiveInputFormat$HoodieCombineFileInputFormatShim.getSplits(HoodieCombineHiveInputFormat.java:942)
        at org.apache.hudi.hadoop.hive.HoodieCombineHiveInputFormat.getCombineSplits(HoodieCombineHiveInputFormat.java:241)
        at org.apache.hudi.hadoop.hive.HoodieCombineHiveInputFormat.getSplits(HoodieCombineHiveInputFormat.java:363)
        at org.apache.spark.rdd.HadoopRDD.getPartitions(HadoopRDD.scala:205)
        at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:253)
        at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:251)
        at scala.Option.getOrElse(Option.scala:121)
        at org.apache.spark.rdd.RDD.partitions(RDD.scala:251)
        at org.apache.spark.rdd.RDD.getNumPartitions(RDD.scala:267)
        at org.apache.spark.api.java.JavaRDDLike$class.getNumPartitions(JavaRDDLike.scala:65)
        at org.apache.spark.api.java.AbstractJavaRDDLike.getNumPartitions(JavaRDDLike.scala:45)
        at org.apache.hadoop.hive.ql.exec.spark.SparkPlanGenerator.generateMapInput(SparkPlanGenerator.java:252)
        at org.apache.hadoop.hive.ql.exec.spark.SparkPlanGenerator.generateParentTran(SparkPlanGenerator.java:179)
        at org.apache.hadoop.hive.ql.exec.spark.SparkPlanGenerator.generate(SparkPlanGenerator.java:130)
        at org.apache.hadoop.hive.ql.exec.spark.RemoteHiveSparkClient$JobStatusJob.call(RemoteHiveSparkClient.java:355)
        at org.apache.hive.spark.client.RemoteDriver$JobWrapper.call(RemoteDriver.java:400)
        at org.apache.hive.spark.client.RemoteDriver$JobWrapper.call(RemoteDriver.java:365)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)
    Caused by: org.apache.hudi.exception.HoodieMetadataException: Failed to retrieve list of partition from metadata
        at org.apache.hudi.metadata.BaseTableMetadata.getAllPartitionPaths(BaseTableMetadata.java:113)
        at org.apache.hudi.common.fs.FSUtils.getAllPartitionPaths(FSUtils.java:313)
        ... 32 more
    Caused by: java.util.NoSuchElementException: No value present in Option
        at org.apache.hudi.common.util.Option.get(Option.java:89)
        at org.apache.hudi.metadata.HoodieTableMetadataUtil.getPartitionFileSlices(HoodieTableMetadataUtil.java:1057)
        at org.apache.hudi.metadata.HoodieTableMetadataUtil.getPartitionLatestMergedFileSlices(HoodieTableMetadataUtil.java:1001)
        at org.apache.hudi.metadata.HoodieBackedTableMetadata.getPartitionFileSliceToKeysMapping(HoodieBackedTableMetadata.java:377)
        at org.apache.hudi.metadata.HoodieBackedTableMetadata.getRecordsByKeys(HoodieBackedTableMetadata.java:204)
        at org.apache.hudi.metadata.HoodieBackedTableMetadata.getRecordByKey(HoodieBackedTableMetadata.java:140)
        at org.apache.hudi.metadata.BaseTableMetadata.fetchAllPartitionPaths(BaseTableMetadata.java:281)
        at org.apache.hudi.metadata.BaseTableMetadata.getAllPartitionPaths(BaseTableMetadata.java:111)
        ... 33 more
    22/08/20 02:56:59 INFO client.RemoteDriver: Shutting down Spark Remote Driver.
    22/08/20 02:56:59 INFO server.AbstractConnector: Stopped Spark@ce7a81b{HTTP/1.1,[http/1.1]}{0.0.0.0:0}
    22/08/20 02:56:59 INFO ui.SparkUI: Stopped Spark web UI at http://scsp04097:34219
    22/08/20 02:56:59 INFO yarn.YarnAllocator: Driver requested a total number of 0 executor(s).
    22/08/20 02:56:59 INFO cluster.YarnClusterSchedulerBackend: Shutting down all executors
    22/08/20 02:56:59 INFO cluster.YarnSchedulerBackend$YarnDriverEndpoint: Asking each executor to shut down
    22/08/20 02:56:59 INFO cluster.SchedulerExtensionServices: Stopping SchedulerExtensionServices (serviceOption=None, services=List(), started=false)
    22/08/20 02:56:59 INFO spark.MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
    22/08/20 02:56:59 INFO memory.MemoryStore: MemoryStore cleared

But when I run the SQL again, it works fine.

bigdata-spec — Aug 22 '22 01:08

So, if I understand correctly: when you disable the metadata table, you can read the table without any issues, but if you enable the metadata table, you see the above exception. Is that right?

Can you share the contents of .hoodie? Maybe some data in the metadata table is corrupt. I am not sure, but I want to inspect it and see what's happening.
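
For anyone gathering that, here is a minimal sketch using the standard Hadoop FileSystem API to dump everything under .hoodie (the class name and table path are placeholders, not from this issue):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.LocatedFileStatus;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.fs.RemoteIterator;

    public class ListHoodieMeta {
      public static void main(String[] args) throws Exception {
        // Placeholder path -- replace with the actual base path of the Hudi table.
        Path hoodieDir = new Path("hdfs:///warehouse/zone_dw/dws_map_refresh_hi/.hoodie");
        FileSystem fs = hoodieDir.getFileSystem(new Configuration());
        // Recursively list every file under .hoodie (including the metadata
        // table under .hoodie/metadata) together with its size.
        RemoteIterator<LocatedFileStatus> it = fs.listFiles(hoodieDir, true);
        while (it.hasNext()) {
          LocatedFileStatus status = it.next();
          System.out.println(status.getLen() + "\t" + status.getPath());
        }
      }
    }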

nsivabalan — Aug 27 '22 19:08

@jiangbiao910: can you respond to the above request, please?

nsivabalan — Sep 20 '22 23:09

Looking at the code, this is what I can infer. Here is the code block of interest:

    if (mergeFileSlices) {
      fileSliceStream = fsView.getLatestMergedFileSlicesBeforeOrOn(
          partition, metaClient.getActiveTimeline().filterCompletedInstants().lastInstant().get().getTimestamp());
    } else {
      // ... (non-merging branch elided)
    }

lastInstant() returns an Option, and it looks like we are calling get() on it unconditionally. The only reason it would be empty is if the table has no completed commits at all, i.e. the metadata table was just initialized, and a query was triggered before the first commit to the metadata table completed, hence this error. I assume the issue is not persistent; it should not happen after some time.
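
For illustration, a minimal sketch of the kind of guard that would avoid the get() on an empty Option in the snippet above (the empty-list fallback is my assumption here, not necessarily what the actual fix does):

    Option<HoodieInstant> lastInstant =
        metaClient.getActiveTimeline().filterCompletedInstants().lastInstant();
    if (!lastInstant.isPresent()) {
      // No completed instant yet (metadata table just initialized):
      // bail out instead of calling get() on an empty Option.
      return Collections.emptyList();
    }
    fileSliceStream = fsView.getLatestMergedFileSlicesBeforeOrOn(
        partition, lastInstant.get().getTimestamp());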

nsivabalan — Sep 29 '22 01:09

I can put in a fix to mitigate the "no completed instant" issue with the metadata table.

nsivabalan — Sep 29 '22 01:09

https://github.com/apache/hudi/pull/6836

nsivabalan — Sep 30 '22 03:09

Closing the GitHub issue as we have a fix. Thanks for reporting.

nsivabalan — Sep 30 '22 03:09

@nsivabalan Thank you for your reply. If I don't set "hoodie.metadata.enable"="false", it throws "java.lang.NoSuchMethodError: org.apache.hadoop.hdfs.client.HdfsDataInputStream.getReadStatistics()". If I do set "hoodie.metadata.enable"="false", it often (but not every time) throws "Caused by: org.apache.hudi.exception.HoodieMetadataException: Failed to retrieve list of partition from metadata", but when I run the SQL again, it works fine. I think HBase relies on Hadoop 2.10.0, but our environment is CDH-6.3.2 and the Hadoop version is 3.0.
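
A generic way to check a jar conflict like that (an illustrative snippet using only standard JDK APIs, not something from this issue) is to print which jar actually supplied the class at runtime:

    public class WhichJar {
      public static void main(String[] args) throws Exception {
        // Load the class whose getReadStatistics() method is reported missing
        // and print the jar it came from; a hadoop-hdfs 2.x jar on the
        // classpath would confirm the HBase/Hadoop version conflict.
        Class<?> c = Class.forName("org.apache.hadoop.hdfs.client.HdfsDataInputStream");
        System.out.println(c.getProtectionDomain().getCodeSource().getLocation());
      }
    }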

bigdata-spec — Sep 30 '22 05:09