[SUPPORT] NoClassDefFoundError for org.apache.hudi.org.apache.hadoop.hbase.io.hfile.HFile
We occasionally hit the following exception when running a Flink writer job. The job does not self-heal, but it can be recovered by manually restarting the TaskManager.
The metadata table (MDT) was enabled.
java.lang.NoClassDefFoundError: Could not initialize class org.apache.hudi.org.apache.hadoop.hbase.io.hfile.HFile
at org.apache.hudi.io.storage.HoodieHFileUtils.createHFileReader(HoodieHFileUtils.java:56)
at org.apache.hudi.io.storage.HoodieAvroHFileReader.<init>(HoodieAvroHFileReader.java:101)
at org.apache.hudi.io.storage.HoodieAvroFileReaderFactory.newHFileFileReader(HoodieAvroFileReaderFactory.java:35)
at org.apache.hudi.io.storage.HoodieFileReaderFactory.getFileReader(HoodieFileReaderFactory.java:63)
at org.apache.hudi.metadata.HoodieBackedTableMetadata.getBaseFileReader(HoodieBackedTableMetadata.java:460)
at org.apache.hudi.metadata.HoodieBackedTableMetadata.openReaders(HoodieBackedTableMetadata.java:433)
at org.apache.hudi.metadata.HoodieBackedTableMetadata.getOrCreateReaders(HoodieBackedTableMetadata.java:425)
at org.apache.hudi.metadata.HoodieBackedTableMetadata.lambda$getRecordsByKeys$3(HoodieBackedTableMetadata.java:239)
at java.base/java.util.HashMap.forEach(Unknown Source)
at org.apache.hudi.metadata.HoodieBackedTableMetadata.getRecordsByKeys(HoodieBackedTableMetadata.java:237)
at org.apache.hudi.metadata.HoodieBackedTableMetadata.getRecordByKey(HoodieBackedTableMetadata.java:152)
at org.apache.hudi.metadata.BaseTableMetadata.fetchAllFilesInPartition(BaseTableMetadata.java:339)
at org.apache.hudi.metadata.BaseTableMetadata.getAllFilesInPartition(BaseTableMetadata.java:150)
at org.apache.hudi.metadata.HoodieMetadataFileSystemView.listPartition(HoodieMetadataFileSystemView.java:69)
at org.apache.hudi.common.table.view.AbstractTableFileSystemView.lambda$ensurePartitionLoadedCorrectly$16(AbstractTableFileSystemView.java:428)
at java.base/java.util.concurrent.ConcurrentHashMap.computeIfAbsent(Unknown Source)
at org.apache.hudi.common.table.view.AbstractTableFileSystemView.ensurePartitionLoadedCorrectly(AbstractTableFileSystemView.java:419)
at org.apache.hudi.common.table.view.AbstractTableFileSystemView.getLatestMergedFileSlicesBeforeOrOn(AbstractTableFileSystemView.java:854)
at org.apache.hudi.common.table.view.PriorityBasedFileSystemView.execute(PriorityBasedFileSystemView.java:104)
at org.apache.hudi.common.table.view.PriorityBasedFileSystemView.getLatestMergedFileSlicesBeforeOrOn(PriorityBasedFileSystemView.java:195)
at org.apache.hudi.sink.partitioner.profile.DeltaWriteProfile.smallFilesProfile(DeltaWriteProfile.java:62)
at org.apache.hudi.sink.partitioner.profile.WriteProfile.getSmallFiles(WriteProfile.java:191)
at org.apache.hudi.sink.partitioner.BucketAssigner.getSmallFileAssign(BucketAssigner.java:179)
at org.apache.hudi.sink.partitioner.BucketAssigner.addInsert(BucketAssigner.java:137)
at org.apache.hudi.sink.partitioner.BucketAssignFunction.getNewRecordLocation(BucketAssignFunction.java:215)
at org.apache.hudi.sink.partitioner.BucketAssignFunction.processRecord(BucketAssignFunction.java:194)
at org.apache.hudi.sink.partitioner.BucketAssignFunction.processElement(BucketAssignFunction.java:162)
at org.apache.flink.streaming.api.operators.KeyedProcessOperator.processElement(KeyedProcessOperator.java:83)
at org.apache.flink.streaming.runtime.tasks.OneInputStreamTask$StreamTaskNetworkOutput.emitRecord(OneInputStreamTask.java:233)
at org.apache.flink.streaming.runtime.io.AbstractStreamTaskNetworkInput.processElement(AbstractStreamTaskNetworkInput.java:134)
at org.apache.flink.streaming.runtime.io.AbstractStreamTaskNetworkInput.emitNext(AbstractStreamTaskNetworkInput.java:105)
at org.apache.flink.streaming.runtime.io.StreamOneInputProcessor.processInput(StreamOneInputProcessor.java:65)
at org.apache.flink.streaming.runtime.tasks.StreamTask.processInput(StreamTask.java:542)
at org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxLoop(MailboxProcessor.java:231)
at org.apache.flink.streaming.runtime.tasks.StreamTask.runMailboxLoop(StreamTask.java:831)
at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:780)
at org.apache.flink.runtime.taskmanager.Task.runWithSystemExitMonitoring(Task.java:935)
at org.apache.flink.runtime.taskmanager.Task.restoreAndInvoke(Task.java:914)
at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:728)
at org.apache.flink.runtime.taskmanager.Task.run(Task.java:550)
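For context, the writer is a plain Flink SQL streaming insert into a MERGE_ON_READ table with the metadata table turned on. The sketch below is illustrative only: the table name, schema, checkpoint interval, and S3 path are placeholders rather than our actual job; the option keys are the standard Hudi Flink connector options.

```java
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.table.api.bridge.java.StreamTableEnvironment;

public class HudiMdtWriterSketch {
  public static void main(String[] args) {
    StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
    env.enableCheckpointing(60_000); // Hudi commits are driven by Flink checkpoints
    StreamTableEnvironment tEnv = StreamTableEnvironment.create(env);

    // Placeholder table definition; the real job's schema and S3 path differ.
    tEnv.executeSql(
        "CREATE TABLE hudi_sink (\n"
            + "  uuid STRING PRIMARY KEY NOT ENFORCED,\n"
            + "  name STRING,\n"
            + "  ts   TIMESTAMP(3)\n"
            + ") WITH (\n"
            + "  'connector' = 'hudi',\n"
            + "  'path' = 's3a://my-bucket/warehouse/hudi_sink',\n"  // placeholder path
            + "  'table.type' = 'MERGE_ON_READ',\n"
            + "  'metadata.enabled' = 'true'\n"                      // MDT on, as in the failing job
            + ")");

    // The real job runs INSERT INTO hudi_sink SELECT ... from an upstream streaming source.
  }
}
```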
Environment Description
- Hudi version : bdb50ddccc9631317dfb06a06abc38cbd3714ce8
- Flink version : 1.16.1
- Hadoop version : 3.3.4
- Storage (HDFS/S3/GCS..) : S3