[Bug] java.lang.NoSuchMethodError: org.apache.parquet.hadoop.ParquetFileReader.<init> when running BranchSqlITCase in IDEA
Search before asking
- [X] I searched in the issues and found nothing similar.
Paimon version
paimon-1.0-SNAPSHOT
Compute Engine
flink
Minimal reproduce step
Run BranchSqlITCase in IDEA.
What doesn't meet your expectations?
java.lang.NoSuchMethodError: org.apache.parquet.hadoop.ParquetFileReader.<init>(Lorg/apache/parquet/io/InputFile;Lorg/apache/parquet/ParquetReadOptions;Lorg/apache/paimon/fileindex/FileIndexResult;)V
at org.apache.paimon.format.parquet.ParquetUtil.getParquetReader(ParquetUtil.java:85)
at org.apache.paimon.format.parquet.ParquetUtil.extractColumnStats(ParquetUtil.java:52)
at org.apache.paimon.format.parquet.ParquetSimpleStatsExtractor.extractWithFileInfo(ParquetSimpleStatsExtractor.java:78)
at org.apache.paimon.format.parquet.ParquetSimpleStatsExtractor.extract(ParquetSimpleStatsExtractor.java:71)
at org.apache.paimon.io.StatsCollectingSingleFileWriter.fieldStats(StatsCollectingSingleFileWriter.java:88)
at org.apache.paimon.io.RowDataFileWriter.result(RowDataFileWriter.java:109)
at org.apache.paimon.io.RowDataFileWriter.result(RowDataFileWriter.java:48)
at org.apache.paimon.io.RollingFileWriter.closeCurrentWriter(RollingFileWriter.java:136)
at org.apache.paimon.io.RollingFileWriter.close(RollingFileWriter.java:168)
at org.apache.paimon.append.AppendOnlyWriter$DirectSinkWriter.flush(AppendOnlyWriter.java:418)
at org.apache.paimon.append.AppendOnlyWriter.flush(AppendOnlyWriter.java:219)
at org.apache.paimon.append.AppendOnlyWriter.prepareCommit(AppendOnlyWriter.java:207)
at org.apache.paimon.operation.AbstractFileStoreWrite.prepareCommit(AbstractFileStoreWrite.java:210)
at org.apache.paimon.operation.MemoryFileStoreWrite.prepareCommit(MemoryFileStoreWrite.java:154)
at org.apache.paimon.table.sink.TableWriteImpl.prepareCommit(TableWriteImpl.java:253)
at org.apache.paimon.flink.sink.StoreSinkWriteImpl.prepareCommit(StoreSinkWriteImpl.java:229)
at org.apache.paimon.flink.sink.TableWriteOperator.prepareCommit(TableWriteOperator.java:123)
at org.apache.paimon.flink.sink.RowDataStoreWriteOperator.prepareCommit(RowDataStoreWriteOperator.java:189)
at org.apache.paimon.flink.sink.PrepareCommitOperator.emitCommittables(PrepareCommitOperator.java:100)
at org.apache.paimon.flink.sink.PrepareCommitOperator.endInput(PrepareCommitOperator.java:88)
Anything else?
No response
Are you willing to submit a PR?
- [ ] I'm willing to submit a PR!
We have encountered errors like NoSuchMethodError and ClassNotFoundException many times when executing tests in IDEA. This is mainly because we override some format classes, shade packages, and so on.
Should we create a separate paimon-shaded project for these shaded packages and let paimon depend on the shaded artifacts? @JingsongLi @Zouxxyy WDYT?
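For reference, a quick way to see which copy of a class the IDE actually resolved is to print its code source. A minimal sketch (WhichJar is just an illustrative name, not Paimon code):

```java
// Prints the jar or classes directory that ParquetFileReader was loaded from,
// so you can tell whether the IDE picked Paimon's overridden copy or the
// upstream parquet-hadoop one.
import org.apache.parquet.hadoop.ParquetFileReader;

public class WhichJar {
    public static void main(String[] args) {
        // getCodeSource() may be null for bootstrap classes,
        // but not for classes loaded from a jar or module output
        System.out.println(ParquetFileReader.class
                .getProtectionDomain()
                .getCodeSource()
                .getLocation());
    }
}
```

Running this inside the failing test module shows immediately whether paimon-format or parquet-hadoop wins on the classpath.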
Couldn't agree more, I have been tortured by this for a long time.
I pushed a fix for this in the first commit of #4520; maybe that solves it.
This is mainly because we override Parquet's ParquetFileReader class with our own version. If we have a paimon-shade project, do we have to put Paimon's ParquetFileReader in it?
Yes, we should put all the needed classes in it.
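One way to confirm which version is shadowing which is to probe for the constructor from the stack trace above via reflection. A hedged sketch (CheckPatchedReader is a hypothetical class name; it assumes Paimon's patched copy declares that constructor as public, since getConstructor only sees public members):

```java
// Probes for the three-argument constructor that only Paimon's patched
// ParquetFileReader has, per the NoSuchMethodError signature above.
import org.apache.paimon.fileindex.FileIndexResult;
import org.apache.parquet.ParquetReadOptions;
import org.apache.parquet.hadoop.ParquetFileReader;
import org.apache.parquet.io.InputFile;

public class CheckPatchedReader {
    public static void main(String[] args) {
        try {
            ParquetFileReader.class.getConstructor(
                    InputFile.class, ParquetReadOptions.class, FileIndexResult.class);
            System.out.println("Paimon's patched ParquetFileReader is on the classpath");
        } catch (NoSuchMethodException e) {
            System.out.println("A vanilla parquet-hadoop jar is shadowing Paimon's copy");
        }
    }
}
```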
I also encountered it. Until paimon-shade exists, we can solve it this way for the time being.
Running org.apache.paimon.spark.sql.DDLWithHiveCatalogTestBase hits a similar error. I moved paimon-format above parquet in the dependency order, but it still fails:
java.lang.BootstrapMethodError: java.lang.NoSuchMethodError: org.apache.parquet.hadoop.ParquetWriter$Builder.withBloomFilterFPP(Ljava/lang/String;D)Lorg/apache/parquet/hadoop/ParquetWriter$Builder;
at org.apache.paimon.format.parquet.writer.RowDataParquetBuilder.createWriter(RowDataParquetBuilder.java:95)
at org.apache.paimon.format.parquet.ParquetWriterFactory.create(ParquetWriterFactory.java:52)
at org.apache.paimon.io.SingleFileWriter.<init>(SingleFileWriter.java:74)
at org.apache.paimon.io.StatsCollectingSingleFileWriter.<init>(StatsCollectingSingleFileWriter.java:58)
at org.apache.paimon.io.RowDataFileWriter.<init>(RowDataFileWriter.java:70)
at org.apache.paimon.io.RowDataRollingFileWriter.lambda$new$0(RowDataRollingFileWriter.java:59)
at org.apache.paimon.io.RollingFileWriter.openCurrentWriter(RollingFileWriter.java:123)
at org.apache.paimon.io.RollingFileWriter.write(RollingFileWriter.java:78)
at org.apache.paimon.append.AppendOnlyWriter$DirectSinkWriter.write(AppendOnlyWriter.java:403)
at org.apache.paimon.append.AppendOnlyWriter.write(AppendOnlyWriter.java:161)
at org.apache.paimon.append.AppendOnlyWriter.write(AppendOnlyWriter.java:66)
at org.apache.paimon.operation.AbstractFileStoreWrite.write(AbstractFileStoreWrite.java:150)
at org.apache.paimon.table.sink.TableWriteImpl.writeAndReturn(TableWriteImpl.java:175)
at org.apache.paimon.table.sink.TableWriteImpl.write(TableWriteImpl.java:147)
at org.apache.paimon.spark.SparkTableWrite.write(SparkTableWrite.scala:40)
at org.apache.paimon.spark.commands.PaimonSparkWriter.$anonfun$write$2(PaimonSparkWriter.scala:94)
at org.apache.paimon.spark.commands.PaimonSparkWriter.$anonfun$write$2$adapted(PaimonSparkWriter.scala:94)
at scala.collection.Iterator.foreach(Iterator.scala:943)
at scala.collection.Iterator.foreach$(Iterator.scala:943)
at scala.collection.AbstractIterator.foreach(Iterator.scala:1431)
at org.apache.paimon.spark.commands.PaimonSparkWriter.$anonfun$write$1(PaimonSparkWriter.scala:94)
at org.apache.spark.sql.execution.MapPartitionsExec.$anonfun$doExecute$3(objects.scala:201)
at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsInternal$2(RDD.scala:898)
at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsInternal$2$adapted(RDD.scala:898)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:373)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:337)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:373)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:337)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:373)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:337)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
at org.apache.spark.scheduler.Task.run(Task.scala:131)
at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:506)
at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1491)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:509)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:750)
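From the missing method this looks like an older parquet-hadoop jar, one without the bloom filter builder API, winning on the test classpath. A quick check, assuming only the method signature shown in the stack trace (CheckBloomFilterApi is a hypothetical helper name):

```java
// Checks whether the ParquetWriter.Builder visible to the test JVM has
// withBloomFilterFPP(String, double) -- the method from the stack trace --
// and prints which jar the class came from, to spot a stale parquet-hadoop
// copy earlier on the IDE classpath.
import org.apache.parquet.hadoop.ParquetWriter;

public class CheckBloomFilterApi {
    public static void main(String[] args) {
        Class<?> builder = ParquetWriter.Builder.class;
        System.out.println("Builder loaded from: "
                + builder.getProtectionDomain().getCodeSource().getLocation());
        try {
            builder.getMethod("withBloomFilterFPP", String.class, double.class);
            System.out.println("withBloomFilterFPP(String, double) is available");
        } catch (NoSuchMethodException e) {
            System.out.println("Older parquet-hadoop Builder; bloom filter API is missing");
        }
    }
}
```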
+1 for a dedicated repo to hold the shaded format classes; we have encountered this several times when running tests in the IDE.
