[Bug] Failed to finish checkpoint due to the exception: 'Failed to read 4 bytes'
### Search before asking
- [x] I searched in the issues and found nothing similar.
### Paimon version

Paimon version: 1.0.1
Storage: OSS
### Compute Engine
Flink 1.20.1
### Minimal reproduce step

Some of my Flink SQL tasks throw 'Failed to read 4 bytes' while taking a checkpoint, after running for several hours. The task reads from some Paimon tables and writes to a Paimon table that uses the partial-update merge engine. The source code of the task:

```sql
SET 'parallelism.default' = '2';
SET 'table.exec.sink.not-null-enforcer' = 'DROP';

CREATE TABLE IF NOT EXISTS masked_schema.masked_table (
    masked_field_1 STRING,
    masked_field_2 STRING,
    masked_field_3 STRING,
    masked_field_4 STRING,
    masked_field_5 STRING,
    masked_field_6 STRING,
    masked_field_7 STRING,
    masked_field_8 STRING,
    masked_field_9 TIMESTAMP,
    masked_field_10 ARRAY<ROW<masked_subfield_1 STRING, masked_subfield_2 STRING, masked_subfield_3 STRING>>,
    PRIMARY KEY (masked_field_1, masked_field_2) NOT ENFORCED
) WITH (
    'bucket' = '10',
    'merge-engine' = 'partial-update',
    'changelog-producer' = 'full-compaction',
    'full-compaction.delta-commits' = '3',
    'snapshot.time-retained' = '24h',
    'fields.masked_field_10.aggregate-function' = 'nested_update',
    'fields.masked_field_10.nested-key' = 'masked_subfield_1,masked_subfield_2,masked_subfield_3',
    'fields.masked_field_9.sequence-group' = 'masked_field_10'
);
INSERT INTO masked_schema.masked_table
SELECT
masked_field_1,
masked_field_2,
masked_field_3,
masked_field_4,
masked_field_5,
masked_field_6,
masked_field_7,
masked_field_8,
CAST(NULL AS TIMESTAMP) AS masked_field_9,
CAST(NULL AS ARRAY<ROW<masked_subfield_1 STRING,masked_subfield_2 STRING,masked_subfield_3 STRING>>) AS masked_field_10
FROM
masked_schema.masked_source_table_1 /*+ OPTIONS('scan.infer-parallelism' = 'false', 'consumer-id' = 'masked_table', 'consumer.expiration-time' = '2d', 'consumer.mode' = 'exactly-once') */
WHERE
masked_flag ='0'
UNION ALL
SELECT
masked_field_1,
masked_field_2,
CAST(NULL AS STRING) AS masked_field_3,
CAST(NULL AS STRING) AS masked_field_4,
CAST(NULL AS STRING) AS masked_field_5,
CAST(NULL AS STRING) AS masked_field_6,
CAST(NULL AS STRING) AS masked_field_7,
CAST(NULL AS STRING) AS masked_field_8,
IFNULL(IFNULL(masked_update_time, masked_create_time), NOW()) AS masked_field_9,
ARRAY[ROW(masked_subfield_1,masked_subfield_2,masked_subfield_3)] AS masked_field_10
FROM
masked_schema.masked_source_table_2 /*+ OPTIONS('scan.infer-parallelism' = 'false', 'consumer-id' = 'masked_table', 'consumer.expiration-time' = '2d', 'consumer.mode' = 'exactly-once') */
WHERE
masked_subfield_1 NOT IN ('-','','0') AND masked_field_1 NOT IN ('-','','0');
```
### What doesn't meet your expectations?

```
2025-04-09 17:32:53
java.io.IOException: Could not perform checkpoint 323 for operator Writer : tmp_table_name (2/2)#156.
at org.apache.flink.streaming.runtime.tasks.StreamTask.triggerCheckpointOnBarrier(StreamTask.java:1394)
at org.apache.flink.streaming.runtime.io.checkpointing.CheckpointBarrierHandler.notifyCheckpoint(CheckpointBarrierHandler.java:147)
at org.apache.flink.streaming.runtime.io.checkpointing.SingleCheckpointBarrierHandler.triggerCheckpoint(SingleCheckpointBarrierHandler.java:287)
at org.apache.flink.streaming.runtime.io.checkpointing.SingleCheckpointBarrierHandler.access$100(SingleCheckpointBarrierHandler.java:64)
at org.apache.flink.streaming.runtime.io.checkpointing.SingleCheckpointBarrierHandler$ControllerImpl.triggerGlobalCheckpoint(SingleCheckpointBarrierHandler.java:488)
at org.apache.flink.streaming.runtime.io.checkpointing.AbstractAlignedBarrierHandlerState.triggerGlobalCheckpoint(AbstractAlignedBarrierHandlerState.java:74)
at org.apache.flink.streaming.runtime.io.checkpointing.AbstractAlignedBarrierHandlerState.barrierReceived(AbstractAlignedBarrierHandlerState.java:66)
at org.apache.flink.streaming.runtime.io.checkpointing.SingleCheckpointBarrierHandler.lambda$processBarrier$2(SingleCheckpointBarrierHandler.java:234)
at org.apache.flink.streaming.runtime.io.checkpointing.SingleCheckpointBarrierHandler.markCheckpointAlignedAndTransformState(SingleCheckpointBarrierHandler.java:262)
at org.apache.flink.streaming.runtime.io.checkpointing.SingleCheckpointBarrierHandler.processBarrier(SingleCheckpointBarrierHandler.java:231)
at org.apache.flink.streaming.runtime.io.checkpointing.CheckpointedInputGate.handleEvent(CheckpointedInputGate.java:181)
at org.apache.flink.streaming.runtime.io.checkpointing.CheckpointedInputGate.pollNext(CheckpointedInputGate.java:159)
at org.apache.flink.streaming.runtime.io.AbstractStreamTaskNetworkInput.emitNext(AbstractStreamTaskNetworkInput.java:122)
at org.apache.flink.streaming.runtime.io.StreamOneInputProcessor.processInput(StreamOneInputProcessor.java:65)
at org.apache.flink.streaming.runtime.tasks.StreamTask.processInput(StreamTask.java:638)
at org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxLoop(MailboxProcessor.java:231)
at org.apache.flink.streaming.runtime.tasks.StreamTask.runMailboxLoop(StreamTask.java:973)
at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:917)
at org.apache.flink.runtime.taskmanager.Task.runWithSystemExitMonitoring(Task.java:970)
at org.apache.flink.runtime.taskmanager.Task.restoreAndInvoke(Task.java:949)
at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:763)
at org.apache.flink.runtime.taskmanager.Task.run(Task.java:575)
at java.lang.Thread.run(Thread.java:750)
Caused by: java.io.IOException: java.util.concurrent.ExecutionException: java.lang.RuntimeException: org.apache.paimon.shade.org.apache.parquet.io.ParquetDecodingException: Failed to read 4 bytes
at org.apache.paimon.flink.sink.StoreSinkWriteImpl.prepareCommit(StoreSinkWriteImpl.java:234)
at org.apache.paimon.flink.sink.GlobalFullCompactionSinkWrite.prepareCommit(GlobalFullCompactionSinkWrite.java:184)
at org.apache.paimon.flink.sink.TableWriteOperator.prepareCommit(TableWriteOperator.java:127)
at org.apache.paimon.flink.sink.RowDataStoreWriteOperator.prepareCommit(RowDataStoreWriteOperator.java:198)
at org.apache.paimon.flink.sink.PrepareCommitOperator.emitCommittables(PrepareCommitOperator.java:104)
at org.apache.paimon.flink.sink.PrepareCommitOperator.prepareSnapshotPreBarrier(PrepareCommitOperator.java:84)
at org.apache.flink.streaming.runtime.tasks.RegularOperatorChain.prepareSnapshotPreBarrier(RegularOperatorChain.java:89)
at org.apache.flink.streaming.runtime.tasks.SubtaskCheckpointCoordinatorImpl.checkpointState(SubtaskCheckpointCoordinatorImpl.java:332)
at org.apache.flink.streaming.runtime.tasks.StreamTask.lambda$performCheckpoint$18(StreamTask.java:1437)
at org.apache.flink.streaming.runtime.tasks.StreamTaskActionExecutor$1.runThrowing(StreamTaskActionExecutor.java:50)
at org.apache.flink.streaming.runtime.tasks.StreamTask.performCheckpoint(StreamTask.java:1425)
at org.apache.flink.streaming.runtime.tasks.StreamTask.triggerCheckpointOnBarrier(StreamTask.java:1382)
... 22 more
Caused by: java.util.concurrent.ExecutionException: java.lang.RuntimeException: org.apache.paimon.shade.org.apache.parquet.io.ParquetDecodingException: Failed to read 4 bytes
at java.util.concurrent.FutureTask.report(FutureTask.java:122)
at java.util.concurrent.FutureTask.get(FutureTask.java:192)
at org.apache.paimon.compact.CompactFutureManager.obtainCompactResult(CompactFutureManager.java:67)
at org.apache.paimon.compact.CompactFutureManager.innerGetCompactionResult(CompactFutureManager.java:53)
at org.apache.paimon.mergetree.compact.MergeTreeCompactManager.getCompactionResult(MergeTreeCompactManager.java:223)
at org.apache.paimon.mergetree.MergeTreeWriter.trySyncLatestCompaction(MergeTreeWriter.java:328)
at org.apache.paimon.mergetree.MergeTreeWriter.prepareCommit(MergeTreeWriter.java:276)
at org.apache.paimon.operation.AbstractFileStoreWrite.prepareCommit(AbstractFileStoreWrite.java:218)
at org.apache.paimon.operation.MemoryFileStoreWrite.prepareCommit(MemoryFileStoreWrite.java:155)
at org.apache.paimon.table.sink.TableWriteImpl.prepareCommit(TableWriteImpl.java:253)
at org.apache.paimon.flink.sink.StoreSinkWriteImpl.prepareCommit(StoreSinkWriteImpl.java:229)
... 33 more
Caused by: java.lang.RuntimeException: org.apache.paimon.shade.org.apache.parquet.io.ParquetDecodingException: Failed to read 4 bytes
at org.apache.paimon.reader.RecordReaderIterator.
```
### Anything else?
No response
### Are you willing to submit a PR?
- [ ] I'm willing to submit a PR!
We also met this problem, and we suspect compaction can leave some files corrupted. Is there a solution?
We also met the same problem. Paimon version: 1.0, Flink version: 1.17.1.

```
org.apache.flink.runtime.JobException: Recovery is suppressed by FailureRateRestartBackoffTimeStrategy(FailureRateRestartBackoffTimeStrategy(failuresIntervalMS=3600000,backoffTimeMS=10000,maxFailuresPerInterval=3)
at org.apache.flink.runtime.executiongraph.failover.flip1.ExecutionFailureHandler.handleFailure(ExecutionFailureHandler.java:139)
at org.apache.flink.runtime.executiongraph.failover.flip1.ExecutionFailureHandler.getFailureHandlingResult(ExecutionFailureHandler.java:83)
at org.apache.flink.runtime.scheduler.DefaultScheduler.recordTaskFailure(DefaultScheduler.java:258)
at org.apache.flink.runtime.scheduler.DefaultScheduler.handleTaskFailure(DefaultScheduler.java:249)
at org.apache.flink.runtime.scheduler.DefaultScheduler.onTaskFailed(DefaultScheduler.java:242)
at org.apache.flink.runtime.scheduler.SchedulerBase.onTaskExecutionStateUpdate(SchedulerBase.java:748)
at org.apache.flink.runtime.scheduler.SchedulerBase.updateTaskExecutionState(SchedulerBase.java:725)
at org.apache.flink.runtime.scheduler.SchedulerNG.updateTaskExecutionState(SchedulerNG.java:80)
at org.apache.flink.runtime.jobmaster.JobMaster.updateTaskExecutionState(JobMaster.java:480)
at sun.reflect.GeneratedMethodAccessor89.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.lambda$handleRpcInvocation$1(AkkaRpcActor.java:309)
at org.apache.flink.runtime.concurrent.akka.ClassLoadingUtils.runWithContextClassLoader(ClassLoadingUtils.java:83)
at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRpcInvocation(AkkaRpcActor.java:307)
at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRpcMessage(AkkaRpcActor.java:222)
at org.apache.flink.runtime.rpc.akka.FencedAkkaRpcActor.handleRpcMessage(FencedAkkaRpcActor.java:84)
at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleMessage(AkkaRpcActor.java:168)
at akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:24)
at akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:20)
at scala.PartialFunction.applyOrElse(PartialFunction.scala:127)
at scala.PartialFunction.applyOrElse$(PartialFunction.scala:126)
at akka.japi.pf.UnitCaseStatement.applyOrElse(CaseStatements.scala:20)
at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:175)
at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:176)
at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:176)
at akka.actor.Actor.aroundReceive(Actor.scala:537)
at akka.actor.Actor.aroundReceive$(Actor.scala:535)
at akka.actor.AbstractActor.aroundReceive(AbstractActor.scala:220)
at akka.actor.ActorCell.receiveMessage(ActorCell.scala:579)
at akka.actor.ActorCell.invoke(ActorCell.scala:547)
at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:270)
at akka.dispatch.Mailbox.run(Mailbox.scala:231)
at akka.dispatch.Mailbox.exec(Mailbox.scala:243)
at java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:289)
at java.util.concurrent.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1056)
at java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1692)
at java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:175)
Caused by: java.io.IOException: Could not perform checkpoint 11 for operator Writer : cdm_intltrn_ord_gds_order_detail_rt (1/2)#3.
at org.apache.flink.streaming.runtime.tasks.StreamTask.triggerCheckpointOnBarrier(StreamTask.java:1256)
at org.apache.flink.streaming.runtime.io.checkpointing.CheckpointBarrierHandler.notifyCheckpoint(CheckpointBarrierHandler.java:147)
at org.apache.flink.streaming.runtime.io.checkpointing.SingleCheckpointBarrierHandler.triggerCheckpoint(SingleCheckpointBarrierHandler.java:287)
at org.apache.flink.streaming.runtime.io.checkpointing.SingleCheckpointBarrierHandler.access$100(SingleCheckpointBarrierHandler.java:64)
at org.apache.flink.streaming.runtime.io.checkpointing.SingleCheckpointBarrierHandler$ControllerImpl.triggerGlobalCheckpoint(SingleCheckpointBarrierHandler.java:488)
at org.apache.flink.streaming.runtime.io.checkpointing.AbstractAlignedBarrierHandlerState.triggerGlobalCheckpoint(AbstractAlignedBarrierHandlerState.java:74)
at org.apache.flink.streaming.runtime.io.checkpointing.AbstractAlignedBarrierHandlerState.barrierReceived(AbstractAlignedBarrierHandlerState.java:66)
at org.apache.flink.streaming.runtime.io.checkpointing.SingleCheckpointBarrierHandler.lambda$processBarrier$2(SingleCheckpointBarrierHandler.java:234)
at org.apache.flink.streaming.runtime.io.checkpointing.SingleCheckpointBarrierHandler.markCheckpointAlignedAndTransformState(SingleCheckpointBarrierHandler.java:262)
at org.apache.flink.streaming.runtime.io.checkpointing.SingleCheckpointBarrierHandler.processBarrier(SingleCheckpointBarrierHandler.java:231)
at org.apache.flink.streaming.runtime.io.checkpointing.CheckpointedInputGate.handleEvent(CheckpointedInputGate.java:181)
at org.apache.flink.streaming.runtime.io.checkpointing.CheckpointedInputGate.pollNext(CheckpointedInputGate.java:159)
at org.apache.flink.streaming.runtime.io.AbstractStreamTaskNetworkInput.emitNext(AbstractStreamTaskNetworkInput.java:118)
at org.apache.flink.streaming.runtime.io.StreamOneInputProcessor.processInput(StreamOneInputProcessor.java:65)
at org.apache.flink.streaming.runtime.tasks.StreamTask.processInput(StreamTask.java:550)
at org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxLoop(MailboxProcessor.java:231)
at org.apache.flink.streaming.runtime.tasks.StreamTask.runMailboxLoop(StreamTask.java:839)
at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:788)
at org.apache.flink.runtime.taskmanager.Task.runWithSystemExitMonitoring(Task.java:952)
at org.apache.flink.runtime.taskmanager.Task.restoreAndInvoke(Task.java:931)
at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:745)
at org.apache.flink.runtime.taskmanager.Task.run(Task.java:562)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.io.IOException: java.util.concurrent.ExecutionException: org.apache.paimon.shade.org.apache.parquet.io.ParquetDecodingException: Failed to read 4352 bytes
at org.apache.paimon.flink.sink.StoreSinkWriteImpl.prepareCommit(StoreSinkWriteImpl.java:234)
at org.apache.paimon.flink.sink.TableWriteOperator.prepareCommit(TableWriteOperator.java:127)
at org.apache.paimon.flink.sink.PrepareCommitOperator.emitCommittables(PrepareCommitOperator.java:104)
at org.apache.paimon.flink.sink.PrepareCommitOperator.prepareSnapshotPreBarrier(PrepareCommitOperator.java:84)
at org.apache.flink.streaming.runtime.tasks.RegularOperatorChain.prepareSnapshotPreBarrier(RegularOperatorChain.java:89)
at org.apache.flink.streaming.runtime.tasks.SubtaskCheckpointCoordinatorImpl.checkpointState(SubtaskCheckpointCoordinatorImpl.java:321)
at org.apache.flink.streaming.runtime.tasks.StreamTask.lambda$performCheckpoint$14(StreamTask.java:1299)
at org.apache.flink.streaming.runtime.tasks.StreamTaskActionExecutor$1.runThrowing(StreamTaskActionExecutor.java:50)
at org.apache.flink.streaming.runtime.tasks.StreamTask.performCheckpoint(StreamTask.java:1287)
at org.apache.flink.streaming.runtime.tasks.StreamTask.triggerCheckpointOnBarrier(StreamTask.java:1244)
... 22 more
Caused by: java.util.concurrent.ExecutionException: org.apache.paimon.shade.org.apache.parquet.io.ParquetDecodingException: Failed to read 4352 bytes
at java.util.concurrent.FutureTask.report(FutureTask.java:122)
at java.util.concurrent.FutureTask.get(FutureTask.java:192)
at org.apache.paimon.compact.CompactFutureManager.obtainCompactResult(CompactFutureManager.java:67)
at org.apache.paimon.compact.CompactFutureManager.innerGetCompactionResult(CompactFutureManager.java:53)
at org.apache.paimon.mergetree.compact.MergeTreeCompactManager.getCompactionResult(MergeTreeCompactManager.java:223)
at org.apache.paimon.mergetree.MergeTreeWriter.trySyncLatestCompaction(MergeTreeWriter.java:328)
at org.apache.paimon.mergetree.MergeTreeWriter.flushWriteBuffer(MergeTreeWriter.java:258)
at org.apache.paimon.mergetree.MergeTreeWriter.prepareCommit(MergeTreeWriter.java:264)
at org.apache.paimon.operation.AbstractFileStoreWrite.prepareCommit(AbstractFileStoreWrite.java:218)
at org.apache.paimon.operation.MemoryFileStoreWrite.prepareCommit(MemoryFileStoreWrite.java:155)
at org.apache.paimon.table.sink.TableWriteImpl.prepareCommit(TableWriteImpl.java:253)
at org.apache.paimon.flink.sink.StoreSinkWriteImpl.prepareCommit(StoreSinkWriteImpl.java:229)
... 31 more
Caused by: org.apache.paimon.shade.org.apache.parquet.io.ParquetDecodingException: Failed to read 4352 bytes
at org.apache.paimon.format.parquet.reader.AbstractColumnReader.readDataBuffer(AbstractColumnReader.java:361)
at org.apache.paimon.format.parquet.reader.LongColumnReader.readLongs(LongColumnReader.java:121)
at org.apache.paimon.format.parquet.reader.LongColumnReader.readBatch(LongColumnReader.java:51)
at org.apache.paimon.format.parquet.reader.LongColumnReader.readBatch(LongColumnReader.java:32)
at org.apache.paimon.format.parquet.reader.AbstractColumnReader.readToVector(AbstractColumnReader.java:229)
at org.apache.paimon.format.parquet.ParquetReaderFactory$ParquetReader.nextBatch(ParquetReaderFactory.java:411)
at org.apache.paimon.format.parquet.ParquetReaderFactory$ParquetReader.readBatch(ParquetReaderFactory.java:380)
at org.apache.paimon.format.parquet.ParquetReaderFactory$ParquetReader.readBatch(ParquetReaderFactory.java:319)
at org.apache.paimon.io.DataFileRecordReader.readBatch(DataFileRecordReader.java:66)
at org.apache.paimon.io.KeyValueDataFileRecordReader.readBatch(KeyValueDataFileRecordReader.java:50)
at org.apache.paimon.io.KeyValueDataFileRecordReader.readBatch(KeyValueDataFileRecordReader.java:34)
at org.apache.paimon.mergetree.compact.LoserTree$LeafIterator.advanceIfAvailable(LoserTree.java:315)
at org.apache.paimon.mergetree.compact.LoserTree.adjustForNextLoop(LoserTree.java:98)
at org.apache.paimon.mergetree.compact.SortMergeReaderWithLoserTree$SortMergeIterator.next(SortMergeReaderWithLoserTree.java:89)
at org.apache.paimon.reader.RecordReaderIterator.advanceIfNeeded(RecordReaderIterator.java:76)
at org.apache.paimon.reader.RecordReaderIterator.hasNext(RecordReaderIterator.java:55)
at org.apache.paimon.mergetree.compact.ChangelogMergeTreeRewriter.rewriteOrProduceChangelog(ChangelogMergeTreeRewriter.java:143)
at org.apache.paimon.mergetree.compact.ChangelogMergeTreeRewriter.rewrite(ChangelogMergeTreeRewriter.java:106)
at org.apache.paimon.mergetree.compact.MergeTreeCompactTask.rewriteImpl(MergeTreeCompactTask.java:157)
at org.apache.paimon.mergetree.compact.MergeTreeCompactTask.rewrite(MergeTreeCompactTask.java:152)
at org.apache.paimon.mergetree.compact.MergeTreeCompactTask.doCompact(MergeTreeCompactTask.java:105)
at org.apache.paimon.compact.CompactTask.call(CompactTask.java:49)
at org.apache.paimon.compact.CompactTask.call(CompactTask.java:34)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
... 1 more
Caused by: java.io.EOFException
at org.apache.paimon.shade.org.apache.parquet.bytes.SingleBufferInputStream.slice(SingleBufferInputStream.java:116)
at org.apache.paimon.format.parquet.reader.AbstractColumnReader.readDataBuffer(AbstractColumnReader.java:359)
... 28 more
```
To avoid this issue, we had to change the table's file format to 'orc', and it works.
Maybe it's because of null values? We found that files containing null values can trigger the ParquetDecodingException; after filtering out the null values, it works fine.
+1. I'm having the same problem.
> To avoid this issue, we had to change the table's file format to 'orc', and it works.
Switching to the ORC format resolved the error. Thanks a lot!
Why does switching to the ORC format solve this issue?