paimon
[Bug] Reading ORC files compressed with zstd fails with ZstdException: Data corruption detected
Search before asking
- [x] I searched in the issues and found nothing similar.
Paimon version
detail:
Caused by: com.github.luben.zstd.ZstdException: Data corruption detected
at com.github.luben.zstd.ZstdDecompressCtx.decompressByteArray(ZstdDecompressCtx.java:205)
at com.github.luben.zstd.Zstd.decompressByteArray(Zstd.java:439)
at org.apache.paimon.shade.org.apache.orc.impl.ZstdCodec.decompress(ZstdCodec.java:259)
at org.apache.paimon.shade.org.apache.orc.impl.InStream$CompressedStream.readHeader(InStream.java:521)
at org.apache.paimon.shade.org.apache.orc.impl.InStream$CompressedStream.ensureUncompressed(InStream.java:548)
at org.apache.paimon.shade.org.apache.orc.impl.InStream$CompressedStream.read(InStream.java:535)
at org.apache.paimon.shade.org.apache.orc.impl.TreeReaderFactory$BytesColumnVectorUtil.commonReadByteArrays(TreeReaderFactory.java:2060)
at org.apache.paimon.shade.org.apache.orc.impl.TreeReaderFactory$BytesColumnVectorUtil.readOrcByteArrays(TreeReaderFactory.java:2079)
at org.apache.paimon.shade.org.apache.orc.impl.TreeReaderFactory$StringDirectTreeReader.nextVector(TreeReaderFactory.java:2177)
at org.apache.paimon.shade.org.apache.orc.impl.TreeReaderFactory$StringTreeReader.nextVector(TreeReaderFactory.java:2009)
at org.apache.paimon.shade.org.apache.orc.impl.TreeReaderFactory$StructTreeReader.nextVector(TreeReaderFactory.java:2633)
at org.apache.paimon.shade.org.apache.orc.impl.reader.tree.StructBatchReader.readBatchColumn(StructBatchReader.java:65)
at org.apache.paimon.shade.org.apache.orc.impl.reader.tree.StructBatchReader.nextBatchForLevel(StructBatchReader.java:100)
at org.apache.paimon.shade.org.apache.orc.impl.reader.tree.StructBatchReader.nextBatch(StructBatchReader.java:77)
at org.apache.paimon.shade.org.apache.orc.impl.RecordReaderImpl.nextBatch(RecordReaderImpl.java:1579)
at org.apache.paimon.format.orc.OrcReaderFactory.nextBatch(OrcReaderFactory.java:322)
at org.apache.paimon.format.orc.OrcReaderFactory.access$100(OrcReaderFactory.java:66)
at org.apache.paimon.format.orc.OrcReaderFactory$OrcVectorizedReader.readBatch(OrcReaderFactory.java:235)
at org.apache.paimon.format.orc.OrcReaderFactory$OrcVectorizedReader.readBatch(OrcReaderFactory.java:217)
at org.apache.paimon.reader.RecordReaderIterator.<init>(RecordReaderIterator.java:37)
Compute Engine
Flink
Minimal reproduce step
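Currently we do not know how to reproduce this problem. Reading the affected file shows that the problem is in the header: the failure surfaces in InStream$CompressedStream.readHeader, as the stack trace above shows.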
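For anyone inspecting a suspect file by hand, here is a minimal sketch of decoding the chunk header that readHeader parses. It assumes the layout described in the ORC specification: each compression chunk starts with a 3-byte little-endian value equal to `chunkLength * 2 + isOriginal`, where `isOriginal` marks a chunk stored uncompressed. The class `OrcChunkHeader` and its methods are hypothetical helpers written for illustration; they are not part of Paimon or ORC.

```java
/**
 * Hypothetical debugging helper (not a Paimon or ORC class).
 * Decodes the 3-byte header that precedes each ORC compression chunk,
 * assuming the layout from the ORC specification:
 * little-endian (chunkLength * 2 + isOriginal).
 */
final class OrcChunkHeader {
    /** Length in bytes of the chunk body that follows the header. */
    final int length;
    /** True if the body is stored as-is, false if it must be decompressed. */
    final boolean original;

    OrcChunkHeader(int length, boolean original) {
        this.length = length;
        this.original = original;
    }

    /** Decodes the header found at buf[offset .. offset + 2]. */
    static OrcChunkHeader decode(byte[] buf, int offset) {
        if (offset < 0 || buf.length - offset < 3) {
            throw new IllegalArgumentException("need 3 bytes for a chunk header");
        }
        int raw = (buf[offset] & 0xff)
                | ((buf[offset + 1] & 0xff) << 8)
                | ((buf[offset + 2] & 0xff) << 16);
        // The lowest bit is the "original" flag; the upper 23 bits are the length.
        return new OrcChunkHeader(raw >>> 1, (raw & 1) == 1);
    }

    /**
     * Sanity check: a chunk body should not be larger than the
     * compression block size the file was written with.
     */
    boolean fitsIn(int compressionBlockSize) {
        return length <= compressionBlockSize;
    }
}
```

If the decoded length is larger than the file's compression block size, or the flag says "compressed" while the following bytes do not start a valid zstd frame, that would point to the header bytes themselves being damaged rather than the chunk body. This is only a way to narrow things down, not a confirmed diagnosis of this issue.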
What doesn't meet your expectations?
If anyone else has run into this problem, please share more context.
Anything else?
No response
Are you willing to submit a PR?
- [ ] I'm willing to submit a PR!