Search before asking
- [x] I searched in the issues and found nothing similar.
Paimon version
0.9.0 client
Compute Engine
StarRocks
Minimal reproduce step
May be the same problem as https://github.com/StarRocks/starrocks/issues/54070
What doesn't meet your expectations?
[26.794s][warning][gc,alloc] Thread-5: Retried waiting for GCLocker too often allocating 1048578 words
Exception in thread "Thread-5" java.lang.OutOfMemoryError: Java heap space
at java.base/java.nio.HeapByteBuffer.<init>(HeapByteBuffer.java:64)
at java.base/java.nio.ByteBuffer.allocate(ByteBuffer.java:363)
at org.apache.paimon.shade.org.apache.parquet.bytes.HeapByteBufferAllocator.allocate(HeapByteBufferAllocator.java:32)
at org.apache.paimon.shade.org.apache.parquet.hadoop.ParquetFileReader$ConsecutivePartList.readAll(ParquetFileReader.java:1502)
at org.apache.paimon.shade.org.apache.parquet.hadoop.ParquetFileReader.readAllPartsVectoredOrNormal(ParquetFileReader.java:553)
at org.apache.paimon.shade.org.apache.parquet.hadoop.ParquetFileReader.internalReadRowGroup(ParquetFileReader.java:447)
at org.apache.paimon.shade.org.apache.parquet.hadoop.ParquetFileReader.readNextRowGroup(ParquetFileReader.java:396)
at org.apache.paimon.format.parquet.ParquetReaderFactory$ParquetReader.readNextRowGroup(ParquetReaderFactory.java:348)
at org.apache.paimon.format.parquet.ParquetReaderFactory$ParquetReader.nextBatch(ParquetReaderFactory.java:327)
at org.apache.paimon.format.parquet.ParquetReaderFactory$ParquetReader.readBatch(ParquetReaderFactory.java:309)
at org.apache.paimon.io.FileRecordReader.readBatch(FileRecordReader.java:47)
at org.apache.paimon.io.KeyValueDataFileRecordReader.readBatch(KeyValueDataFileRecordReader.java:48)
at org.apache.paimon.mergetree.compact.ConcatRecordReader.readBatch(ConcatRecordReader.java:66)
at org.apache.paimon.mergetree.compact.LoserTree$LeafIterator.advanceIfAvailable(LoserTree.java:315)
at org.apache.paimon.mergetree.compact.LoserTree.initializeIfNeeded(LoserTree.java:87)
at org.apache.paimon.mergetree.compact.SortMergeReaderWithLoserTree.readBatch(SortMergeReaderWithLoserTree.java:71)
at org.apache.paimon.mergetree.DropDeleteReader.readBatch(DropDeleteReader.java:44)
at org.apache.paimon.reader.RecordReader$1.readBatch(RecordReader.java:173)
at org.apache.paimon.table.source.KeyValueTableRead$1.readBatch(KeyValueTableRead.java:131)
at org.apache.paimon.reader.RecordReader$2.readBatch(RecordReader.java:194)
at org.apache.paimon.reader.RecordReaderIterator.<init>(RecordReaderIterator.java:37)
at com.starrocks.paimon.reader.PaimonSplitScanner.initReader(PaimonSplitScanner.java:106)
at com.starrocks.paimon.reader.PaimonSplitScanner.open(PaimonSplitScanner.java:115)
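For context on the size of the failing allocation: the `gc,alloc` warning reports it in heap words, not bytes. A minimal sketch of the conversion, assuming a 64-bit HotSpot JVM where one heap word is 8 bytes (the class and method names are only for illustration):

```java
public class GcLockerAllocationSize {
    // Assumption: 64-bit HotSpot, where one heap word is 8 bytes.
    static final int HEAP_WORD_BYTES = 8;

    /** Converts an allocation size in heap words to bytes. */
    static long wordsToBytes(long words) {
        return Math.multiplyExact(words, (long) HEAP_WORD_BYTES);
    }

    public static void main(String[] args) {
        long words = 1_048_578L;          // value taken from the gc,alloc warning
        long bytes = wordsToBytes(words); // 8388624
        long eightMiB = 8L * 1024 * 1024; // 8388608
        System.out.println(bytes + " bytes requested");
        System.out.println((bytes - eightMiB) + " bytes over 8 MiB");
    }
}
```

Under that assumption the request is 8 MiB plus 16 bytes, which would fit a single 8 MiB `byte[]` plus its array header. If so, the failing allocation is one ordinary read buffer rather than one oversized buffer, which suggests the heap was already nearly full when the row group was read.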
Anything else?
No response
Are you willing to submit a PR?
- [ ] I'm willing to submit a PR!
The Parquet version may be too old. ParquetFileReader.java is currently based on Parquet 1.13.1; you could try upgrading to Parquet 1.15.1 to see whether that helps.
I tried https://github.com/apache/paimon/pull/5421, but the problem is still there:
[13.443s][warning][gc,alloc] Thread-24: Retried waiting for GCLocker too often allocating 1048578 words
Exception in thread "Thread-24" java.lang.OutOfMemoryError: Java heap space
at java.base/java.nio.HeapByteBuffer.<init>(HeapByteBuffer.java:64)
at java.base/java.nio.ByteBuffer.allocate(ByteBuffer.java:363)
at org.apache.paimon.shade.org.apache.parquet.bytes.HeapByteBufferAllocator.allocate(HeapByteBufferAllocator.java:34)
at org.apache.paimon.shade.org.apache.parquet.hadoop.ParquetFileReader$ConsecutivePartList.readAll(ParquetFileReader.java:1550)
at org.apache.paimon.shade.org.apache.parquet.hadoop.ParquetFileReader.readAllPartsVectoredOrNormal(ParquetFileReader.java:578)
at org.apache.paimon.shade.org.apache.parquet.hadoop.ParquetFileReader.internalReadRowGroup(ParquetFileReader.java:471)
at org.apache.paimon.shade.org.apache.parquet.hadoop.ParquetFileReader.readNextRowGroup(ParquetFileReader.java:420)
at org.apache.paimon.format.parquet.ParquetReaderFactory$ParquetReader.readNextRowGroup(ParquetReaderFactory.java:348)
at org.apache.paimon.format.parquet.ParquetReaderFactory$ParquetReader.nextBatch(ParquetReaderFactory.java:327)
at org.apache.paimon.format.parquet.ParquetReaderFactory$ParquetReader.readBatch(ParquetReaderFactory.java:309)
at org.apache.paimon.io.FileRecordReader.readBatch(FileRecordReader.java:47)
at org.apache.paimon.io.KeyValueDataFileRecordReader.readBatch(KeyValueDataFileRecordReader.java:48)
at org.apache.paimon.mergetree.compact.ConcatRecordReader.readBatch(ConcatRecordReader.java:66)
at org.apache.paimon.mergetree.compact.LoserTree$LeafIterator.advanceIfAvailable(LoserTree.java:315)
at org.apache.paimon.mergetree.compact.LoserTree.initializeIfNeeded(LoserTree.java:87)
at org.apache.paimon.mergetree.compact.SortMergeReaderWithLoserTree.readBatch(SortMergeReaderWithLoserTree.java:71)
at org.apache.paimon.mergetree.DropDeleteReader.readBatch(DropDeleteReader.java:44)
at org.apache.paimon.reader.RecordReader$1.readBatch(RecordReader.java:173)
at org.apache.paimon.table.source.KeyValueTableRead$1.readBatch(KeyValueTableRead.java:131)
at org.apache.paimon.reader.RecordReader$2.readBatch(RecordReader.java:194)
at org.apache.paimon.reader.RecordReaderIterator.<init>(RecordReaderIterator.java:37)
at com.starrocks.paimon.reader.PaimonSplitScanner.initReader(PaimonSplitScanner.java:106)
at com.starrocks.paimon.reader.PaimonSplitScanner.open(PaimonSplitScanner.java:115)
Exception in thread "Thread-47" java.lang.OutOfMemoryError: Java heap space
at java.base/java.nio.HeapByteBuffer.<init>(HeapByteBuffer.java:64)
at java.base/java.nio.ByteBuffer.allocate(ByteBuffer.java:363)
at org.apache.paimon.shade.org.apache.parquet.bytes.HeapByteBufferAllocator.allocate(HeapByteBufferAllocator.java:34)
at org.apache.paimon.shade.org.apache.parquet.hadoop.ParquetFileReader$ConsecutivePartList.readAll(ParquetFileReader.java:1550)
at org.apache.paimon.shade.org.apache.parquet.hadoop.ParquetFileReader.readAllPartsVectoredOrNormal(ParquetFileReader.java:578)
at org.apache.paimon.shade.org.apache.parquet.hadoop.ParquetFileReader.internalReadRowGroup(ParquetFileReader.java:471)
at org.apache.paimon.shade.org.apache.parquet.hadoop.ParquetFileReader.readNextRowGroup(ParquetFileReader.java:420)
at org.apache.paimon.format.parquet.ParquetReaderFactory$ParquetReader.readNextRowGroup(ParquetReaderFactory.java:348)
at org.apache.paimon.format.parquet.ParquetReaderFactory$ParquetReader.nextBatch(ParquetReaderFactory.java:327)
at org.apache.paimon.format.parquet.ParquetReaderFactory$ParquetReader.readBatch(ParquetReaderFactory.java:309)
at org.apache.paimon.io.FileRecordReader.readBatch(FileRecordReader.java:47)
at org.apache.paimon.io.KeyValueDataFileRecordReader.readBatch(KeyValueDataFileRecordReader.java:48)
at org.apache.paimon.mergetree.compact.ConcatRecordReader.readBatch(ConcatRecordReader.java:66)
at org.apache.paimon.mergetree.compact.LoserTree$LeafIterator.advanceIfAvailable(LoserTree.java:315)
at org.apache.paimon.mergetree.compact.LoserTree.initializeIfNeeded(LoserTree.java:87)
at org.apache.paimon.mergetree.compact.SortMergeReaderWithLoserTree.readBatch(SortMergeReaderWithLoserTree.java:71)
at org.apache.paimon.mergetree.DropDeleteReader.readBatch(DropDeleteReader.java:44)
at org.apache.paimon.reader.RecordReader$1.readBatch(RecordReader.java:173)
at org.apache.paimon.table.source.KeyValueTableRead$1.readBatch(KeyValueTableRead.java:131)
at org.apache.paimon.reader.RecordReader$2.readBatch(RecordReader.java:194)
at org.apache.paimon.reader.RecordReaderIterator.<init>(RecordReaderIterator.java:37)
at com.starrocks.paimon.reader.PaimonSplitScanner.initReader(PaimonSplitScanner.java:106)
at com.starrocks.paimon.reader.PaimonSplitScanner.open(PaimonSplitScanner.java:115)
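Several threads (Thread-5, Thread-24, Thread-47) fail on the same allocation, each below `LoserTree.initializeIfNeeded`, so it may be worth checking how many such read buffers the heap can hold at once. A rough sketch, assuming each buffer is about 8 MiB plus a 16-byte array header (inferred from the 1048578-word warning on a 64-bit JVM); `HeapHeadroom` and `chunksThatFit` are hypothetical names, not part of Paimon or StarRocks:

```java
public class HeapHeadroom {
    // Assumption: one read buffer is an 8 MiB byte[] plus a 16-byte header.
    static final long CHUNK_BYTES = 8L * 1024 * 1024 + 16;

    /** Returns how many whole chunks of chunkBytes fit into heapBytes. */
    static long chunksThatFit(long heapBytes, long chunkBytes) {
        if (chunkBytes <= 0) {
            throw new IllegalArgumentException("chunkBytes must be positive");
        }
        return heapBytes / chunkBytes;
    }

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        long max = rt.maxMemory();                      // upper bound set by -Xmx
        long used = rt.totalMemory() - rt.freeMemory(); // currently occupied heap
        long headroom = max - used;
        System.out.printf("max heap: %d MiB%n", max / (1024 * 1024));
        System.out.printf("headroom: %d MiB%n", headroom / (1024 * 1024));
        System.out.printf("8 MiB buffers that still fit: %d%n",
                chunksThatFit(headroom, CHUNK_BYTES));
    }
}
```

This only gives a meaningful number when run with the same `-Xmx` as the JVM that hosts the Paimon reader. It ignores fragmentation and everything else living on the heap, so treat the result as an upper bound.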