ARROW-17338: [Java] The maximum request memory of BaseVariableWidthVector should be limited to Integer.MAX_VALUE
We got an IndexOutOfBoundsException:
2022-08-03 09:33:34,076 Error executing query, currentState RUNNING, java.lang.RuntimeException: org.apache.spark.SparkException: Job aborted due to stage failure: Task 3315 in stage 5.0 failed 4 times, most recent failure: Lost task 3315.3 in stage 5.0 (TID 3926) (30.97.116.209 executor 49): java.lang.IndexOutOfBoundsException: index: 2147312542, length: 777713 (expected: range(0, 2147483648))
at org.apache.iceberg.shaded.org.apache.arrow.memory.ArrowBuf.checkIndex(ArrowBuf.java:699)
at org.apache.iceberg.shaded.org.apache.arrow.memory.ArrowBuf.setBytes(ArrowBuf.java:826)
at org.apache.iceberg.arrow.vectorized.parquet.VectorizedParquetDefinitionLevelReader$VarWidthReader.nextVal(VectorizedParquetDefinitionLevelReader.java:418)
at org.apache.iceberg.arrow.vectorized.parquet.VectorizedParquetDefinitionLevelReader$BaseReader.nextBatch(VectorizedParquetDefinitionLevelReader.java:235)
at org.apache.iceberg.arrow.vectorized.parquet.VectorizedPageIterator$VarWidthTypePageReader.nextVal(VectorizedPageIterator.java:353)
at org.apache.iceberg.arrow.vectorized.parquet.VectorizedPageIterator$BagePageReader.nextBatch(VectorizedPageIterator.java:161)
at org.apache.iceberg.arrow.vectorized.parquet.VectorizedColumnIterator$VarWidthTypeBatchReader.nextBatchOf(VectorizedColumnIterator.java:191)
at org.apache.iceberg.arrow.vectorized.parquet.VectorizedColumnIterator$BatchReader.nextBatch(VectorizedColumnIterator.java:74)
at org.apache.iceberg.arrow.vectorized.VectorizedArrowReader.read(VectorizedArrowReader.java:158)
at org.apache.iceberg.spark.data.vectorized.ColumnarBatchReader.read(ColumnarBatchReader.java:51)
at org.apache.iceberg.spark.data.vectorized.ColumnarBatchReader.read(ColumnarBatchReader.java:35)
at org.apache.iceberg.parquet.VectorizedParquetReader$FileIterator.next(VectorizedParquetReader.java:134)
at org.apache.iceberg.spark.source.BaseDataReader.next(BaseDataReader.java:98)
at org.apache.spark.sql.execution.datasources.v2.PartitionIterator.hasNext(DataSourceRDD.scala:79)
at org.apache.spark.sql.execution.datasources.v2.MetricsIterator.hasNext(DataSourceRDD.scala:112)
at org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:37)
at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:458)
at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage1.columnartorow_nextBatch_0$(Unknown Source)
at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage1.processNext(Unknown Source)
at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
at org.apache.spark.sql.execution.WholeStageCodegenExec$$anon$1.hasNext(WholeStageCodegenExec.scala:755)
at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:458)
The root cause is that the following code in BaseVariableWidthVector.handleSafe can fail to reallocate because of int overflow, which then leads to an IndexOutOfBoundsException when the data is written into the vector.
protected final void handleSafe(int index, int dataLength) {
  while (index >= getValueCapacity()) {
    reallocValidityAndOffsetBuffers();
  }
  final int startOffset = lastSet < 0 ? 0 : getStartOffset(lastSet + 1);
  // startOffset + dataLength could overflow
  while (valueBuffer.capacity() < (startOffset + dataLength)) {
    reallocDataBuffer();
  }
}
The offset width of BaseVariableWidthVector is 4 bytes (a 32-bit signed int), while the maximum memory allocation is Long.MAX_VALUE. This mismatch makes the memory allocation check ineffective: startOffset + dataLength can overflow before the check ever triggers a reallocation.
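For illustration only (this is not the actual patch), a minimal sketch showing how the 32-bit addition wraps around for the values seen in the stack trace, and how widening to long keeps the required size comparable against Integer.MAX_VALUE:

// Illustrative sketch, not Arrow code: plugs the values from the stack trace
// into the overflowing expression.
public class OffsetOverflowDemo {
  public static void main(String[] args) {
    int startOffset = 2_147_312_542; // close to Integer.MAX_VALUE, as in the stack trace
    int dataLength = 777_713;

    // 32-bit addition overflows to a negative value, so a check like
    // "valueBuffer.capacity() < startOffset + dataLength" never triggers a realloc.
    int wrapped = startOffset + dataLength;
    System.out.println(wrapped); // -2146877041

    // Widening to long preserves the true required size, so it can be
    // validated against Integer.MAX_VALUE before writing into the buffer.
    long required = (long) startOffset + dataLength;
    System.out.println(required);                     // 2148090255
    System.out.println(required > Integer.MAX_VALUE); // true
  }
}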
https://issues.apache.org/jira/browse/ARROW-17338
:warning: Ticket has not been started in JIRA, please click 'Start Progress'.
Hi @pitrou, could you help to review this when you are free? Thanks a lot.
Also cc @lwhite1 for review / opinions.
Is this modification consistent with the goals of https://issues.apache.org/jira/browse/ARROW-6112?
I think so. Regular binary/string types have 32-bit signed offsets so cannot handle more than a 2GB data buffer by construction.
It seems large string/binary types use a separate BaseLargeVariableWidthVector base class AFAICT (but feel free to check as I'm not a Java contributor).
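For reference, a hedged sketch contrasting the two vector families (it assumes the standard org.apache.arrow.vector API; the Large variants carry 8-byte offsets and are the intended path for data buffers beyond 2 GiB):

import java.nio.charset.StandardCharsets;
import org.apache.arrow.memory.BufferAllocator;
import org.apache.arrow.memory.RootAllocator;
import org.apache.arrow.vector.LargeVarCharVector;
import org.apache.arrow.vector.VarCharVector;

public class OffsetWidthDemo {
  public static void main(String[] args) {
    try (BufferAllocator allocator = new RootAllocator();
         // 32-bit offsets: total data buffer is capped at Integer.MAX_VALUE bytes
         VarCharVector regular = new VarCharVector("regular", allocator);
         // 64-bit offsets: intended for data buffers larger than 2 GiB
         LargeVarCharVector large = new LargeVarCharVector("large", allocator)) {
      regular.allocateNew();
      large.allocateNew();
      byte[] value = "hello".getBytes(StandardCharsets.UTF_8);
      regular.setSafe(0, value);
      large.setSafe(0, value);
      regular.setValueCount(1);
      large.setValueCount(1);
      System.out.println(regular.getObject(0) + " / " + large.getObject(0));
    }
  }
}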
Thanks @pitrou, @lwhite1 for taking the time to review this. The unit test has been added. Please take another look when you are free. Thanks again.
Sorry for the delay @ConeyLiu . Given the several +1's I'm gonna merge if/when CI is green.
Benchmark runs are scheduled for baseline = f0688d01c465417e6f3515f9344154ad6f47ba22 and contender = 4fa40072a776662fdb48e99e7671784b1ee12545. 4fa40072a776662fdb48e99e7671784b1ee12545 is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
Conbench compare runs links:
[Finished :arrow_down:0.0% :arrow_up:0.0%] ec2-t3-xlarge-us-east-2
[Failed :arrow_down:1.4% :arrow_up:0.24%] test-mac-arm
[Failed :arrow_down:1.37% :arrow_up:0.27%] ursa-i9-9960x
[Finished :arrow_down:1.32% :arrow_up:0.36%] ursa-thinkcentre-m75q
Buildkite builds:
[Finished] 4fa40072 ec2-t3-xlarge-us-east-2
[Failed] 4fa40072 test-mac-arm
[Failed] 4fa40072 ursa-i9-9960x
[Finished] 4fa40072 ursa-thinkcentre-m75q
[Finished] f0688d01 ec2-t3-xlarge-us-east-2
[Failed] f0688d01 test-mac-arm
[Failed] f0688d01 ursa-i9-9960x
[Finished] f0688d01 ursa-thinkcentre-m75q
Supported benchmarks:
ec2-t3-xlarge-us-east-2: Supported benchmark langs: Python, R. Runs only benchmarks with cloud = True
test-mac-arm: Supported benchmark langs: C++, Python, R
ursa-i9-9960x: Supported benchmark langs: Python, R, JavaScript
ursa-thinkcentre-m75q: Supported benchmark langs: C++, Java
['Python', 'R'] benchmarks have high level of regressions. ursa-i9-9960x
Thanks, @pitrou, for merging this, and thanks to everyone who took the time to review it.
While working on https://github.com/apache/spark/pull/39572 to support large variable-width vectors in Spark, I think I found that this PR effectively limits regular variable-width vectors to 1 GiB of data in total. The behavior before this PR was definitely a bug, but now, whenever you try to add data beyond 1 GiB, the vector tries to double its capacity to the next power of two, 2147483648, which exceeds Integer.MAX_VALUE (2147483647) and therefore throws an OversizedAllocationException.
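A rough illustration of that arithmetic (assuming a round-up-to-the-next-power-of-two reallocation, which is what the doubling strategy amounts to; this is not Arrow's actual reallocation code):

// Hypothetical sketch of the doubling arithmetic described above.
public class DoublingLimitDemo {
  public static void main(String[] args) {
    long currentCapacity = 1L << 30;           // 1 GiB data buffer
    long neededCapacity = currentCapacity + 1; // one byte past 1 GiB

    // Round up to the next power of two, as a doubling reallocation would.
    long newCapacity = Long.highestOneBit(neededCapacity - 1) << 1;
    System.out.println(newCapacity);                     // 2147483648
    System.out.println(newCapacity > Integer.MAX_VALUE); // true -> OversizedAllocationException
  }
}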