incubator-uniffle

[Bug] Deserialization fails with EOFException during Shuffle Read

Open yl09099 opened this issue 7 months ago • 2 comments

Code of Conduct

Search before asking

  • [x] I have searched in the issues and found no similar issues.

Describe the bug

During Shuffle Read, deserialization fails with java.io.EOFException: reached end of stream after reading 19336 bytes; 576807796 bytes expected. The failure is intermittent and does not occur frequently.


java.io.EOFException: reached end of stream after reading 19336 bytes; 576807796 bytes expected
    at org.sparkproject.guava.io.ByteStreams.readFully(ByteStreams.java:735)
    at org.apache.spark.sql.execution.UnsafeRowSerializerInstance$$anon$2$$anon$3.next(UnsafeRowSerializer.scala:127)
    at org.apache.spark.sql.execution.UnsafeRowSerializerInstance$$anon$2$$anon$3.next(UnsafeRowSerializer.scala:110)
    at org.apache.spark.shuffle.reader.RssShuffleDataIterator.next(RssShuffleDataIterator.java:150)
    at org.apache.spark.shuffle.reader.RssShuffleReader$MultiPartitionIterator.next(RssShuffleReader.java:261)
    at org.apache.spark.shuffle.reader.RssShuffleReader$MultiPartitionIterator.next(RssShuffleReader.java:215)
    at org.apache.spark.InterruptibleIterator.next(InterruptibleIterator.scala:40)
    at scala.collection.Iterator$$anon$10.next(Iterator.scala:461)
    at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage23.sort_addToSorter_0$(Unknown Source)
    at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage23.processNext(Unknown Source)
    at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
    at org.apache.spark.sql.execution.WholeStageCodegenExec$$anon$1.hasNext(WholeStageCodegenExec.scala:759)
    at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage25.processNext(Unknown Source)
    at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
    at org.apache.spark.sql.execution.WholeStageCodegenExec$$anon$2.hasNext(WholeStageCodegenExec.scala:778)
    at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:460)
    at org.apache.spark.shuffle.writer.RssShuffleWriter.writeImpl(RssShuffleWriter.java:170)
    at org.apache.spark.shuffle.writer.RssShuffleWriter.write(RssShuffleWriter.java:157)
    at org.apache.spark.shuffle.ShuffleWriteProcessor.write(ShuffleWriteProcessor.scala:59)
    at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:99)
    at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:52)
    at org.apache.spark.scheduler.Task.run(Task.scala:131)
    at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:506)
    at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1463)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:509)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
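One possible way to read the trace above, offered as a diagnostic sketch rather than a confirmed root cause: the readFully frame suggests the deserializer first reads a row size and then tries to read that many bytes. Assuming that size is a 4-byte big-endian int (as DataInputStream.readInt would produce), the reported value 576807796 is 0x22616374, and those four bytes are the ASCII characters "act (a double quote followed by act). That would be consistent with the reader interpreting payload bytes as a length prefix, for example because the stream position was misaligned or the block was corrupted. The class name LengthPrefixCheck and the sample byte string are hypothetical and exist only for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of Spark or Uniffle.
public class LengthPrefixCheck {

    // Render the four big-endian bytes of an int as ASCII text.
    static String asAscii(int value) {
        byte[] bytes = ByteBuffer.allocate(4).putInt(value).array();
        return new String(bytes, StandardCharsets.US_ASCII);
    }

    // Read the first four bytes of a stream as a big-endian int,
    // the way a length-prefixed reader would before reading a record.
    static int readLengthPrefix(byte[] stream) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(stream))) {
            return in.readInt();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        int expected = 576807796; // the "bytes expected" value from the exception

        System.out.println(Integer.toHexString(expected)); // 22616374
        System.out.println(asAscii(expected));             // "act

        // Hypothetical misaligned input: the reader lands on text, not on a size field.
        byte[] misaligned = "\"act-sample-payload".getBytes(StandardCharsets.US_ASCII);
        System.out.println(readLengthPrefix(misaligned) == expected); // true
    }
}
```

If this reading is right, comparing the block boundaries and lengths the reader received against what the writer sent for the affected partition may help narrow down where the offset drifted. The sketch itself only shows that the number decodes to text; it does not identify which component produced the bad bytes.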

Affects Version(s)

0.11.0

Uniffle Server Log Output


Uniffle Engine Log Output


Uniffle Server Configurations


Uniffle Engine Configurations


Additional context

No response

Are you willing to submit PR?

  • [ ] Yes I am willing to submit a PR!

yl09099 • Sep 24 '25 10:09

What type of RPC is being used?

zuston • Sep 26 '25 02:09

> What type of RPC is being used?

GRPC

yl09099 • Sep 26 '25 02:09