Junfan Zhang

434 comments of Junfan Zhang

In the case of partition reassignment, this will block the RPC response; it should be handled by a thread pool instead.
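
A minimal sketch of the idea, in case it helps: the RPC handler submits the reassignment work to a dedicated pool and returns immediately instead of blocking the response. The class and method names here are hypothetical, not Uniffle's actual API.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ReassignHandler {
    // Dedicated pool so slow partition-reassignment work never blocks the RPC thread.
    private static final ExecutorService REASSIGN_POOL = Executors.newFixedThreadPool(4);

    // Called from the RPC handler: submit the heavy work and return immediately,
    // letting the RPC response go back to the client without waiting.
    public static Future<?> handleReassign(Runnable reassignTask) {
        return REASSIGN_POOL.submit(reassignTask);
    }

    public static void main(String[] args) throws Exception {
        Future<?> f = handleReassign(() -> System.out.println("reassigning partitions asynchronously"));
        f.get(); // the real RPC thread would NOT wait; this is only to keep the demo deterministic
        REASSIGN_POOL.shutdown();
    }
}
```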

When using the netty mode, the RPC layer's bytebuf is `io.netty.buffer.CompositeByteBuf`; when invoking `bytebuf.nioBuffer()`, it will be converted into a heap buffer.
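
The heap-vs-direct distinction can be verified with `ByteBuffer.isDirect()`. A self-contained sketch using plain NIO (not Netty, so it does not demonstrate the `CompositeByteBuf.nioBuffer` conversion itself, only how to check which kind of buffer you ended up with):

```java
import java.nio.ByteBuffer;

public class BufferKindDemo {
    public static void main(String[] args) {
        ByteBuffer direct = ByteBuffer.allocateDirect(16); // off-heap memory
        ByteBuffer heap = ByteBuffer.allocate(16);         // on-heap byte[] backed
        // isDirect() is the quick way to confirm whether a conversion
        // (e.g. from a composite buffer) silently landed on the heap.
        System.out.println(direct.isDirect());
        System.out.println(heap.isDirect());
    }
}
```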

> Hi [@zuston](https://github.com/zuston)
>
> Do we need to add a new configuration option for a retry logic with backoff and assign it to `GrpcClient`?
>
> https://github.com/apache/uniffle/blob/1e48bc673d1c0ee41f889a0de6192b0fab131467/common/src/main/java/org/apache/uniffle/common/config/RssClientConf.java

Yes. This...
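
A hypothetical sketch of the retry-with-backoff logic under discussion; the method name and the configuration parameters (`maxRetries`, `baseBackoffMs`) are illustrative, not actual `RssClientConf` keys:

```java
import java.util.concurrent.Callable;

public class RetryUtils {
    // Retry a call up to maxRetries times (maxRetries >= 1),
    // doubling the backoff after each failure: base, 2*base, 4*base, ...
    public static <T> T retryWithBackoff(Callable<T> call, int maxRetries, long baseBackoffMs)
            throws Exception {
        Exception last = null;
        for (int attempt = 0; attempt < maxRetries; attempt++) {
            try {
                return call.call();
            } catch (Exception e) {
                last = e;
                long backoff = baseBackoffMs * (1L << attempt);
                Thread.sleep(backoff);
            }
        }
        throw last; // all attempts exhausted
    }
}
```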

If we disable the multi replicas of shuffle-data, we should strictly check the processed blockIds. WDYT? @jerqi
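
A sketch of what such a strict check could look like: with a single replica there is no fallback copy, so the processed blockIds must match the expected set exactly. All names here are hypothetical, not Uniffle's actual code.

```java
import java.util.HashSet;
import java.util.Set;

public class BlockIdChecker {
    // With a single replica, every expected block must have been processed,
    // and no unexpected block may appear; anything else means data loss or duplication.
    public static void strictCheck(Set<Long> expected, Set<Long> processed) {
        Set<Long> missing = new HashSet<>(expected);
        missing.removeAll(processed);
        Set<Long> unexpected = new HashSet<>(processed);
        unexpected.removeAll(expected);
        if (!missing.isEmpty() || !unexpected.isEmpty()) {
            throw new IllegalStateException(
                "Inconsistent blockIds. missing=" + missing + ", unexpected=" + unexpected);
        }
    }
}
```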

> > If we disable the multi replicas of shuffle-data, we should strict check the processed blockIds. WDYT? @jerqi
>
> Does it cover the speculation execution and AQE?

I...

I have implemented this on the riffle side: https://github.com/zuston/riffle/pull/532

cc @chaokunyang. If you have time, could you help review this integration with Fory? So far, this implementation hasn't shown significant improvements. I would greatly appreciate any guidance you...

Big thanks for your quick and patient review, @chaokunyang.

> Shuffle data should already be binary, is there anything that needs being serialized?

If using the vanilla Spark, record is...

> Only if you are using spark rdd with raw java objects, there will be serialization bottleneck. Such cases are similiar to datastream in flink.

We've observed several times of...

> Data record in Spark SQL are alreay binary, there is no serialization happened. I suggest benchmark first before optimizing.

It seems that serialization is still happening: https://github.com/apache/spark/blob/2de0248071035aa94818386c2402169f6670d2d4/core/src/main/scala/org/apache/spark/shuffle/ShuffleWriteProcessor.scala#L57 The product2...