Bo Yang

Results 48 comments of Bo Yang

Nice fix, thanks @YutingWang98! @mabansal @mayurdb , would you help to merge the PR?

It may depend on how these metrics are calculated. Remote shuffle service does write some extra data for each shuffle record like task attempt id and partition id to track...

Hi @cpd85, this is due to a special case in our previous environment, where the behavior was unpredictable whenever a server restarted. To be safe, we do not want...

The unpredictable behavior was mostly related to the internal environment at that time, and is kind of hard to explain. It is better to redesign the server restart/recover feature, look forward to...

I have a fork that makes Remote Shuffle Service work on k8s. It also removes the dependence on ZooKeeper. The fork is here: https://github.com/datapunchorg/RemoteShuffleService/tree/k8s-spark-3.1

You need to put the Remote Shuffle Service client jar file inside the jars folder in the Spark image. You can download the Remote Shuffle Service client jar file from Maven: ``` org.datapunch...

Yes, "simply copy the jar and build the image" should work as well.

Previously RSS did not handle server restarts well, hence those checks. I feel we could remove them.

Would you try spark.shuffle.rss.replicas=2?
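
For anyone unsure where that setting goes, here is a minimal sketch of passing it on the `spark-submit` command line. Only `spark.shuffle.rss.replicas=2` comes from the comment above. The helper `build_conf_args` and the application name `your_app.py` are hypothetical, made up for illustration, and any other settings your job already uses would be added to the same dictionary.

```python
def build_conf_args(conf):
    """Turn a dict of Spark settings into spark-submit --conf arguments.

    Hypothetical helper for illustration; it is not part of Spark or RSS.
    """
    args = []
    for key, value in sorted(conf.items()):
        args += ["--conf", f"{key}={value}"]
    return args


# The only setting taken from the comment above.
rss_conf = {"spark.shuffle.rss.replicas": "2"}

# "your_app.py" is a placeholder application name.
command = ["spark-submit"] + build_conf_args(rss_conf) + ["your_app.py"]
print(" ".join(command))
```

The same key and value could instead go in `spark-defaults.conf` or be set on the `SparkConf` in the application.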