Junfan Zhang
Without this mechanism, how do we handle retries for multiple checkers? In our internal env, we use multiple checkers, including a health checker (needs to retry) and a customized checker (no...
> > If not having this mechanism, how to handle the multiple checkers retry?
>
> In our internal env, we will use the multiple checkers, including health checker(need to...
But with multiple checkers, the apps in the candidates list still need to be retried.
So do we need this feature? cc @jerqi
OK. Close it.
+1. Decommission can now be driven by an exclude-node file on the coordinator side. Besides, the exclude-node file could be stored in HDFS.
> Yarn node's decommission. https://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/GracefulDecommission.html

Yes. In #85, I follow the rules of the YARN decommission mechanism, so I think it's better to control decommission from the coordinator. Feel free to...
Glad to hear this. From the flame graph, the extra memory copies cost too much time on the shuffle server side. If we use Netty to directly manipulate shuffle data...
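To make the memory-copy concern concrete, here is a minimal sketch; `ZeroCopyDemo`, `transferViaKernel`, and the file names are all hypothetical, and it uses standard `java.nio` in place of Netty (Netty's `DefaultFileRegion` wraps the same `transferTo` syscall). The point is that the kernel can move file bytes to the target channel without first copying them into a user-space heap array:

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.channels.Channels;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class ZeroCopyDemo {

    // Serve file bytes to a channel without an intermediate heap-array copy:
    // FileChannel.transferTo asks the kernel to move the bytes directly,
    // which is the mechanism Netty's DefaultFileRegion builds on.
    static String transferViaKernel(byte[] payload) throws IOException {
        Path shuffleFile = Files.createTempFile("shuffle", ".data");
        try {
            Files.write(shuffleFile, payload);
            ByteArrayOutputStream sink = new ByteArrayOutputStream();
            try (FileChannel src = FileChannel.open(shuffleFile, StandardOpenOption.READ)) {
                long pos = 0;
                long size = src.size();
                // transferTo may move fewer bytes than requested, so loop.
                while (pos < size) {
                    pos += src.transferTo(pos, size - pos, Channels.newChannel(sink));
                }
            }
            return sink.toString("UTF-8");
        } finally {
            Files.delete(shuffleFile);
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(transferViaKernel("partition-0-bytes".getBytes()));
    }
}
```

With a plain `InputStream` read into a `byte[]` followed by a write, the same data crosses the user/kernel boundary twice; the sketch above avoids that extra round trip.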
So do we have a roadmap of version releases/features on GitHub? @jerqi
> Actually I think it brings extra complexity in order to avoid initializing a thread, it's a little worthless.

OK.