
support for fault tolerance and straggler mitigation

Open youshaox opened this issue 2 years ago • 0 comments

Hi, I noticed that support for fault tolerance and straggler mitigation is listed in the future plans section. How is progress on this going?

Also, a related paper from your team, "Elastic Parameter Server Load Distribution in Deep Learning Clusters", says that its implementation is based on BytePS.

youshaox, Jul 04 '22 08:07