
TensorFlow Straggler Mitigation by Speculative Execution

Open: zhangpengshan opened this issue 5 years ago • 1 comment

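In each iteration, collect per-worker statistics and check for slow workers, for example by computing the standard deviation of the workers' iteration times. If a worker is an outlier, its task could also be run speculatively on a standby worker taken from a backup pool.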
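The check described above could look roughly like the following. This is only a minimal sketch, not code from shifu or TensorFlow: the function name `find_stragglers`, the `step_times` mapping, and the threshold factor `k` are all hypothetical, and it assumes each worker reports one wall-clock time per iteration.

```python
from statistics import mean, pstdev


def find_stragglers(step_times, k=1.5):
    """Return ids of workers whose step time exceeds mean + k * stddev.

    step_times: hypothetical mapping of worker id -> seconds spent on the
    current iteration. k is an assumed tuning knob, not a known default.
    """
    times = list(step_times.values())
    mu = mean(times)
    sigma = pstdev(times)  # population standard deviation over all workers
    if sigma == 0:
        return []  # all workers took the same time, so nobody is an outlier
    threshold = mu + k * sigma
    return sorted(w for w, t in step_times.items() if t > threshold)


if __name__ == "__main__":
    times = {"w0": 1.0, "w1": 1.1, "w2": 0.9, "w3": 1.0, "w4": 5.0}
    # Workers returned here would be candidates for a speculative copy
    # on a standby worker from the backup pool.
    print(find_stragglers(times))
```

One caveat with a plain standard-deviation test: with only a handful of workers, a single very slow worker inflates the standard deviation itself, so a large `k` may never fire. A median-based threshold might be more robust in that case.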

zhangpengshan • May 31 '19 09:05

There is no need to do that. Backup workers are already implemented in TF: in each iteration, only the updates from the fastest N workers are used, and those from the slowest C stragglers are dropped.
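To illustrate the "fastest N of N + C" idea, here is a small simulation in plain Python. It does not call TensorFlow; the function `aggregate_fastest` and the tuple layout are made up for this example. In TF 1.x, the comment most likely refers to `tf.train.SyncReplicasOptimizer`, where setting `total_num_replicas` larger than `replicas_to_aggregate` makes the extra replicas act as backups, but that mapping is an assumption about what was meant.

```python
def aggregate_fastest(updates, n):
    """Average the gradients of the n earliest-arriving workers.

    updates: list of (arrival_time, worker_id, gradient) tuples, one per
    worker, with scalar gradients to keep the example small (hypothetical
    layout). Returns (averaged_gradient, ids_of_dropped_workers).
    """
    if n <= 0 or n > len(updates):
        raise ValueError("n must be between 1 and the number of workers")
    ordered = sorted(updates, key=lambda u: u[0])  # earliest arrival first
    kept = ordered[:n]
    dropped = [worker for _, worker, _ in ordered[n:]]
    averaged = sum(grad for _, _, grad in kept) / n
    return averaged, dropped


if __name__ == "__main__":
    # N = 3 workers are aggregated, C = 1 backup; w2 is the straggler.
    updates = [
        (0.9, "w0", 1.0),
        (1.1, "w1", 3.0),
        (4.0, "w2", 100.0),
        (1.0, "w3", 2.0),
    ]
    print(aggregate_fastest(updates, n=3))
```

With this scheme the iteration time is bounded by the N-th fastest worker rather than the slowest one, at the cost of discarding the work done by the C stragglers.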

Mrhs121 • Jul 03 '20 12:07