[examples/imagenet/main.py] Why doesn't the elastic code contain a GPU sync (e.g. all_reduce) to compute performance?

Open aitrics-chris opened this issue 3 years ago • 0 comments

❓ Questions and Help

Question

The elastic ImageNet example (examples/imagenet/main.py) does NOT call all_reduce when it computes performance metrics. Please compare it with the other ImageNet example. How does the elastic version gather values from all GPUs?
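
For context, the kind of cross-GPU synchronization this question is about might look like the sketch below. It is only an illustration and is not taken from either example: `global_accuracy` is a hypothetical helper, and it assumes each rank has already counted its own correct predictions and samples. It sums both counters across ranks with `torch.distributed.all_reduce`, and falls back to the local counts when no process group has been initialized.

```python
import torch
import torch.distributed as dist


def global_accuracy(num_correct: int, num_samples: int, device: str = "cpu") -> float:
    """Accuracy over all ranks (hypothetical helper, not part of the example)."""
    # Pack both counters into one tensor so a single collective call is enough.
    counts = torch.tensor(
        [num_correct, num_samples], dtype=torch.float64, device=device
    )
    # Synchronize only when a process group exists; otherwise keep local counts.
    if dist.is_available() and dist.is_initialized():
        dist.all_reduce(counts, op=dist.ReduceOp.SUM)
    correct, total = counts.tolist()
    return correct / total if total > 0 else 0.0
```

Without a step like this, each process would report a metric computed only from its own shard of the validation data. With the NCCL backend, `device` would need to be the rank's CUDA device.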

aitrics-chris · Aug 22 '22 08:08