
make_distribute_tutorial_work_in_google_colab

Open venkatram-dev opened this issue 1 year ago • 2 comments

Fixes #ISSUE_NUMBER

https://github.com/pytorch/tutorials/issues/3003 https://github.com/pytorch/tutorials/issues/3009

Description

For issue 3003:

mp.set_start_method("spawn") works locally (macOS), but it does not work well in Google Colab, which has some restrictions. The code snippet below was added to handle both environments.

    if "google.colab" in sys.modules:
        print("Running in Google Colab")
        mp.get_context("spawn")
    else:
        mp.set_start_method("spawn")

Please note that mp.set_start_method("fork") also works in Google Colab, but only if the code is run once; rerunning it fails. mp.get_context("spawn") allows multiple reruns without restarting the session.
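To illustrate the rerun behaviour described above, here is a minimal sketch using the standard-library multiprocessing module. It assumes torch.multiprocessing behaves the same way for these two calls, and the helper name try_set_start_method is hypothetical, not part of the PR. It does not reproduce the Colab restrictions themselves; it only shows why a second set_start_method call fails while get_context can be repeated.

```python
import multiprocessing as mp


def try_set_start_method(method="spawn"):
    """Hypothetical helper: True if the global start method was set just now,
    False if it had already been fixed for this interpreter."""
    try:
        mp.set_start_method(method)
        return True
    except RuntimeError:
        # Raised when the start method was already set in this interpreter,
        # which is the situation when a notebook cell is rerun.
        return False


first_call = try_set_start_method("spawn")
second_call = try_set_start_method("spawn")  # simulates rerunning the cell

# get_context() returns a context object and leaves the global setting
# untouched, so it can be called any number of times.
ctx_a = mp.get_context("spawn")
ctx_b = mp.get_context("spawn")

print("second set_start_method call succeeded:", second_call)
print("start method of returned context:", ctx_a.get_start_method())
```

Since get_context() only returns a context, processes have to be created through that object (for example ctx_a.Process(...)) for its start method to apply to them.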

Also added a clarification for issue 3009:

`-  reading from ``tensor`` after ``dist.irecv()`` will result in undefined behaviour,
until ``req.wait()`` has been executed.`

Since both changes are small and in the same file, I am addressing both issues together. I am happy to split this into two PRs if needed.

Checklist

  • [x] The issue that is being fixed is referred in the description (see above "Fixes #ISSUE_NUMBER")
  • [ ] Only one issue is addressed in this pull request
  • [ ] Labels from the issue that this PR is fixing are added to this pull request
  • [ ] No unnecessary issues are included into this pull request.

cc @wconstab @osalpekar @H-Huang @kwen2501

venkatram-dev avatar Aug 31 '24 22:08 venkatram-dev

:link: Helpful Links

:test_tube: See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/tutorials/3022

Note: Links to docs will display an error until the docs builds have been completed.

:white_check_mark: No Failures

As of commit c7f5a9d9fbe2d4bea64067ca488921c20514ba1b with merge base 01eeee67407d938c246dbd027a135408881bcff3 (image): :green_heart: Looks good so far! There are no failures yet. :green_heart:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

pytorch-bot[bot] avatar Aug 31 '24 22:08 pytorch-bot[bot]

@svekars, please review.

Note: Since both changes are small and in the same file, I am addressing both issues together. I am happy to split this into two PRs if needed.

venkatram-dev avatar Aug 31 '24 23:08 venkatram-dev

Looks like this PR hasn't been updated in a while so we're going to go ahead and mark this as stale.
Feel free to remove the stale label if you feel this was a mistake.
If you are unable to remove the stale label please contact a maintainer in order to do so.
If you want the bot to never mark this PR stale again, add the no-stale label.
stale pull requests will automatically be closed after 30 days of inactivity.

github-actions[bot] avatar Dec 31 '24 00:12 github-actions[bot]

@venkatram-dev - do you want to resolve the conflicts and land this PR?

c-p-i-o avatar Jan 11 '25 02:01 c-p-i-o