Do not init process group if already initialized

Open Yard1 opened this issue 2 years ago • 3 comments

What does this PR do?

Simple tweak to skip initialization of the torch process group if one is already initialized.

Before submitting

[ ] This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
[x] Did you read the contributor guideline, Pull Request section?
[ ] Was this discussed/approved via a Github issue or the forum? Please add a link to it if that's the case.
[ ] Did you make sure to update the documentation with your changes? Here are the documentation guidelines, and here are tips on formatting docstrings.
[ ] Did you write any new necessary tests?

Who can review?

@OlivierDehaene

May 31 '23 22:05 Yard1

In what setting is this called twice? In my oppinion this should crash.

Jun 01 '23 07:06 OlivierDehaene

If the model is ran in a process which already has the process group initialized by another library (eg. deepspeed, Ray), I don't think it should crash.

If you want to throw an exception by default, then perhaps this behavior could be put behind an env var, or made overridable in the model classes themselves?

From an architectural perspective, it doesn't seem right for the process group to be initialized in the model itself. The more common pattern is to initialize it separately (see eg. torchrun)

Jun 01 '23 18:06 Yard1

@OlivierDehaene can you let me know if we can proceed here or if you think this PR should be rejected? It would be helpful for my use case if we could merge this. Thanks.

Jun 21 '23 00:06 Yard1

text-generation-inference text-generation-inference copied to clipboard

Do not init process group if already initialized

What does this PR do?

Before submitting

Who can review?

text-generation-inference
text-generation-inference copied to clipboard