[deepspeed zero3] need `generate(synced_gpus=True, ...)`
As discussed in https://github.com/huggingface/transformers/issues/22231, `generate` under DeepSpeed ZeRO-3 may hang when different GPUs receive different input streams. It's documented here and in the API docs that `synced_gpus=True` is required in that case, but that requirement is easy to miss.
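For reference, this is roughly the call the docs currently ask multi-GPU ZeRO-3 users to make by hand (illustrative sketch only; `model` and `inputs` are placeholders):

```python
# Every rank must keep calling forward() until all ranks have finished
# generating; otherwise ranks that finish early stop joining the ZeRO-3
# parameter all-gathers and the remaining ranks hang.
# synced_gpus=True keeps finished ranks looping until everyone is done.
outputs = model.generate(**inputs, synced_gpus=True)
```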
So this PR will automatically turn this flag on under ZeRO Stage-3, so everything works out of the box, and it'll warn the user once for their awareness.
Fixes: https://github.com/huggingface/transformers/issues/22231
So a potential issue I see: DeepSpeed is potentially not enabled for all models, and this would enable `synced_gpus` even for models where it's not enabled? It depends on how `is_deepspeed_zero3_enabled` works, which you likely know better than me.
- For Accelerate and HF Trainer everything is done automatically for you.
- If you build your own trainer and follow the instructions it'll work as well (a minimal sketch of that path follows below).
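Roughly, the non-Trainer path from the docs looks like the sketch below. The key bit is instantiating `HfDeepSpeedConfig` before the model and keeping it alive, which is what makes `is_deepspeed_zero3_enabled()` return `True` outside of the HF Trainer. This is only a sketch: the config is abbreviated, the model name is arbitrary, and the exact fields you need depend on your setup.

```python
import deepspeed
from transformers import AutoModelForCausalLM, AutoTokenizer
from transformers.deepspeed import HfDeepSpeedConfig, is_deepspeed_zero3_enabled

# Abbreviated ZeRO Stage-3 config; a real one typically also sets dtype,
# batch sizes, offloading, etc. Launch the script with the deepspeed launcher.
ds_config = {
    "zero_optimization": {"stage": 3},
    "train_micro_batch_size_per_gpu": 1,
}

# Must be created *before* from_pretrained() and kept alive -- this is what
# makes is_deepspeed_zero3_enabled() return True without the HF Trainer.
dschf = HfDeepSpeedConfig(ds_config)

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
engine = deepspeed.initialize(model=model, config_params=ds_config)[0]

assert is_deepspeed_zero3_enabled()

inputs = tokenizer("Hello", return_tensors="pt").to(engine.device)
# With this PR, generate() would resolve synced_gpus to True on its own here.
outputs = engine.module.generate(**inputs, max_new_tokens=20)
```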
- Are you proposing:
```python
def generate(..., synced_gpus=None):
    [...]
    if synced_gpus is None:
        if is_deepspeed_zero3_enabled() and dist.get_world_size() > 1:
            synced_gpus = True
        else:
            synced_gpus = False
```
which would preserve backward compatibility w.r.t. the current `synced_gpus=False` in the function definition.
yes?
- and no warning needed then? or still keeping it?
- now the docs will be mismatched, so they'll need to be adapted to say that by default with multi-GPU it'll be set to `True`, but the user can choose to set it to `False` if they want to.
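e.g. the adapted docs could show something like this (illustrative only; `model` and `inputs` are placeholders):

```python
# under multi-GPU ZeRO-3 the dynamic default resolves synced_gpus to True
outputs = model.generate(**inputs)

# an explicitly passed value always overrides the dynamic default
outputs = model.generate(**inputs, synced_gpus=False)
```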
Yes, your code is exactly what I'm suggesting. I think it would be a better API since the user wouldn't have to look for warnings (no need for a warning indeed in this case) and would preserve backward compatibility as you mention.
That sounds good. Thank you for proposing it, Sylvain.
So no warning needed, right? As this logic is really about dynamic default setting and it'll be documented as such.
Yup!
Thank you for suggesting a more elegant solution than my initial one, Sylvain.
thanks folks, this is great