Stas Bekman
Thank you for the report, @noob-ctrl. Please let me know if this fix works for you: https://github.com/huggingface/transformers/pull/22193
Thank you for testing, @noob-ctrl - the PR has been merged.
Thank you for the full traceback, @ksopyla. Now it's easy to help you. Please try again with the latest version of transformers. You can see here that this situation has...
Ah, OK, thank you for clarifying the situation - that's even simpler then. Just upgrade transformers, change nothing in your setup, and it should just work. The original code just...
It's best to discuss a new problem in a new Issue, but if we can wrap it up quickly: it's absolutely normal that the speed will progressively drop as you...
An explanation is needed here. The DeepSpeed team had to invent their own tensor substitute because, two years ago, nothing of the kind existed in PyTorch. They had to replace...
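To make this concrete, here is a minimal sketch of what that substitute looks like from the outside under ZeRO-3. `deepspeed.zero.GatheredParameters` is the public way to materialize a partitioned parameter; the `ds_*` attributes are DeepSpeed internals and may differ between versions:

```python
import deepspeed
import torch

def show_param(param: torch.nn.Parameter):
    # Under ZeRO-3 each parameter is replaced by a placeholder whose local
    # storage is empty (each rank holds only its shard); the real shape is
    # kept in internal metadata such as param.ds_shape.
    print("local numel:", param.numel())
    if hasattr(param, "ds_shape"):
        print("true shape :", param.ds_shape)

    # To operate on the full tensor, all ranks must gather it first.
    with deepspeed.zero.GatheredParameters(param):
        print("gathered   :", tuple(param.shape))  # full, un-partitioned
```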
Totally. Thank you for bringing it up, @JulesGM. The API for checking for this situation is already available and is already used in the HF Trainer: https://github.com/huggingface/transformers/blob/bec075612a293a66022937f65ba0c0df25224d29/src/transformers/trainer_seq2seq.py#L180-L188 For DIY integration we...
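For example, a DIY generation wrapper could reuse the same helper as the Trainer does. A sketch, assuming `is_deepspeed_zero3_enabled` is importable from `transformers.deepspeed` (its home at the time of writing):

```python
import torch.distributed as dist
from transformers.deepspeed import is_deepspeed_zero3_enabled

def generate_with_sync(model, inputs, **gen_kwargs):
    # Under ZeRO-3 every rank holds only a shard of each weight, so all
    # ranks must keep stepping through forward passes together; a rank
    # that finishes generating early would deadlock the others.
    synced = (
        is_deepspeed_zero3_enabled()
        and dist.is_initialized()
        and dist.get_world_size() > 1
    )
    return model.generate(**inputs, synced_gpus=synced, **gen_kwargs)
```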
Thank you for letting me know your preference. Please try this PR and let me know if it solves the problem for you, @JulesGM: https://github.com/huggingface/transformers/pull/22242 I decided to just set...
- For Accelerate and HF Trainer everything is done automatically for you.
- If you build your own trainer and follow [the instructions](https://huggingface.co/docs/transformers/main/main_classes/deepspeed#nontrainer-deepspeed-integration) it'll work as well (see the sketch after this list).
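A condensed sketch of the linked non-Trainer recipe. The config values here are placeholders; the one crucial ordering constraint is that `HfDeepSpeedConfig` must be created, and kept alive, before `from_pretrained`, so the model is loaded directly in ZeRO-3 partitioned form:

```python
import deepspeed
from transformers import AutoModelForSeq2SeqLM
from transformers.deepspeed import HfDeepSpeedConfig

ds_config = {
    "train_micro_batch_size_per_gpu": 1,
    "zero_optimization": {"stage": 3},
}

# Must be alive during model loading: this object tells from_pretrained to
# place weights straight into ZeRO-3 partitions instead of materializing
# the full model on each rank.
dschf = HfDeepSpeedConfig(ds_config)

model = AutoModelForSeq2SeqLM.from_pretrained("t5-small")
engine = deepspeed.initialize(model=model, config_params=ds_config)[0]
engine.module.eval()  # engine.module is the wrapped model
```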
1. Are you proposing:
```python
def generate(..., synced_gpus=None):
    [...]
    if synced_gpus is None:
        if is_deepspeed_zero3_enabled() and dist.get_world_size() > 1:
            synced_gpus = True
        else:
            synced_gpus = False
```
which would preserve...