Wang, Yi
This should work with https://github.com/microsoft/DeepSpeed/pull/3035
@sgugger please help review
@yao-matrix
@sgugger I see code like "from transformers.models.bloom.modeling_bloom import build_alibi_tensor" in petals. If we make this a method, the petals code needs to be changed as well. This may happen to other...
@sgugger I have updated the PR.
@sgugger please help review
Hi @younesbelkada. When I use DeepSpeed ZeRO-3 and prompt tuning to fine-tune a large model, I find that training hangs after saving a checkpoint. Prompt tuning runs the prompt_encoder forward in the model's save_pretrained. only rank0...
Yes, I think should_save should be moved into _save, instead of controlling whether _save is called, like _save_tpu. WDYT?
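A minimal sketch of the idea, assuming the hang comes from a step that every rank has to join while only one rank reaches it. `SketchTrainer` and `_collective_forward` are hypothetical stand-ins for illustration, not the real Trainer code; only the names `should_save` and `_save` come from the discussion above.

```python
class SketchTrainer:
    """Hypothetical stand-in for a trainer; not the real transformers.Trainer."""

    def __init__(self, rank, should_save):
        self.rank = rank
        self.should_save = should_save
        self.log = []  # records what this rank did, for inspection

    def _collective_forward(self):
        # Stand-in for a step that all ranks must take part in
        # (assumed here: a forward pass over sharded parameters).
        self.log.append("forward")
        return {"prompt_embeddings": [0.0]}

    def _save(self, output_dir):
        # Every rank enters _save, so the shared step is never skipped.
        state = self._collective_forward()
        # Only the rank flagged by should_save actually writes.
        if self.should_save:
            self.log.append(f"write:{output_dir}")
        return state

    def save_model(self, output_dir):
        # No `if self.should_save:` guard here; the check lives inside _save.
        self._save(output_dir)


# Simulate two ranks in one process; only rank 0 is allowed to write.
trainers = [SketchTrainer(rank=r, should_save=(r == 0)) for r in range(2)]
for t in trainers:
    t.save_model("ckpt")
```

With the guard inside `_save`, both simulated ranks run the shared step and only rank 0 records a write. If the guard stayed around the call to `_save`, rank 1 would never reach the shared step.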
Any thoughts on my proposal? I think should_save should be moved into _save, instead of controlling whether _save is called, like _save_tpu. @pacman100
How about this? @pacman100