Wang, Yi

Results: 69 comments by Wang, Yi

should work with https://github.com/microsoft/DeepSpeed/pull/3035

@sgugger I see code like "from transformers.models.bloom.modeling_bloom import build_alibi_tensor" in petals. If we make this a method, the petals code will need to be changed as well. This may happen to other...

Hi @younesbelkada. When I use DeepSpeed ZeRO-3 with prompt tuning to fine-tune a large model, training hangs after saving a checkpoint. Prompt tuning runs the prompt_encoder forward pass inside the model's save_pretrained. only rank0...

Yes, I think the should_save check should be moved into _save, the way _save_tpu handles it, instead of controlling whether _save is called. WDYT?

Any thoughts on my proposal? I think the should_save check should be moved into _save, the way _save_tpu handles it, instead of controlling whether _save is called. @pacman100
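
A minimal sketch of the idea, for illustration only. ToyTrainer, gather_state, and the log list are hypothetical names, not the transformers Trainer code. The assumption is that saving includes a step every rank must join, so the rank check sits inside _save, after that step, and save_model calls _save on all ranks.

```python
# Toy sketch: ToyTrainer and gather_state are hypothetical stand-ins,
# not the transformers Trainer implementation.
class ToyTrainer:
    def __init__(self, rank, should_save):
        self.rank = rank
        self.should_save = should_save
        self.log = []  # records which steps this rank executed

    def gather_state(self):
        # Stand-in for work every rank must join (for example, a forward
        # pass over sharded parameters). Skipping it on some ranks is what
        # would leave the others waiting.
        self.log.append("gather")
        return {"weight": 1.0}

    def _save(self, output_dir):
        # Every rank takes part in the shared step first.
        state = self.gather_state()
        # Only the rank with should_save set writes anything.
        if not self.should_save:
            return None
        self.log.append("write")
        return (output_dir, state)

    def save_model(self, output_dir):
        # No should_save guard here: _save runs on all ranks.
        return self._save(output_dir)


# Simulate two ranks in one process; only rank 0 is allowed to write.
trainers = [ToyTrainer(rank=r, should_save=(r == 0)) for r in range(2)]
results = [t.save_model("ckpt") for t in trainers]
```

With the guard inside _save, both ranks reach gather_state and only rank 0 performs the write, so no rank is left waiting on a step the others skipped.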