getao
Thanks @jsenellart. I'll have a look at the hook mechanism.
Hi, (1) you can try `fairseq-interactive` to use the model in interactive mode (see the sketch below). (2) Yes, but you need to fine-tune the model for XSum. The pretrained weights cannot be...
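If your checkpoint is a fairseq BART-style model (my assumption here, given the XSum context), the fairseq hub API mirrors from Python what `fairseq-interactive` does on the command line; a minimal sketch with placeholder paths:

```
from fairseq.models.bart import BARTModel

# placeholder paths: point these at your own checkpoint and binarized data
bart = BARTModel.from_pretrained(
    "checkpoints/",
    checkpoint_file="model.pt",
    data_name_or_path="xsum-bin/",
)
bart.eval()

# beam-search decode a raw input line, as fairseq-interactive would
print(bart.sample(["Some source document to summarize ..."], beam=5))
```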
Thanks for your interest. We're working on open-sourcing the ONNX inference with an int8-quantized model. We'll release the code soon.
> Hi there! It seems like you are encountering a warning related to the `use_reentrant` parameter when training your model with deepspeed and transformers. The warning is advising you to...
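In case it helps others who land here: a minimal sketch of passing `use_reentrant` explicitly through the Trainer, assuming a transformers version recent enough to support `gradient_checkpointing_kwargs` (4.35+):

```
from transformers import TrainingArguments

# "out" is a placeholder output directory
training_args = TrainingArguments(
    output_dir="out",
    gradient_checkpointing=True,
    # choose a checkpointing variant explicitly to silence the torch warning;
    # use_reentrant=False is the implementation PyTorch recommends going forward
    gradient_checkpointing_kwargs={"use_reentrant": False},
)
```

Equivalently, you can call `model.gradient_checkpointing_enable(gradient_checkpointing_kwargs={"use_reentrant": False})` on the model directly.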
> I couldn't recreate this warning. Can you try either:
>
> 1. Using the dev version of transformers: `pip install git+https://github.com/huggingface/transformers`
> 2. Upgrading your PyTorch version: `pip install...`
Sure.

```
import transformers

def main():
    parser = transformers.HfArgumentParser((ModelArguments, DataArguments, TrainingArguments))
    model_args, data_args, training_args = parser.parse_args_into_dataclasses()
    data_prefix = data_args.data_path
    train_file = f"{data_prefix}.train.json"  # text data for language modeling (the next word prediction...
```
> @vtien @getao Do you get the CheckpointError with `use_flash_attention_2=False`?

I didn't try `use_flash_attention_2=False`.
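For context, this flag is passed at model load time; a minimal sketch with a placeholder model id (newer transformers versions spell it `attn_implementation="flash_attention_2"` instead):

```
import torch
from transformers import AutoModelForCausalLM

# "my-org/my-model" is a placeholder model id
model = AutoModelForCausalLM.from_pretrained(
    "my-org/my-model",
    torch_dtype=torch.bfloat16,  # flash attention 2 requires fp16/bf16 weights
    use_flash_attention_2=True,  # set to False to fall back to the default attention
)
```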
> Hi @getao - if you build with `DS_BUILD_FUSED_ADAM=1 pip install deepspeed`, do you get the same error?

Yes, I built with `DS_BUILD_FUSED_ADAM=1`. BTW, I don't get the error...
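For reference, a quick sanity check for whether the fused Adam op is available on a given setup, assuming a reasonably recent DeepSpeed (running `ds_report` from the command line gives a similar summary):

```
# probe DeepSpeed's fused Adam op without launching a training run
from deepspeed.ops.op_builder import FusedAdamBuilder

builder = FusedAdamBuilder()
print(builder.is_compatible())  # can the op be built on this system?
builder.load()                  # JIT-builds (or loads the prebuilt) extension
```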