getao
Thanks @jsenellart. I'll have a look at the hook mechanism.
Hi, (1) you can try `fairseq-interactive` to use the model in interactive mode (see the sketch below). (2) Yes, but you need to fine-tune the model for XSum. The pretrained weights cannot be...
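If your checkpoint is a fairseq BART-style model (my assumption here, given the XSum context), the fairseq hub API mirrors from Python what `fairseq-interactive` does on the command line; a minimal sketch with placeholder paths:

```
from fairseq.models.bart import BARTModel

# placeholder paths: point these at your own checkpoint and binarized data
bart = BARTModel.from_pretrained(
    "checkpoints/",
    checkpoint_file="model.pt",
    data_name_or_path="xsum-bin/",
)
bart.eval()

# beam-search decode a raw input line, as fairseq-interactive would
print(bart.sample(["Some source document to summarize ..."], beam=5))
```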
Thanks for your interest. We're working on open-sourcing the ONNX inference with an int8-quantized model. We'll release the code soon.
> Hi there! It seems like you are encountering a warning related to the `use_reentrant` parameter when training your model with deepspeed and transformers. The warning is advising you to...
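In case it helps others who land here: a minimal sketch of passing `use_reentrant` explicitly through the Trainer, assuming a transformers version recent enough to support `gradient_checkpointing_kwargs` (4.35+):

```
from transformers import TrainingArguments

# "out" is a placeholder output directory
training_args = TrainingArguments(
    output_dir="out",
    gradient_checkpointing=True,
    # choose a checkpointing variant explicitly to silence the torch warning;
    # use_reentrant=False is the implementation PyTorch recommends going forward
    gradient_checkpointing_kwargs={"use_reentrant": False},
)
```

Equivalently, you can call `model.gradient_checkpointing_enable(gradient_checkpointing_kwargs={"use_reentrant": False})` on the model directly.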
> I couldn't recreate this warning. Can you try either:
>
> 1. Using the dev version of transformers: `pip install git+https://github.com/huggingface/transformers`
> 2. Upgrading your PyTorch version: `pip install...`
Sure.

```
import transformers

def main():
    parser = transformers.HfArgumentParser((ModelArguments, DataArguments, TrainingArguments))
    model_args, data_args, training_args = parser.parse_args_into_dataclasses()
    data_prefix = data_args.data_path
    train_file = f"{data_prefix}.train.json"  # text data for language modeling (the next word prediction...
```
> @vtien @getao Do you get the CheckpointError with `use_flash_attention_2=False`?

I didn't try `use_flash_attention_2=False`.
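For context, this flag is passed at model load time; a minimal sketch with a placeholder model id (newer transformers versions spell it `attn_implementation="flash_attention_2"` instead):

```
import torch
from transformers import AutoModelForCausalLM

# "my-org/my-model" is a placeholder model id
model = AutoModelForCausalLM.from_pretrained(
    "my-org/my-model",
    torch_dtype=torch.bfloat16,  # flash attention 2 requires fp16/bf16 weights
    use_flash_attention_2=True,  # set to False to fall back to the default attention
)
```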
> Hi @getao - if you build with `DS_BUILD_FUSED_ADAM=1 pip install deepspeed`, do you get the same error?

Yes, I built with `DS_BUILD_FUSED_ADAM=1`. BTW, I don't get the error...
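For reference, a quick sanity check for whether the fused Adam op is available on a given setup, assuming a reasonably recent DeepSpeed (running `ds_report` from the command line gives a similar summary):

```
# probe DeepSpeed's fused Adam op without launching a training run
from deepspeed.ops.op_builder import FusedAdamBuilder

builder = FusedAdamBuilder()
print(builder.is_compatible())  # can the op be built on this system?
builder.load()                  # JIT-builds (or loads the prebuilt) extension
```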