Ditto P S


I have tried ZeRO-3, but the RAM is getting overloaded because of the parameter offloading. I currently have 420 GB of RAM and 4× A100 80 GB GPUs.

Any thoughts on how many GPUs might be needed? Also, I'm not seeing the current 4 GPUs getting filled with ZeRO-3; instead it's consuming host RAM.
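
In case it's useful, here is a minimal sketch of the ZeRO-3 settings I'm experimenting with to keep parameters on the GPUs instead of host RAM. The batch sizes, dtype, and file name are placeholders, not my exact config:

```python
# Minimal ZeRO-3 config sketch: setting offload_param / offload_optimizer
# to "none" keeps parameters and optimizer state in GPU memory instead of
# offloading them to host RAM. Values below are placeholders.
import json

ds_config = {
    "train_micro_batch_size_per_gpu": 2,
    "gradient_accumulation_steps": 1,
    "bf16": {"enabled": True},
    "zero_optimization": {
        "stage": 3,
        "offload_param": {"device": "none"},      # "cpu" is what fills up RAM
        "offload_optimizer": {"device": "none"},
    },
}

# Written out as JSON, this is the file the launcher/Trainer would consume.
with open("ds_config_zero3.json", "w") as f:
    json.dump(ds_config, f, indent=2)
```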

I'm not calling that function in my script. I was following the example here to enable flash attention: https://github.com/huggingface/optimum-habana/blob/main/examples/language-modeling/run_lora_clm.py

Here is my train script:

```
import pickle
import os
from...
```
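
For context, the model-loading step looks roughly like this. `attn_implementation="flash_attention_2"` is the standard transformers flag for requesting flash attention; whether the Gaudi backend maps it to its fused kernel is my assumption here:

```python
# Hedged sketch of the loading step (not my full script).
# Assumes the Gaudi stack honours the standard transformers flag.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained(
    "budecosystem/boomer-1b",
    torch_dtype=torch.bfloat16,
    attn_implementation="flash_attention_2",  # request flash attention
)
tokenizer = AutoTokenizer.from_pretrained("budecosystem/boomer-1b")
```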

Here is the command:

```
deepspeed train-gaudi.py \
  --base_model budecosystem/boomer-1b \
  --output_dir output/boomer \
  --data_path roneneldan/TinyStories \
  --learning_rate 1e-3 \
  --num_train_epochs 1 \
  --per_device_train_batch_size 2 \
  --gradient_accumulation_steps 1 \
  --lr_scheduler_type cosine \
  --warmup_ratio 0.1 \
  --report_to wandb \
  --logging_steps 10 \
  --save_strategy...
```

Thanks for your support. For some reason, I'm able to run the script without any issues now. I have another question: does this flash attention have the same effect as...

I tried with the latest code from the main branch, but I'm still getting the same issue.

@ArthurZucker I have the Meta weights and tokenizer; the issue I shared is with those. For sentencepiece, is there a specific version to be used?
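
For reference, this is the minimal check I'm running on my side; the tokenizer path below is just a placeholder for wherever the Meta `tokenizer.model` file lives:

```python
# Minimal repro sketch: load the Meta tokenizer.model directly with
# sentencepiece, to separate sentencepiece problems from transformers ones.
# The path is a placeholder, not my actual layout.
import sentencepiece as spm

print("sentencepiece version:", spm.__version__)

sp = spm.SentencePieceProcessor(model_file="llama/tokenizer.model")
print(sp.encode("Hello world", out_type=str))
```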