alpaca-lora
Finetuning is so slow on bloomz-7b1
Hi,
Are there any configurations for models other than llama? When I first tried to run the finetune script for the bloomz-7b1 model, I got this error:
ValueError: Target modules ['q_proj', 'v_proj'] not found in the base model. Please check the target modules and try again.
Then I changed TARGET_MODULES to:
TARGET_MODULES = [
    "query_key_value",
]
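(For context, this change corresponds to a PEFT LoRA config like the sketch below; the r, alpha, and dropout values are illustrative placeholders, not the values from finetune.py.)

from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

model = AutoModelForCausalLM.from_pretrained("bigscience/bloomz-7b1")

# BLOOM fuses the Q, K and V projections into a single "query_key_value"
# linear layer, so that is the module LoRA has to target.
config = LoraConfig(
    r=8,
    lora_alpha=16,
    target_modules=["query_key_value"],
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, config)
model.print_trainable_parameters()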
The script runs, but it's very slow: it took about 15 hours to finish.
I'm using an A100 40GB, so it should be faster than yours. Am I missing something here?
When I used the original script (with the decapoda-research/llama-7b-hf model), it was still slow compared to yours: it took about 9 hours to finish.
Any help on this?
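For anyone debugging similar slowness: the stock finetune.py relies on 8-bit weights plus fp16 compute, and a model that silently loads in full fp32 trains several times slower. A sketch of that loading pattern, adapted here for bloomz-7b1 (the upstream script does the same for LLaMA):

import torch
from transformers import AutoModelForCausalLM
from peft import prepare_model_for_int8_training

model = AutoModelForCausalLM.from_pretrained(
    "bigscience/bloomz-7b1",
    load_in_8bit=True,           # requires bitsandbytes
    torch_dtype=torch.float16,
    device_map="auto",
)
# Casts layer norms to fp32 for stability and enables gradient checkpointing;
# newer peft versions rename this to prepare_model_for_kbit_training.
model = prepare_model_for_int8_training(model)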
Hi, did you hit the error "expected scalar type Half but found Float" after switching the model to bloomz?
No, I only got this error: ValueError: Target modules ['q_proj', 'v_proj'] not found in the base model. Please check the target modules and try again.
Were you able to do something different?
@raihan0824 how did you get the target modules for any particular model?
I'm not sure, but I found that for BLOOM, the target module is query_key_value.
Source: https://github.com/PhoebusSi/Alpaca-CoT
from transformers import GPTNeoXForCausalLM

model = GPTNeoXForCausalLM.from_pretrained("EleutherAI/gpt-neox-20b")

# The leaf names printed here (e.g. "query_key_value") are what
# goes into TARGET_MODULES.
for name, param in model.named_parameters():
    print(name, param.shape)

This works for any model.
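A slightly more direct variant (a sketch; the filtering is an addition, not from the linked repo) is to list only the nn.Linear leaf names, since those are the layers LoRA can target:

import torch.nn as nn
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("bigscience/bloomz-7b1")

# The distinct leaf names of all linear layers are the valid candidates
# for TARGET_MODULES; for BLOOM this set includes "query_key_value".
linear_names = {
    name.split(".")[-1]
    for name, module in model.named_modules()
    if isinstance(module, nn.Linear)
}
print(linear_names)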