alpaca-lora
Finetuning is so slow on bloomz-7b1
Hi,
Are there any configurations for models other than llama? When I first tried to run the finetune script for the bloomz-7b1 model, I got this error:
ValueError: Target modules ['q_proj', 'v_proj'] not found in the base model. Please check the target modules and try again.
Then I changed TARGET_MODULES to:
TARGET_MODULES = [
    "query_key_value",
]
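(For context, this change corresponds to a PEFT LoRA config like the sketch below; the r, alpha, and dropout values are illustrative placeholders, not the values from finetune.py.)

from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

model = AutoModelForCausalLM.from_pretrained("bigscience/bloomz-7b1")

# BLOOM fuses the Q, K and V projections into a single "query_key_value"
# linear layer, so that is the module LoRA has to target.
config = LoraConfig(
    r=8,
    lora_alpha=16,
    target_modules=["query_key_value"],
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, config)
model.print_trainable_parameters()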
The script runs, but it's very slow: it took about 15 hours to finish.
I'm using an A100 40GB, so it should be faster than yours. Am I missing something here?
When I used the original script (with the decapoda-research/llama-7b-hf model), it was still slow compared to yours: it took about 9 hours to finish.
Any help on this?
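For anyone debugging similar slowness: the stock finetune.py relies on 8-bit weights plus fp16 compute, and a model that silently loads in full fp32 trains several times slower. A sketch of that loading pattern, adapted here for bloomz-7b1 (the upstream script does the same for LLaMA):

import torch
from transformers import AutoModelForCausalLM
from peft import prepare_model_for_int8_training

model = AutoModelForCausalLM.from_pretrained(
    "bigscience/bloomz-7b1",
    load_in_8bit=True,           # requires bitsandbytes
    torch_dtype=torch.float16,
    device_map="auto",
)
# Casts layer norms to fp32 for stability and enables gradient checkpointing;
# newer peft versions rename this to prepare_model_for_kbit_training.
model = prepare_model_for_int8_training(model)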
Hi, did you hit the error "expected scalar type Half but found Float" after switching the model to bloomz?
No, I only got this error: ValueError: Target modules ['q_proj', 'v_proj'] not found in the base model. Please check the target modules and try again.
Were you able to do something different?
@raihan0824 how did you get the target modules for any particular model?
I'm not sure, but I found that for BLOOM, the target module is query_key_value.
Source: https://github.com/PhoebusSi/Alpaca-CoT
from transformers import GPTNeoXForCausalLM

model = GPTNeoXForCausalLM.from_pretrained("EleutherAI/gpt-neox-20b")

# The leaf names printed here (e.g. "query_key_value") are what
# goes into TARGET_MODULES.
for name, param in model.named_parameters():
    print(name, param.shape)

This works for any model.
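A slightly more direct variant (a sketch; the filtering is an addition, not from the linked repo) is to list only the nn.Linear leaf names, since those are the layers LoRA can target:

import torch.nn as nn
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("bigscience/bloomz-7b1")

# The distinct leaf names of all linear layers are the valid candidates
# for TARGET_MODULES; for BLOOM this set includes "query_key_value".
linear_names = {
    name.split(".")[-1]
    for name, module in model.named_modules()
    if isinstance(module, nn.Linear)
}
print(linear_names)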