
Problem with finetuning bloom

Open raihan0824 opened this issue 1 year ago • 19 comments

What is the fsdp_transformer_layer_cls_to_wrap for bloom?

When I tried to fine-tune bloomz-7b1, the training got stuck at 0%. As the README says, that is most likely because I didn't set the right fsdp_transformer_layer_cls_to_wrap, but I can't find the correct value in the BLOOM config.

I'd appreciate any help on this. Thank you.

raihan0824 · Mar 21 '23 02:03
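One way to find the right class name to pass is to instantiate a BLOOM model and print its module class names; the transformer block class is what the flag expects. A minimal sketch, assuming the small bigscience/bloom-560m checkpoint (which uses the same block class as bloomz-7b1):

```bash
# Sketch: list module class names for a BLOOM model; the decoder block
# ('BloomBlock') is the value to pass to --fsdp_transformer_layer_cls_to_wrap.
python -c "
from transformers import AutoConfig, AutoModelForCausalLM
config = AutoConfig.from_pretrained('bigscience/bloom-560m')
model = AutoModelForCausalLM.from_config(config)  # random weights, checkpoint is not downloaded
print(sorted({type(m).__name__ for m in model.modules()}))
"
```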

I have the same question. Does the training code here only support LLaMA and OPT models? Can we fine-tune BLOOM with its official training framework using the stanford_alpaca data?

frankzhao112 · Mar 25 '23 06:03

any help on this?

raihan0824 · Mar 25 '23 15:03

No, I have the same issue. Do you know BELLE? They use BLOOM as the base model instead of LLaMA.

frankzhao112 · Mar 26 '23 01:03

> No, I have the same issue. Do you know BELLE? They use BLOOM as the base model instead of LLaMA.

I've read it and it's exactly what I'm looking for. However, I can't find the fine-tuning script. Any help on this?

raihan0824 · Mar 26 '23 06:03

It seems the fine-tuning script refers back to this repo, based on https://github.com/LianjiaTech/BELLE/issues/26, so we have the same problem.

raihan0824 · Mar 26 '23 07:03

I have the same issue.

quanliu1991 · Mar 27 '23 02:03

Are you Chinese? Let's just speak Chinese.

frankzhao112 · Mar 28 '23 06:03

You can check the BLOOM training code on the BLOOM GitHub. BLOOM has already open-sourced its training code, so I think you can find it there.

frankzhao112 · Mar 28 '23 06:03

change to this: --fsdp_transformer_layer_cls_to_wrap 'BloomBlock' and it works

floodsung · Mar 29 '23 03:03

> change to this: --fsdp_transformer_layer_cls_to_wrap 'BloomBlock' and it works

Thanks, but I still get an error: "tensor a (256905216) must match the size of tensor b (1027620864)". Is there a hyperparameter that needs to be fixed?

weberrr · Mar 31 '23 08:03

> change to this: --fsdp_transformer_layer_cls_to_wrap 'BloomBlock' and it works

I still get the same error. Which BLOOM model are you running? Can you please share the training script?

raihan0824 · Mar 31 '23 09:03

how do you run it?

raihan0824 · Mar 31 '23 09:03

I used this to run with the original training script:

```bash
torchrun --nproc_per_node=3 --master_port=5001 train.py \
    --model_name_or_path bigscience/bloomz-7b1 \
    --data_path ./alpaca_data.json \
    --bf16 True \
    --output_dir ./model_trained \
    --num_train_epochs 3 \
    --per_device_train_batch_size 4 \
    --per_device_eval_batch_size 4 \
    --gradient_accumulation_steps 8 \
    --evaluation_strategy "no" \
    --save_strategy "steps" \
    --save_steps 2000 \
    --save_total_limit 1 \
    --learning_rate 2e-5 \
    --weight_decay 0. \
    --warmup_ratio 0.03 \
    --lr_scheduler_type "cosine" \
    --logging_steps 1 \
    --fsdp "full_shard auto_wrap" \
    --fsdp_transformer_layer_cls_to_wrap ‘BloomBlock‘ \
    --tf32 True
```

and I get this error: Exception: Could not find the transformer layer class to wrap in the model.

raihan0824 · Mar 31 '23 09:03
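Note that the quotes around BloomBlock in the command above are typographic quotes (‘BloomBlock‘), so the shell passes them through as part of the argument and the trainer then looks for a class literally named ‘BloomBlock‘, which is likely why it reports that it cannot find the layer class to wrap. A sketch of that line with plain ASCII quotes:

```bash
# Use straight single quotes (or no quotes at all) around the class name:
    --fsdp_transformer_layer_cls_to_wrap 'BloomBlock' \
```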

> how do you run it?

I can run my code; it loads the model and the data, but I still get a memory error like this:

CUDA out of memory. Tried to allocate 770.00 MiB (GPU 0; 79.35 GiB total capacity; 75.33 GiB already allocated; 679.19 MiB free; 77.53 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

weberrr · Mar 31 '23 09:03
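The error message itself points at one mitigation: setting max_split_size_mb via the allocator config to reduce fragmentation. A sketch, exported before launching torchrun (the 128 MB value is only an example to tune):

```bash
# Ask the CUDA caching allocator to split large cached blocks,
# which can reduce fragmentation when memory is nearly full.
export PYTORCH_CUDA_ALLOC_CONF=max_split_size_mb:128
```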

Change your transformers to >=4.23 and try again.

weberrr · Mar 31 '23 09:03
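For example, assuming pip manages the training environment:

```bash
# Upgrade transformers to at least 4.23.
pip install --upgrade "transformers>=4.23"
```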

I use transformers 4.27.4

raihan0824 · Mar 31 '23 09:03

> > how do you run it?
>
> I can run my code; it loads the model and the data, but I still get a memory error like this:
>
> CUDA out of memory. Tried to allocate 770.00 MiB (GPU 0; 79.35 GiB total capacity; 75.33 GiB already allocated; 679.19 MiB free; 77.53 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

It's because you lack GPU memory; try running it on more GPUs.

raihan0824 · Mar 31 '23 09:03
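If more GPUs aren't available, two knobs in the same command may help; this is only a sketch, and it assumes your transformers version supports the FSDP offload option: a smaller per-device batch size with more gradient accumulation, plus CPU offload.

```bash
# Sketch: trade speed for memory while keeping the effective batch size
# (1 x 32 per device instead of 4 x 8), and offload sharded params to CPU.
    --per_device_train_batch_size 1 \
    --gradient_accumulation_steps 32 \
    --fsdp "full_shard auto_wrap offload" \
```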

I think your code is ok

weberrr · Mar 31 '23 10:03