FlagEmbedding

Problem with fine-tuning bge-reranker-v2-gemma

Ask-sola opened this issue 11 months ago

I used the command below to fine-tune the model:

torchrun --nproc_per_node 2 \
    -m FlagEmbedding.finetune.reranker.decoder_only.base \
    --model_name_or_path BAAI/bge-reranker-v2-gemma \
    --use_lora True \
    --lora_rank 32 \
    --lora_alpha 64 \
    --use_flash_attn True \
    --target_modules q_proj k_proj v_proj o_proj \
    --save_merged_lora_model True \
    --model_type decoder \
    --cache_dir ./cache/model \
    --train_data /root/autodl-tmp/fine_tune_data_train.jsonl \
    --cache_path ./cache/data \
    --train_group_size 8 \
    --query_max_len 512 \
    --passage_max_len 512 \
    --pad_to_multiple_of 8 \
    --knowledge_distillation False \
    --query_instruction_for_rerank 'A: ' \
    --query_instruction_format '{}{}' \
    --passage_instruction_for_rerank 'B: ' \
    --passage_instruction_format '{}{}' \
    --output_dir ./test_decoder_only_base_bge-reranker-v2-minicpm-layerwise \
    --overwrite_output_dir \
    --learning_rate 2e-4 \
    --bf16 \
    --num_train_epochs 20 \
    --per_device_train_batch_size 2 \
    --gradient_accumulation_steps 1 \
    --dataloader_drop_last True \
    --warmup_ratio 0.1 \
    --gradient_checkpointing \
    --weight_decay 0.01 \
    --deepspeed ../ds_stage0.json \
    --logging_steps 1 \
    --save_steps 1000

After obtaining a checkpoint folder, I tried loading the model with the following code:

from FlagEmbedding import FlagLLMReranker

reranker = FlagLLMReranker('/data1/hya/private/old_book/second/checkpoint-140', cache_dir='/data1/hya/cache')

However, I am unable to load the model and encountered the following error:

Traceback (most recent call last):
  File "/data1/hya/private/old_book/second/GetHardBookPairs.py", line 39, in <module>
    reranker = FlagLLMReranker('/data1/hya/private/old_book/second/checkpoint-140',cache_dir='/data1/hya/cache')
  File "/data1/hya/miniconda3/envs/LLM/lib/python3.10/site-packages/FlagEmbedding/inference/reranker/decoder_only/base.py", line 180, in __init__
    self.model = AutoModelForCausalLM.from_pretrained(
  File "/data1/hya/miniconda3/envs/LLM/lib/python3.10/site-packages/transformers/models/auto/auto_factory.py", line 564, in from_pretrained
    return model_class.from_pretrained(
  File "/data1/hya/miniconda3/envs/LLM/lib/python3.10/site-packages/transformers/modeling_utils.py", line 4041, in from_pretrained
    model.load_adapter(
  File "/data1/hya/miniconda3/envs/LLM/lib/python3.10/site-packages/transformers/integrations/peft.py", line 188, in load_adapter
    peft_config = PeftConfig.from_pretrained(
  File "/data1/hya/miniconda3/envs/LLM/lib/python3.10/site-packages/peft/config.py", line 152, in from_pretrained
    return cls.from_peft_type(**kwargs)
  File "/data1/hya/miniconda3/envs/LLM/lib/python3.10/site-packages/peft/config.py", line 119, in from_peft_type
    return config_cls(**kwargs)
TypeError: LoraConfig.__init__() got an unexpected keyword argument 'eva_config'

Ask-sola, Jan 21 '25

You need to load the model from output_dir/merged_model, not from a checkpoint-* directory. The weights saved in checkpoint-* are only the LoRA adapter, not a full model.

545999961, Jan 23 '25
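
For illustration, a minimal loading sketch following that advice, assuming the output_dir from the command above (the exact path and the use_fp16 flag are illustrative, not taken from this thread):

from FlagEmbedding import FlagLLMReranker

# Load the merged full model written when --save_merged_lora_model True is set.
# The path is hypothetical; substitute your actual output_dir.
reranker = FlagLLMReranker(
    './test_decoder_only_base_bge-reranker-v2-minicpm-layerwise/merged_model',
    use_fp16=True,  # optional half-precision inference
)

# Score a single (query, passage) pair.
score = reranker.compute_score(['how to bake bread', 'Preheat the oven to 220C ...'])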

I'm fine-tuning bge-reranker-v2-minicpm-layerwise. I also set the parameter save_merged_lora_model to True; however, I can't find the merged_model in the output_dir. Could you help me? All of the fine-tuning parameters are as follows:

[screenshots: fine-tuning parameters]

The contents listed in the output_dir are:

[screenshot: output_dir contents]

LawsonAbs, May 29 '25
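
If merged_model never appears in the output_dir, a minimal sketch of merging the LoRA adapter into the base model manually with peft is shown below; the base model name, checkpoint number, and paths are assumptions for illustration, not confirmed output of this training run:

from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Load the base model the adapter was trained on. bge-reranker-v2-minicpm-layerwise
# uses a custom architecture, hence trust_remote_code=True.
base = AutoModelForCausalLM.from_pretrained(
    'BAAI/bge-reranker-v2-minicpm-layerwise',
    trust_remote_code=True,
)

# Attach the LoRA adapter saved in the checkpoint directory (hypothetical path).
# The installed peft version should be at least the one used during training,
# otherwise unknown config keys such as eva_config can raise the TypeError seen above.
model = PeftModel.from_pretrained(base, './output_dir/checkpoint-140')

# Fold the LoRA deltas into the base weights and save a standalone model.
merged = model.merge_and_unload()
merged.save_pretrained('./output_dir/merged_model')

# Save the tokenizer alongside so merged_model is loadable on its own.
tokenizer = AutoTokenizer.from_pretrained(
    'BAAI/bge-reranker-v2-minicpm-layerwise', trust_remote_code=True
)
tokenizer.save_pretrained('./output_dir/merged_model')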