Am I fine-tuning gemma-2b or bge-reranker-v2-gemma?
Dear authors,
Great work, thanks for sharing.
I am trying to fine-tune bge-reranker-v2-gemma using my own dataset.
However, according to the official fine-tuning command provided:
torchrun --nproc_per_node {number of gpus} \
-m FlagEmbedding.llm_reranker.finetune_for_instruction.run \
--output_dir {path to save model} \
--model_name_or_path google/gemma-2b \
--train_data ./toy_finetune_data.jsonl \
--learning_rate 2e-4 \
--num_train_epochs 1 \
--per_device_train_batch_size 1 \
--gradient_accumulation_steps 16 \
--dataloader_drop_last True \
--query_max_len 512 \
--passage_max_len 512 \
--train_group_size 16 \
--logging_steps 1 \
--save_steps 2000 \
--save_total_limit 50 \
--ddp_find_unused_parameters False \
--gradient_checkpointing \
--deepspeed stage1.json \
--warmup_ratio 0.1 \
--bf16 \
--use_lora True \
--lora_rank 32 \
--lora_alpha 64 \
--use_flash_attn True \
--target_modules q_proj k_proj v_proj o_proj
Why is model_name_or_path set to google/gemma-2b instead of bge-reranker-v2-gemma? Which model am I actually fine-tuning with this command?
What I want is for the fine-tuned model to further improve bge-reranker-v2-gemma on a specific reranking task, not to train gemma-2b from scratch.
This will fine-tune google/gemma-2b. If you want to fine-tune bge-reranker-v2-gemma, just set model_name_or_path to bge-reranker-v2-gemma.
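For example (just an illustration: only the model path changes, everything else stays as in the command above, and a local path to the model also works; as I understand it the Hugging Face Hub id is BAAI/bge-reranker-v2-gemma):
torchrun --nproc_per_node {number of gpus} \
-m FlagEmbedding.llm_reranker.finetune_for_instruction.run \
--output_dir {path to save model} \
--model_name_or_path BAAI/bge-reranker-v2-gemma \
--train_data ./toy_finetune_data.jsonl \
... (all remaining arguments as in the command above)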
Nice, I have set model_name_or_path to bge-reranker-v2-gemma.
What about the other arguments? Should I keep all of them the same?
If you have specific requirements, such as a larger batch size or more negatives, you can modify the other arguments accordingly. Alternatively, you can leave all other arguments at their default settings.
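For instance (a sketch of which flags map to those knobs, not an official recommendation), training with a larger effective batch and more negatives per query would mean changing only these arguments in the command above:
--per_device_train_batch_size 2 \
--gradient_accumulation_steps 8 \
--train_group_size 24 \
The effective batch size is per_device_train_batch_size × gradient_accumulation_steps × number of GPUs, and train_group_size is, as I understand it, the number of passages scored per query (one positive plus train_group_size − 1 negatives), so raising it adds more negatives at the cost of GPU memory.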
@545999961 Can bge-reranker-v2.5-gemma2-lightweight be fine-tuned in this way?
bge-reranker-v2.5-gemma2-lightweight cannot be fine-tuned in this way; we will release the fine-tuning code in the future.
An additional question about the logic of fine-tuning bge-reranker-v2-gemma:
is bge-reranker-v2-gemma itself a LoRA fine-tune on top of google/gemma-2b?
Yes.
Hello, may I ask when the fine-tuning code for bge-reranker-v2.5-gemma2-lightweight will be released? Thanks.
I am fine-tuning bge-reranker-v2-gemma on a dataset of 300k paper query-document pairs, and the loss is decreasing very slowly. I currently have num_train_epochs=1. At 60% of training progress the loss has dropped from 1.5 to 0.98, and I expect it will still be fairly high, around 0.7, by the end of the epoch. Do I need to train for more epochs, i.e. should I run another round of training?
I have the same question about when the fine-tuning code will be released. Please give us a note when it's out, thanks.