
How to adjust hyperparameters when fine-tuning llm-embedder

QuangTQV opened this issue on Apr 30, 2024 · 2 comments

llm-embedder has the following training script. I don't know how to adjust hyperparameters such as train_batch_size, learning_rate, warmup_ratio, and so on.

torchrun --nproc_per_node=8 run_dense.py \
    --output_dir data/outputs/tool \
    --train_data llm-embedder:tool/toolbench/train.json \
    --eval_data llm-embedder:tool/toolbench/test.json \
    --corpus llm-embedder:tool/toolbench/corpus.json \
    --key_template {text} \
    --metrics ndcg \
    --eval_steps 2000 \
    --save_steps 2000 \
    --max_steps 2000 \
    --data_root /data/llm-embedder

QuangTQV · Apr 30, 2024

This script uses the Hugging Face Trainer for fine-tuning, so you can use any of the hyperparameters documented on this page: https://huggingface.co/docs/transformers/main_classes/trainer#transformers.TrainingArguments

staoxiao · May 02, 2024
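For concreteness, this means standard TrainingArguments fields can be appended to the original command. Note that the Hugging Face name for the batch size is per_device_train_batch_size, not train_batch_size. A minimal sketch, assuming run_dense.py parses these fields via the Trainer's argument machinery; the numeric values are placeholders, not recommendations, and it is worth checking run_dense.py's argument parser for the exact set of fields it exposes:

# The last three flags are standard TrainingArguments fields;
# 16, 5e-6, and 0.1 are illustrative values only.
torchrun --nproc_per_node=8 run_dense.py \
    --output_dir data/outputs/tool \
    --train_data llm-embedder:tool/toolbench/train.json \
    --eval_data llm-embedder:tool/toolbench/test.json \
    --corpus llm-embedder:tool/toolbench/corpus.json \
    --key_template {text} \
    --metrics ndcg \
    --eval_steps 2000 \
    --save_steps 2000 \
    --max_steps 2000 \
    --data_root /data/llm-embedder \
    --per_device_train_batch_size 16 \
    --learning_rate 5e-6 \
    --warmup_ratio 0.1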


How can I create a new task like the existing llm-embedder tasks (with a separate instruction for queries and a separate instruction for keys)?

QuangTQV · May 03, 2024
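For context, each llm-embedder task pairs one instruction that is prepended to queries with a different instruction that is prepended to keys. A minimal sketch of that query/key split as a standalone JSON mapping — the file name, schema, and instruction wording here are illustrative assumptions, not FlagEmbedding's actual configuration:

# Hypothetical instruction file: llm-embedder's real code wires
# instructions up differently; this only illustrates how a new task
# would pair a query-side instruction with a key-side instruction.
cat > my_task_instructions.json <<'EOF'
{
  "my_new_task": {
    "query": "Represent this query for retrieving relevant tools: ",
    "key": "Represent this tool description for retrieval: "
  }
}
EOF

A new task would presumably also need its own train/eval/corpus JSON files in the same format as the tool/toolbench files referenced in the command above.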