
FlagEmbedding and LlamaIndex

Open ammarmol opened this issue 1 year ago • 1 comment

Hello, I am trying to use LlamaIndex and FlagEmbedding together, but it is really difficult. Could you provide a simple example? Also, is there a way to fine-tune a FlagEmbedding model from Python with a single function call, something like:

    model = FlagEmbedding.baai_general_embedding.finetune(
        output_dir="./",
        model_name_or_path="BAAI/bge-large-zh-v1.5",
        train_data="./result.jsonl",
        learning_rate=1e-5,
        num_train_epochs=5,
        per_device_train_batch_size=8,
        dataloader_drop_last=True,
        normlized=True,
        temperature=0.02,
        query_max_len=64,
        passage_max_len=256,
        train_group_size=2,
        logging_steps=10,
        query_instruction_for_retrieval="",
    )
    model.run()

ammarmol avatar Mar 06 '24 17:03 ammarmol

Hi, thanks for your interest in our work. You can use `HuggingFaceEmbedding` to load a bge model in LlamaIndex, and LlamaIndex also provides its own training script that you can use. If you want to use FlagEmbedding to train a model from Python, a simple approach is to invoke the launcher via a system command:

    os.system("torchrun --nproc_per_node {number of gpus} -m FlagEmbedding.baai_general_embedding.finetune.run ... ")
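A slightly more robust variant of the `os.system` approach above is to assemble the `torchrun` command as an argument list and launch it with `subprocess.run`, which avoids shell-quoting problems. The sketch below is an assumption about how you might wrap this, not an official FlagEmbedding API; the argument names (`output_dir`, `model_name_or_path`, `normlized`, etc.) are copied from the question and should be checked against your FlagEmbedding version.

```python
import subprocess


def build_finetune_cmd(num_gpus, **training_args):
    """Assemble a torchrun command line for FlagEmbedding fine-tuning.

    Boolean True values are emitted as bare flags; everything else is
    passed as `--name value`.
    """
    cmd = [
        "torchrun",
        f"--nproc_per_node={num_gpus}",
        "-m", "FlagEmbedding.baai_general_embedding.finetune.run",
    ]
    for name, value in training_args.items():
        if value is True:
            cmd.append(f"--{name}")
        else:
            cmd += [f"--{name}", str(value)]
    return cmd


cmd = build_finetune_cmd(
    2,  # number of GPUs
    output_dir="./",
    model_name_or_path="BAAI/bge-large-zh-v1.5",
    train_data="./result.jsonl",
    learning_rate=1e-5,
    num_train_epochs=5,
    per_device_train_batch_size=8,
    normlized=True,  # spelling as used in the question's argument list
)
# subprocess.run(cmd, check=True)  # uncomment to actually launch training
```

The commented-out `subprocess.run(cmd, check=True)` raises if training exits nonzero, which `os.system` would silently ignore.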

staoxiao avatar Mar 07 '24 03:03 staoxiao