
Generative Representational Instruction Tuning

Results: 38 gritlm issues, sorted by recently updated

I understand that GritLM fine-tuning uses both in-batch negatives and hard negatives for contrastive learning. We can use in-batch negatives only by setting `train_group_size` to 1. However,...
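
For reference, a minimal sketch of the in-batch-negatives objective described above: each query's positive passage sits on the diagonal of the batch similarity matrix, and every other passage in the batch acts as a negative. With a group size above 1, extra hard-negative rows would be appended to the passage matrix. Function names and the temperature value here are illustrative, not the repository's exact implementation.

```
import torch
import torch.nn.functional as F

def in_batch_contrastive_loss(q, p, temperature=0.02):
    """InfoNCE with in-batch negatives only.

    q: (batch, dim) query embeddings; p: (batch, dim) positive passage
    embeddings. Passage j is the positive for query j and a negative for
    every other query in the batch.
    """
    q = F.normalize(q, dim=-1)
    p = F.normalize(p, dim=-1)
    scores = q @ p.T / temperature                     # (batch, batch) similarities
    labels = torch.arange(q.size(0), device=q.device)  # positives on the diagonal
    return F.cross_entropy(scores, labels)
```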

Formatted and added typing for the main file, gritlm.py. No significant changes.

The original run.py saves the model as pytorch_model.bin, which cannot be loaded directly with the code provided in this repository. After replacing line 422, `trainer.save_model()`, in training/run.py with `model.model.save_pretrained(training_args.output_dir)`, the...
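
In patch form, the reported fix reads roughly as below; the extra tokenizer save is an assumption added for convenience, not part of the reported change.

```
# training/run.py, line 422: replace the Trainer-level save, which writes
# pytorch_model.bin for the wrapper object, with a save of the inner
# Hugging Face model so it can be reloaded via from_pretrained():

# trainer.save_model()                                  # old
model.model.save_pretrained(training_args.output_dir)  # new

# Assumption: saving the tokenizer alongside the weights is also useful.
tokenizer.save_pretrained(training_args.output_dir)
```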

I was trying to fine-tune Meta-Llama-3-8B-Instruct on 4 GPUs with the following command:

```
torchrun --nproc_per_node 4 -m training.run --output_dir llama3test \
    --model_name_or_path meta-llama/Meta-Llama-3-8B-Instruct \
    --train_data training/toy_data --learning_rate 1e-5 --num_train_epochs 5 \
    --per_device_train_batch_size 1 ...
```
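
As a reference point for what `--train_data` consumes, a sketch of one embedding-style record in the toy data, assuming the query/pos/neg JSONL layout used by this training pipeline (the field names and file name are assumptions):

```
import json

# One embedding training example: a query, positive passage(s), and hard
# negative(s). Field names (query/pos/neg) are an assumption here.
record = {
    "query": "What does GRIT stand for?",
    "pos": ["Generative Representational Instruction Tuning."],
    "neg": ["The capital of France is Paris."],
}
with open("training/toy_data/toy_embedding.jsonl", "a") as f:
    f.write(json.dumps(record) + "\n")
```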

Work in progress! Not tested yet. `is_causal` support added. Uploading for comments.
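
For context on what the flag governs, a toy sketch using PyTorch's scaled_dot_product_attention; this illustrates the causal/bidirectional toggle in general, not the PR's actual diff:

```
import torch
import torch.nn.functional as F

def attend(q, k, v, is_causal: bool):
    # is_causal=True  -> lower-triangular mask, as in generation
    # is_causal=False -> full bidirectional attention, as GRIT uses for embedding
    return F.scaled_dot_product_attention(q, k, v, is_causal=is_causal)

q = k = v = torch.randn(1, 8, 16, 64)  # (batch, heads, seq_len, head_dim)
generation_out = attend(q, k, v, is_causal=True)
embedding_out = attend(q, k, v, is_causal=False)
```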

I'm now trying to train Llama 3.1 with the GRIT pipeline. At first I directly changed `--model_name_or_path` and ran the training code (the training script I used is as follows):

```
#!/bin/bash
...
```
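
Before launching the full run, a quick standalone check that the swapped-in base model loads at all can save a failed job. The model id below is an assumption; the missing pad token it prints is a common source of failures when moving Llama tokenizers into batched embedding training.

```
from transformers import AutoModelForCausalLM, AutoTokenizer

# Hypothetical model id; substitute whatever --model_name_or_path points at.
name = "meta-llama/Llama-3.1-8B-Instruct"

tok = AutoTokenizer.from_pretrained(name)
model = AutoModelForCausalLM.from_pretrained(name, torch_dtype="auto")

# Llama tokenizers typically ship without a pad token, which batched
# contrastive training needs; None here means it must be set explicitly.
print("pad_token:", tok.pad_token)
print("vocab_size:", model.config.vocab_size)
```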

I am trying to evaluate GritLM-7B on MTEB datasets using the provided script:

```
#!/bin/bash
python /home/e/e1347696/unified_encoder_decoder/src/eval/MTEB/eval_mteb.py \
    --model_name_or_path /home/e/e1347696/unified_encoder_decoder/model/GritLM-7B \
    --output_folder /home/e/e1347696/unified_encoder_decoder/src/results/GritLM-7B-mteb \
    --task_types Classification,Clustering,PairClassification,Reranking,Retrieval,STS,Summarization \
    --batch_size 32
```
...
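
Before running the full sweep, a single-task smoke test can separate setup problems from task-specific ones. A minimal sketch, assuming the GritLM wrapper's encode() is MTEB-compatible; the task choice and output folder are illustrative:

```
from gritlm import GritLM
from mteb import MTEB

# Load only the embedding path; the README uses this torch_dtype/mode style.
model = GritLM("GritLM/GritLM-7B", torch_dtype="auto", mode="embedding")

# Run one small task as a smoke test before the full task_types list above.
evaluation = MTEB(tasks=["Banking77Classification"])
evaluation.run(model, output_folder="results/GritLM-7B-smoke")
```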

Hi, I'm working with the GritLM repository: I'm training Mistral 7B and evaluating performance on the MTEB benchmark on an NVIDIA RTX A6000. I first tested the pretrained mistralai/Mistral-7B-v0.1 model...
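
For the memory question implied by the A6000 setup: its 48 GB comfortably holds a 7B model's bf16 weights (roughly 14 GB) plus activations for moderate inference batch sizes. A minimal sketch of an embedding-only load, assuming the GritLM wrapper accepts a plain pretrained checkpoint:

```
import torch
from gritlm import GritLM

# bf16 halves the memory of fp32 weights; mode="embedding" skips the
# generative path. The untuned base model will load, though its
# embeddings are only meaningful after GRIT training.
model = GritLM("mistralai/Mistral-7B-v0.1", torch_dtype=torch.bfloat16, mode="embedding")

emb = model.encode(["GRIT unifies embedding and generation."])
print(emb.shape)  # (1, hidden_dim)
```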