gritlm
Generative Representational Instruction Tuning
```python
attn: str = field(
    default='bbcc',
    metadata={
        "help": "bidirectional/causal attn for emb inst., emb sample, gen inst., gen sample"
                " e.g. bbcc is bidirectional over both emb inst. & sample...
```
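To make the help text above concrete, the four-character `attn` string can be read as one flag per segment, in the order embedding instruction, embedding sample, generative instruction, generative sample. The sketch below is based only on that help text; `parse_attn`, `SEGMENTS`, and `MODES` are hypothetical names for illustration and are not taken from the gritlm code.

```python
# Hypothetical helper; all names here are illustrative, not part of gritlm.
SEGMENTS = ("emb_instruction", "emb_sample", "gen_instruction", "gen_sample")
MODES = {"b": "bidirectional", "c": "causal"}


def parse_attn(attn: str = "bbcc") -> dict:
    """Map each of the four flag characters to an attention mode per segment."""
    if len(attn) != len(SEGMENTS):
        raise ValueError(f"expected {len(SEGMENTS)} characters, got {len(attn)}")
    unknown = set(attn) - set(MODES)
    if unknown:
        raise ValueError(f"unknown attention flags: {sorted(unknown)}")
    return {segment: MODES[flag] for segment, flag in zip(SEGMENTS, attn)}


# 'bbcc': bidirectional over both embedding segments, causal over both generative ones.
print(parse_attn("bbcc"))
```

Validating the length and the flag characters up front means a typo such as `bbc` fails immediately instead of silently selecting the wrong attention pattern.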
# Issue When I test the , it seems that the Hugging Face Trainer and Accelerator will replace the Sampler with a new object. Please refer to the code: [get_train_dataloader function in...
You mention that bidirectional attention is used for the embedding task. But it appears that you only use the last hidden states from the pretrained LLM to generate embeddings. Is the...
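For readers following this question: one common way to turn final-layer hidden states into a single embedding is mean pooling over the non-padded tokens. The sketch below illustrates that general technique on plain Python lists. It is an assumption for illustration, not a claim about what the gritlm code does, and `mean_pool` is a hypothetical name.

```python
def mean_pool(hidden_states, attention_mask):
    """Average the final-layer token vectors, skipping padded positions.

    hidden_states: list of token vectors (each a list of floats) for one sequence.
    attention_mask: list of 0/1 flags, 1 for real tokens and 0 for padding.
    """
    dim = len(hidden_states[0])
    total = [0.0] * dim
    count = 0
    for vector, keep in zip(hidden_states, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vector):
                total[i] += value
    if count == 0:
        raise ValueError("attention_mask selects no tokens")
    return [value / count for value in total]


# Toy example: three tokens of dimension 2, where the last token is padding.
states = [[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]]
mask = [1, 1, 0]
pooled = mean_pool(states, mask)
```

Under bidirectional attention, every token's final hidden state can already depend on the whole input, so pooling only the last layer does not by itself discard that context.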
I trained an embedding model on the toy dataset as suggested in the repo: `torchrun --nproc_per_node 1 -m training.run --output_dir test_path --model_name_or_path openaccess-ai-collective/tiny-mistral --train_data training/toy_data_instruct/toy_data_embedding.jsonl --learning_rate 1e-5 --num_train_epochs 5 --per_device_train_batch_size 2 --dataloader_drop_last...
Thank you for this great model and the corresponding paper. I will definitely cite you in my thesis :) In the attached experiment, I am trying to "trick" the model...
Hello! This is awesome work and the idea of using an LLM as the embedding model is amazing. More importantly, you really did it and the performance is surprisingly good! I...
When I run the script for training the unified model (GRIT), I get an error: **RuntimeError: NVML_SUCCESS == DriverAPI::get()->nvmlDeviceGetHandleByPciBusId_v2_( pci_id, &nvml_device) INTERNAL ASSERT FAILED at "../c10/cuda/CUDACachingAllocator.cpp":1139, please report a bug to PyTorch.**...
Hello, [Here in eval_mteb.py](https://github.com/ContextualAI/gritlm/blob/b89fdefa18731f1aa1d6111c3849c1e4c811b9d6/evaluation/eval_mteb.py#L1146), the weights of the embeddings (down) projection layer are loaded from `embedding_head.bin`. As far as I could infer this file is not published as a part...
Hi, and thank you for sharing this amazing work. I want to use GritLM to produce embeddings to be stored in a vector database for document retrieval. But there are...
Does your evaluation code support data parallelism? I couldn't find any module for MTEB evaluation that uses DP or DDP.
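For context on what data-parallel evaluation usually involves: each worker encodes a disjoint slice of the inputs, and the results are stitched back together in the original order. The sketch below shows only that split-and-merge step in plain Python. `shard` and `merge` are hypothetical helpers for illustration, and nothing here reflects how the gritlm evaluation code is actually written.

```python
def shard(items, rank, world_size):
    """Return the strided slice of items that worker `rank` should process."""
    if not 0 <= rank < world_size:
        raise ValueError("rank must be in [0, world_size)")
    return items[rank::world_size]


def merge(shards):
    """Undo the strided split so results line up with the original order."""
    world_size = len(shards)
    total = sum(len(part) for part in shards)
    merged = [None] * total
    for rank, part in enumerate(shards):
        merged[rank::world_size] = part
    return merged


# Toy example: five inputs split across three workers, then reassembled.
texts = ["a", "b", "c", "d", "e"]
parts = [shard(texts, rank, 3) for rank in range(3)]
restored = merge(parts)
```

A strided split keeps shard sizes within one item of each other. In a real multi-process run, each rank would encode its own shard and the per-rank outputs would be gathered before calling something like `merge`.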