gritlm issues

attn attribute setting

8

```python
attn: str = field(
    default='bbcc',
    metadata={
        "help": "bidirectional/causal attn for emb inst., emb sample, gen inst., gen sample"
        " e.g. bbcc is bidirectional over both emb inst. & sample...
```

louieworth
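The help text quoted in this issue describes a four-character code, one character per segment (embedding instruction, embedding sample, generative instruction, generative sample), with `b` for bidirectional and `c` for causal. The sketch below decodes such a string. It is only an illustration of the format: `parse_attn`, `SEGMENTS`, and `MODES` are hypothetical names and are not taken from the gritlm code base.

```python
from typing import Dict

# Hypothetical names for illustration; not part of gritlm.
# Order follows the help text: emb inst., emb sample, gen inst., gen sample.
SEGMENTS = ("emb_instruction", "emb_sample", "gen_instruction", "gen_sample")
MODES = {"b": "bidirectional", "c": "causal"}


def parse_attn(attn: str) -> Dict[str, str]:
    """Map each of the four segments to its attention mode."""
    if len(attn) != len(SEGMENTS):
        raise ValueError(f"expected {len(SEGMENTS)} characters, got {len(attn)}")
    unknown = set(attn) - set(MODES)
    if unknown:
        raise ValueError(f"unknown attention code(s) {sorted(unknown)}; use 'b' or 'c'")
    return {segment: MODES[code] for segment, code in zip(SEGMENTS, attn)}


if __name__ == "__main__":
    # The default 'bbcc': bidirectional for both embedding segments,
    # causal for both generative segments.
    for segment, mode in parse_attn("bbcc").items():
        print(f"{segment}: {mode}")
```

Read this way, the default `bbcc` applies bidirectional attention to the embedding instruction and sample, and causal attention to the generative instruction and sample.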

CustomRandomSampler not working in huggingface Trainer and Accelerator

1

# Issue When I test the `CustomRandomSampler`, it seems that the huggingface Trainer and Accelerator replace the sampler with a new object. Please refer to the code: [get_train_dataloader function in...

YanshekWoo

bidirectional attention or causal attention for embedding?

5

You mention that bidirectional attention is used for the embedding task. But it appears that you only use the last hidden states from the pretrained LLM to generate embeddings. Is the...

yonxie
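For readers unfamiliar with the two terms in this question, the difference can be shown with plain attention masks. This is a generic sketch and not gritlm's implementation; `attention_mask` is a hypothetical helper. Under a causal mask, token `i` can only attend to tokens `0..i`, while under a bidirectional mask every token can attend to every other token.

```python
from typing import List


def attention_mask(seq_len: int, causal: bool) -> List[List[int]]:
    """Return a seq_len x seq_len mask; mask[i][j] == 1 means token i may attend to token j."""
    return [
        [1 if (not causal or j <= i) else 0 for j in range(seq_len)]
        for i in range(seq_len)
    ]


if __name__ == "__main__":
    print("causal:")
    for row in attention_mask(3, causal=True):
        print(row)
    print("bidirectional:")
    for row in attention_mask(3, causal=False):
        print(row)
```

With a causal mask only the final position has seen the whole sequence, whereas with a bidirectional mask every position's hidden state can depend on all tokens.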

Unable to load trained model

1

I trained an embedding model on the toy dataset as suggested in the repo `torchrun --nproc_per_node 1 -m training.run --output_dir test_path --model_name_or_path openaccess-ai-collective/tiny-mistral --train_data training/toy_data_instruct/toy_data_embedding.jsonl --learning_rate 1e-5 --num_train_epochs 5 --per_device_train_batch_size 2 --dataloader_drop_last...

deepakkr-singh

Is there hope Grit-Embedding beats this task?

2

Thank you for this great model and the corresponding paper. I will definitely cite you in my thesis :) In the attached experiment, I am trying to "trick" the model...

marioeljuga

E5 dataset

1

Hello! This is awesome work and the idea of using an LLM as the embedding model is amazing. More importantly, you really did it and the performance is surprisingly good! I...

wangskyGit

RuntimeError

1

When I run the script for training the unified model (GRIT), I get an error: **RuntimeError: NVML_SUCCESS == DriverAPI::get()->nvmlDeviceGetHandleByPciBusId_v2_( pci_id, &nvml_device) INTERNAL ASSERT FAILED at "../c10/cuda/CUDACachingAllocator.cpp":1139, please report a bug to PyTorch.**...

BlackHandsomeLee

Are the embedding head weights published?

1

Hello, [here in eval_mteb.py](https://github.com/ContextualAI/gritlm/blob/b89fdefa18731f1aa1d6111c3849c1e4c811b9d6/evaluation/eval_mteb.py#L1146), the weights of the embedding (down) projection layer are loaded from `embedding_head.bin`. As far as I could infer, this file is not published as a part...

fssh72

which model should be used?

5

Hi, and thank you for sharing this amazing work. I want to use Gritlm to produce embeddings to be stored in a vector database for document retrieval. But there are...

Elktrn

Does your evaluation code support data parallelism?

3

Does your evaluation code support data parallelism? I couldn't find any module in the MTEB evaluation that uses DP or DDP.

selmiss
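As background for this question, data-parallel evaluation usually means that each worker encodes a disjoint slice of the inputs and the results are put back in their original order afterwards. The sketch below shows only that sharding and merging step in plain Python. It is an assumption-laden illustration, not the repository's evaluation code: `shard` and `merge` are hypothetical helpers, and the per-worker encoding step is left out.

```python
from typing import List, Sequence


def shard(items: Sequence[str], rank: int, world_size: int) -> List[str]:
    """Return the strided slice of items that worker `rank` would process."""
    return list(items[rank::world_size])


def merge(shards: List[List[str]]) -> List[str]:
    """Interleave per-worker results back into the original order."""
    world_size = len(shards)
    merged: List[str] = [""] * sum(len(part) for part in shards)
    for rank, part in enumerate(shards):
        # Extended-slice assignment writes each result back to its source index.
        merged[rank::world_size] = part
    return merged


if __name__ == "__main__":
    texts = ["doc0", "doc1", "doc2", "doc3", "doc4"]
    parts = [shard(texts, rank, world_size=2) for rank in range(2)]
    print(parts)
    print(merge(parts) == texts)
```

Strided sharding keeps the shard sizes within one item of each other, which is why `merge` can restore the order with a matching strided assignment.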
