gritlm
Generative Representational Instruction Tuning
In the paper, the ablation study on attention for emb and gen is interesting. Are these all separate models, each trained with its own attention? Can I select causal attention for both cases...
Continuing from our conversation in https://github.com/ContextualAI/gritlm/issues/13, I think this needed a new ticket at this point. I am trying to finetune embeddings only, so I took your (@Muennighoff's) recommendation...
Useful for batch processing and building an embedding cache for numerous documents with dataloaders. The results for a dict and a vanilla list of strings are identical, although for the raw tokenized 'transformers'...
Hello! I ran into a problem when I train the model in the unified mode. First, I would like to share that when I **evaluate** several models in the artifacts (for...
If I use a projection layer with DDP, it causes: RuntimeError: Expected to mark a variable ready only once. This error is caused by one of the following reasons: 1)...
How do I add a projection module to an existing model? The hidden_state of this model is too large. I added projection=128 in train_embonly.sh, but the result files do not...
Hello. I wanted to use gritlm with an open-source embedding model, gte-qwen2-7b-instruct, but I encountered some problems: ``` [rank1]: Traceback (most recent call last): [rank1]: File "/code/xx/LLM_mine/recall/reference/gritlm/gritlm/training/run.py", line 438,...
Set max_steps=500, save_steps=100. When training reaches step 100, the checkpoint is saved successfully, but nccl_timeout is displayed.
Hi, while reviewing the licenses for this repository and the model it depends on, I noticed a potential inconsistency that could cause confusion or legal risks in some situations. Your...