317 comments of Niklas

Hm, what model are you using? I'd recommend switching to a bigger / better one; specifically, I'd recommend this one: https://huggingface.co/GritLM/GritLM-7B
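Roughly, usage looks like this (a sketch based on the GritLM model card; it assumes the `gritlm` pip package and its `encode(..., instruction=...)` API, so double-check against the model card):

```
import numpy as np
from gritlm import GritLM  # pip install gritlm (assumed wrapper from the GritLM repo)

# Load for embedding only to save memory (no LM head)
model = GritLM("GritLM/GritLM-7B", torch_dtype="auto", mode="embedding")

def gritlm_instruction(instruction):
    # Embedding prompt format used by GritLM
    return "<|user|>\n" + instruction + "\n<|embed|>\n" if instruction else "<|embed|>\n"

queries = ["What is generative representational instruction tuning?"]
documents = ["GritLM is trained with generative representational instruction tuning."]

# Documents are embedded without an instruction, queries with one
d_rep = model.encode(documents, instruction=gritlm_instruction(""))
q_rep = model.encode(queries, instruction=gritlm_instruction("Retrieve the relevant document"))

# Cosine similarity between the first query and first document
sim = np.dot(q_rep[0], d_rep[0]) / (np.linalg.norm(q_rep[0]) * np.linalg.norm(d_rep[0]))
print(sim)
```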

I haven't looked into that. It would likely reduce the expressivity of the embeddings, so I would expect worse results, but it may still be good enough to make the...

The command I used is here: https://huggingface.co/bigscience/sgpt-bloom-7b1-msmarco. I ran on 8 A100 GPUs w/ 80GB each, I think.

You can check Section 4.4 of the MTEB paper (https://arxiv.org/pdf/2210.07316.pdf) where https://huggingface.co/bigscience/sgpt-bloom-7b1-msmarco is benchmarked on many languages incl. Korean & Japanese against other models. As it hasn't extensively seen them...
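If you want to run such an evaluation yourself, something like the below should work (a sketch assuming the older `mteb` API with `task_langs`; any object with an `encode` method works as the model):

```
from mteb import MTEB
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("bigscience/sgpt-bloom-7b1-msmarco")

# Select all MTEB tasks that have Korean or Japanese subsets and run them
evaluation = MTEB(task_langs=["ko", "ja"])
evaluation.run(model, output_folder="results/sgpt-bloom-7b1-msmarco")
```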

For most models you can significantly increase the sequence length. If you load via SentenceTransformer you can do the following after loading the model: `# Change the length to...`
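For example (a sketch assuming a checkpoint that ships a sentence-transformers config; how far you can raise the limit depends on the model's position handling and your GPU memory):

```
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("Muennighoff/SGPT-125M-weightedmean-msmarco-specb-bitfit")

# Change the length to whatever you need; memory usage grows with it
print(model.max_seq_length)  # value set during training, e.g. 300
model.max_seq_length = 2048

embeddings = model.encode(["A much longer document that would otherwise be truncated ..."])
```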

Yeah, the reason the sequence length is set to 300 during training is that it saves a lot of memory & for many cases 300 tokens are enough to determine...

> To confirm, is there a difference between _sequence_ length and _token_ length? Or do they mean the same?

It's the same, i.e. sequence length is measured in tokens.
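E.g. to see how much sequence length a given text consumes under a given model's tokenizer:

```
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bigscience/sgpt-bloom-7b1-msmarco")
input_ids = tokenizer("How many tokens does this sentence use?")["input_ids"]
print(len(input_ids))  # sequence length of this input, measured in tokens
```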

Sure; for asymmetric search (e.g. retrieval), I'd recommend https://huggingface.co/bigscience/sgpt-bloom-7b1-msmarco, which has seen lots of Chinese during pretraining.
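Its embeddings are position-weighted means over the last hidden states; a minimal sketch via transformers looks like the below (check the model card for the exact query/document formatting, e.g. whether special bracket tokens are added, which is not shown here):

```
import torch
from transformers import AutoModel, AutoTokenizer

model_name = "bigscience/sgpt-bloom-7b1-msmarco"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModel.from_pretrained(model_name)
model.eval()

texts = ["什么是语义搜索?", "语义搜索根据含义而不是关键词来匹配文档。"]
batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")

with torch.no_grad():
    last_hidden = model(**batch).last_hidden_state  # [batch, seq, hidden]

# Position-weighted mean pooling: later tokens get a higher weight
weights = (
    torch.arange(1, last_hidden.shape[1] + 1, dtype=torch.float32)
    .unsqueeze(0)
    .unsqueeze(-1)
    .expand(last_hidden.size())
)
mask = batch["attention_mask"].unsqueeze(-1).expand(last_hidden.size()).float()
embeddings = (last_hidden * mask * weights).sum(dim=1) / (mask * weights).sum(dim=1)
```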

> > Sure; For asymmetric search (e.g. retrieval), I'd recommend https://huggingface.co/bigscience/sgpt-bloom-7b1-msmarco which has seen lots of Chinese during pretraining
>
> Thank you very much! Do you mean this code?: https://github.com/Muennighoff/sgpt#asymmetric-semantic-search-be...

It works fine on my side; see this notebook: https://colab.research.google.com/drive/1mxH15422ZnguaItPBKR2PZ_qACM5QE0l?usp=sharing Maybe you're on an older transformers version?
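To check which version you're running:

```
import transformers

print(transformers.__version__)  # upgrade with `pip install -U transformers` if it's old
```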