Try out other models.
We currently use multi-qa-MiniLM-L6-cos-v1, which encodes roughly 14,200 sentences/sec on one V100 GPU and has a model size of 80 MB. We should benchmark other pretrained models to see whether any of them offer better retrieval quality or speed.
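
As a starting point, here is a rough sketch of how a few candidates could be compared side by side with sentence-transformers. The candidate model names come from the SBERT pretrained-models page linked below; the query and commit messages are made up for illustration and are not from the codebase:

```python
from sentence_transformers import SentenceTransformer, util

# Candidate models to benchmark against the current default.
candidates = [
    "multi-qa-MiniLM-L6-cos-v1",    # current default
    "all-MiniLM-L6-v2",             # smaller general-purpose model
    "multi-qa-mpnet-base-dot-v1",   # larger, usually higher quality
]

# Illustrative query and commit messages.
query = "fix memory leak in parser"
commits = [
    "Free AST nodes after parsing to plug a leak",
    "Update README with installation steps",
]

for name in candidates:
    model = SentenceTransformer(name)
    # Encode the query and commits, then rank commits by cosine similarity.
    query_emb = model.encode(query, convert_to_tensor=True)
    commit_embs = model.encode(commits, convert_to_tensor=True)
    scores = util.cos_sim(query_emb, commit_embs)[0]
    print(name, [round(float(s), 3) for s in scores])
```

A proper comparison would run this over a labeled set of query/commit pairs and also time the encode calls, but even this toy loop shows that swapping models is a one-line change.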
We could also experiment with other tokenizers.
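
One caveat: a pretrained model's tokenizer is trained together with its weights, so tokenizers can't simply be swapped under an existing checkpoint; in practice this means comparing base checkpoints that use different tokenizer families. A small sketch of how to inspect the differences, using two public Hugging Face checkpoints (the input text is illustrative):

```python
from transformers import AutoTokenizer

# Compare how different tokenizer families split the same commit message.
text = "refactor: extract cosine-similarity helper"

for checkpoint in ["bert-base-uncased", "roberta-base"]:
    tok = AutoTokenizer.from_pretrained(checkpoint)
    # WordPiece (BERT) and byte-level BPE (RoBERTa) produce different splits.
    print(checkpoint, tok.tokenize(text))
```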
Further reading:
- https://www.sbert.net/docs/pretrained_models.html
- https://huggingface.co/sentence-transformers
- https://huggingface.co/transformers/tokenizer_summary.html