Try out other models.
We currently use multi-qa-MiniLM-L6-cos-v1, which encodes roughly 14,200 sentences/sec on one V100 GPU and has a model size of 80 MB. We should benchmark other pretrained models to see whether any of them offer better retrieval quality or speed.
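
As a starting point, here is a rough sketch of how a few candidates could be compared side by side with sentence-transformers. The candidate model names come from the SBERT pretrained-models page linked below; the query and commit messages are made up for illustration and are not from the codebase:

```python
from sentence_transformers import SentenceTransformer, util

# Candidate models to benchmark against the current default.
candidates = [
    "multi-qa-MiniLM-L6-cos-v1",    # current default
    "all-MiniLM-L6-v2",             # smaller general-purpose model
    "multi-qa-mpnet-base-dot-v1",   # larger, usually higher quality
]

# Illustrative query and commit messages.
query = "fix memory leak in parser"
commits = [
    "Free AST nodes after parsing to plug a leak",
    "Update README with installation steps",
]

for name in candidates:
    model = SentenceTransformer(name)
    # Encode the query and commits, then rank commits by cosine similarity.
    query_emb = model.encode(query, convert_to_tensor=True)
    commit_embs = model.encode(commits, convert_to_tensor=True)
    scores = util.cos_sim(query_emb, commit_embs)[0]
    print(name, [round(float(s), 3) for s in scores])
```

A proper comparison would run this over a labeled set of query/commit pairs and also time the encode calls, but even this toy loop shows that swapping models is a one-line change.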
We could also experiment with other tokenizers.
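
One caveat: a pretrained model's tokenizer is trained together with its weights, so tokenizers can't simply be swapped under an existing checkpoint; in practice this means comparing base checkpoints that use different tokenizer families. A small sketch of how to inspect the differences, using two public Hugging Face checkpoints (the input text is illustrative):

```python
from transformers import AutoTokenizer

# Compare how different tokenizer families split the same commit message.
text = "refactor: extract cosine-similarity helper"

for checkpoint in ["bert-base-uncased", "roberta-base"]:
    tok = AutoTokenizer.from_pretrained(checkpoint)
    # WordPiece (BERT) and byte-level BPE (RoBERTa) produce different splits.
    print(checkpoint, tok.tokenize(text))
```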
Further reading:
- https://www.sbert.net/docs/pretrained_models.html
- https://huggingface.co/sentence-transformers
- https://huggingface.co/transformers/tokenizer_summary.html