Shitao Xiao
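It can produce vectors, but the reranker's embeddings have not been trained, so using them directly for retrieval is not recommended.
You can use Microsoft's intfloat/multilingual-e5-base.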
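As a rough sketch of how that model could be used for retrieval: the E5 model card asks for inputs to be prefixed with `query: ` or `passage: `. The helper names `format_e5_inputs` and `embed` below are hypothetical, and loading the model through the `sentence-transformers` package is an assumption about your setup, not something prescribed in this thread.

```python
def format_e5_inputs(queries, passages):
    """Add the 'query: ' / 'passage: ' prefixes that E5 models expect."""
    prefixed_queries = [f"query: {q}" for q in queries]
    prefixed_passages = [f"passage: {p}" for p in passages]
    return prefixed_queries, prefixed_passages


def embed(texts, model_name="intfloat/multilingual-e5-base"):
    """Encode texts into L2-normalized vectors (hypothetical helper).

    Assumes the sentence-transformers package is installed; the import is
    kept inside the function so the prefix helper works without it.
    """
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer(model_name)
    return model.encode(texts, normalize_embeddings=True)
```

With normalized vectors, the dot product of a query embedding and a passage embedding can serve as the similarity score, e.g. `embed(q) @ embed(p).T` after calling `format_e5_inputs`.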
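> How can I get the vectors produced by the reranker and compute the score externally?

The reranker cannot be used to produce vectors; the vectors it generates have no practical meaning.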
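If the machine has no GPU, remove the `--use_gpu_for_searching` argument. faiss-cpu and faiss-gpu cannot both be installed on the same machine. Uninstall both, then reinstall the matching package (faiss-gpu if you have a GPU, faiss-cpu otherwise).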
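To check which faiss package an environment actually has, something like the following sketch may help. It only reads installed package metadata from the standard library; the function names and the messages are illustrative, not part of FlagEmbedding or faiss.

```python
from importlib import metadata

FAISS_DISTRIBUTIONS = ("faiss-cpu", "faiss-gpu")


def installed_faiss_packages():
    """Return the faiss distributions installed in the current environment."""
    found = []
    for name in FAISS_DISTRIBUTIONS:
        try:
            metadata.version(name)
        except metadata.PackageNotFoundError:
            continue
        found.append(name)
    return found


def diagnose(found, has_gpu):
    """Suggest an action given the installed packages and GPU availability."""
    wanted = "faiss-gpu" if has_gpu else "faiss-cpu"
    if len(found) > 1:
        # Both packages present: they conflict, so start from a clean state.
        return f"conflict: uninstall both packages, then install {wanted}"
    if found == [wanted]:
        return f"ok: {wanted} is installed"
    if found:
        # Exactly one package is installed, but it is the wrong one.
        return f"uninstall {found[0]}, then install {wanted}"
    return f"install {wanted}"
```

For example, `print(diagnose(installed_faiss_packages(), has_gpu=False))` reports what to do on a CPU-only machine.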
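The output is less readable, but this does not affect data loading during training. The code has been changed to use `ensure_ascii=False`; you can clone the latest code.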
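A small illustration of why the escaped output is harmless: `json.dumps` escapes non-ASCII characters by default, and `json.loads` restores them, so both forms decode to the same data. The record below is made up for the example.

```python
import json

# Illustrative record; the field names and values are only for demonstration.
record = {"query": "你好", "pos": ["世界"], "neg": ["其他"]}

# Default behaviour: non-ASCII characters are written as \uXXXX escapes.
escaped = json.dumps(record)

# With ensure_ascii=False the characters are written as-is (more readable).
readable = json.dumps(record, ensure_ascii=False)

# Both strings decode back to exactly the same Python object.
assert json.loads(escaped) == json.loads(readable) == record

print(escaped)
print(readable)
```

When writing the readable form to a file, open it with `encoding="utf-8"` so the characters are stored correctly on every platform.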
You can refer to some solutions from the open-source community: https://github.com/puppetm4st3r/baai_m3_simple_server https://github.com/huggingface/text-embeddings-inference
The multilingual model is in progress, but we cannot confirm the release date yet. Also, which language do you need? We can consider adding it in the future.
Thanks for your interest! We will constantly improve this project.
> @staoxiao Would you support Japanese? Is there an expected release date?

Yes. If nothing unexpected happens, it will be released in about a month.
I apologize for the late release. We have released a new model, [BGE-M3](https://github.com/FlagOpen/FlagEmbedding/tree/master/FlagEmbedding/BGE_M3), which supports multiple languages, long texts, and multiple retrieval modes. Feel free to use it and provide feedback.