FlagEmbedding

Multilingual Models

Open Siegi96 opened this issue 1 year ago • 11 comments

Do you plan to train and release multilingual embedding models in the near future?

Siegi96 avatar Aug 07 '23 09:08 Siegi96

The multilingual model is in progress, but we cannot confirm a release date yet. Which languages do you need? We can consider adding them in the future.

staoxiao avatar Aug 07 '23 11:08 staoxiao

Thanks for the fast answer; good to hear that you are working on it. For me personally, it's English, Spanish, German, and French.

Keep up the awesome work; your models are totally impressive.

Siegi96 avatar Aug 07 '23 11:08 Siegi96

Thanks for your interest! We will constantly improve this project.

staoxiao avatar Aug 07 '23 11:08 staoxiao

> The multilingual model is in progress, but we cannot confirm a release date yet. Which languages do you need? We can consider adding them in the future.

Thank you for your ongoing efforts in expanding the multilingual capabilities. Adding Arabic to your list of languages would not only serve a significant user base but would also greatly assist individuals like myself who frequently interact in the language. Your consideration of this request would be deeply appreciated.

nhaouari avatar Aug 07 '23 22:08 nhaouari

"code" would be a useful language to add, especially common languages like python and javascript.

The GTE project claims this ability: https://huggingface.co/thenlper/gte-large

freckletonj avatar Aug 12 '23 01:08 freckletonj

Please add the Lithuanian language.

sinia avatar Aug 13 '23 06:08 sinia

> Please add the Lithuanian language.

It would also make sense to add Latvian alongside Lithuanian. The two languages are closely related, so this should improve the model's performance on both.

sinia avatar Aug 16 '23 10:08 sinia

@staoxiao Would you support Japanese? Is there an expected release date?

jingedawang avatar Sep 11 '23 08:09 jingedawang

> @staoxiao Would you support Japanese? Is there an expected release date?

Yes. If nothing unexpected happens, it will be released in about a month.

staoxiao avatar Sep 11 '23 13:09 staoxiao

I apologize for the late release. We have released a new model, BGE-M3, which supports multiple languages, long text, and multiple retrieval modes. Feel free to use it and provide feedback.

staoxiao avatar Jan 31 '24 02:01 staoxiao

@staoxiao Which languages does BGE-M3 support? Is there a list somewhere? Thanks.

x4080 avatar Feb 04 '24 21:02 x4080