
Enhancement: /help shouldn't use an English-only model, as the user's question can be in another language

Open thiswillbeyourgithub opened this issue 1 year ago • 0 comments

Hi,

Low priority, but I noticed that a command like `/help how to change models` triggers an embedding search using the model BAAI/bge-small-en-v1.5. This model is English-only, which seems fine given that the documentation is in English. However, some users are accustomed to talking to their LLM in other languages, so the embeddings should ideally be multilingual.

Here's the relevant line: https://github.com/paul-gauthier/aider/blob/b8ce472cb6181ebe52892823a5b25ba6ad3a18e9/aider/help.py#L111

The top multilingual embedding model seems to be BGE-M3.

Edit: having said all that, I notice that the model you're currently using is 130 MB while BGE-M3 is 2300 MB, so I guess that's the reason. Anyway, I'm leaving this issue up to let you know about the situation. I'm sure a smaller version will come along at some point. Feel free to close this :)

Edit2: actually, Jina released another open-weights model that's multilingual and about 1 GB, so maybe a good middle ground? Here's the link
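
To make the size trade-off concrete, here is a rough sketch of how the choice could be expressed. It is not aider's code: `EmbeddingModel`, `CANDIDATES`, and `pick_model` are hypothetical names, the sizes are the approximate figures quoted above, and the Jina entry uses a placeholder name because the exact model is only given in the link.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class EmbeddingModel:
    name: str
    size_mb: int        # approximate download size, as quoted in this issue
    multilingual: bool


# Hypothetical candidate list built from the figures mentioned in this issue.
CANDIDATES = [
    EmbeddingModel("BAAI/bge-small-en-v1.5", 130, False),
    EmbeddingModel("BGE-M3", 2300, True),
    # Placeholder name: the exact Jina model is not named in the text.
    EmbeddingModel("jina-multilingual-placeholder", 1000, True),
]


def pick_model(need_multilingual: bool, max_size_mb: int) -> Optional[EmbeddingModel]:
    """Return the smallest candidate that fits the size budget, or None."""
    eligible = [
        m
        for m in CANDIDATES
        if m.size_mb <= max_size_mb and (m.multilingual or not need_multilingual)
    ]
    return min(eligible, key=lambda m: m.size_mb, default=None)


if __name__ == "__main__":
    # With a 1.5 GB budget, only the mid-sized multilingual model qualifies.
    print(pick_model(need_multilingual=True, max_size_mb=1500))
    # With a 500 MB budget, no multilingual candidate fits.
    print(pick_model(need_multilingual=True, max_size_mb=500))
```

With these numbers, a multilingual requirement rules out the current 130 MB model, so the download budget alone decides between the mid-sized model and BGE-M3.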

thiswillbeyourgithub, Sep 10 '24 13:09