Support LM Studio /v1/embeddings endpoint
**Is your feature request related to a problem? Please describe.**
I tried to set up the new mxbai-embed-large model using the new LM Studio 0.2.19 beta, which supports /v1/embeddings, but was not able to get it to work. Even the embedding model that already works with Ollama (nomic-embed-text v1.5) was very slow and quite buggy when used as the LM Studio embedding model.
**Describe the solution you'd like**
Ideally, a workflow like the one below would work.
I want to try mxbai-embed-large using LM Studio v0.2.19, since it added embedding model support.
It's a bit of a manual process right now, but works:
1. Download LM Studio v0.2.19 for embedding model support: https://lmstudio.ai/beta-releases.html
2. Download the model GGUF from https://huggingface.co/mixedbread-ai/mxbai-embed-large-v1/tree/main/gguf
3. Put the GGUF in the LM Studio model folder with the correct folder structure
4. Reload LM Studio
5. You can now select the mxbai model as the embedding model
6. In Obsidian, on the Copilot settings page, use the embedding proxy functionality (see the request sketch after this list):
   1. Set the URL to http://localhost:1234/v1
   2. Set the model to "mixedbread-ai/mxbai-embed-large-v1" and select "text-embedding-ada-002" in the embedding model selection

🎉 You now have local embeddings with LM Studio using mxbai-embed-large!
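For reference, here's roughly what that proxy setup sends over the wire. This is a minimal sketch, assuming LM Studio is serving on localhost:1234 with the mxbai GGUF loaded; the `embedText` helper name is just for illustration:

```ts
// Minimal sketch of the OpenAI-format request the proxy setup above produces.
// Assumes LM Studio v0.2.19+ is serving on localhost:1234 with the
// mxbai-embed-large GGUF loaded as the embedding model.
async function embedText(text: string): Promise<number[]> {
  const res = await fetch("http://localhost:1234/v1/embeddings", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      // The OpenAI-format client sends this name; LM Studio answers with
      // whichever embedding model is actually loaded.
      model: "text-embedding-ada-002",
      input: text,
    }),
  });
  if (!res.ok) throw new Error(`Embedding request failed: ${res.status}`);
  const data = await res.json();
  return data.data[0].embedding; // OpenAI-format response shape
}
```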
**Describe alternatives you've considered**
Using Ollama, since it added mxbai-embed-large recently. But that is not supported in Copilot yet either, as mentioned in #401.
Thank you for sharing, I believe this would be a huge improvement for editor plugins! I can help review and test if someone has the knowledge to take this on :)
+1, would be awesome
please
Supporting only nomic-embed-text limits the usefulness of Ollama here; Ollama now supports more embedding models: mxbai-embed-large, snowflake-arctic-embed, dmeta-embedding-zh, etc.
See https://ollama.com/blog/embedding-models
BTW, a future Ollama release will also support the OpenAI-compatible /v1/embeddings endpoint.
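For comparison, Ollama's current native embedding endpoint from that blog post looks like this. A minimal sketch, assuming Ollama is running on its default port with mxbai-embed-large pulled:

```ts
// Sketch of Ollama's native embedding API (not the OpenAI-compatible one yet).
// Assumes `ollama pull mxbai-embed-large` has been run and Ollama is
// listening on its default port 11434.
const res = await fetch("http://localhost:11434/api/embeddings", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({
    model: "mxbai-embed-large",
    prompt: "Llamas are members of the camelid family",
  }),
});
const { embedding } = await res.json(); // number[]
```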
Does LM Studio support spinning up two local servers at the same time? One chat, one embedding?
@logancyang It's not two local servers strictly speaking, but yes, the /completions and /embeddings endpoints can be used at the same time by loading one model for each purpose. Please see the LM Studio docs for details.
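To illustrate, something like this should work against a single LM Studio server once a chat model and an embedding model are both loaded. A sketch only; the model names are placeholders for whatever is actually loaded:

```ts
// Sketch: chat and embeddings from the same LM Studio server, one loaded
// model for each purpose. Model names below are placeholders.
const BASE = "http://localhost:1234/v1";

const chatRes = await fetch(`${BASE}/chat/completions`, {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({
    model: "local-chat-model",
    messages: [{ role: "user", content: "Summarize my note." }],
  }),
});
const { choices } = await chatRes.json(); // chat completion

const embRes = await fetch(`${BASE}/embeddings`, {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({
    model: "mixedbread-ai/mxbai-embed-large-v1",
    input: "Summarize my note.",
  }),
});
const { data } = await embRes.json(); // data[0].embedding is the vector
```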
@mhafellner That's interesting, I didn't know that! But at a glance it seems it won't work. The real question is: can it load two models at the same time? There's no single model that does both /completions and /embeddings well enough for RAG.
It does indeed load two separate models.
Does anybody know what the context length is here? Is it 512 or 2048? I'm confused.
I'm trying to use a local embedding model (nomic-ai/nomic-embed-text-v1.5-GGUF) from LM Studio in v2.6.1
Here is the LM Studio setup:
These are the options for the custom Chat model, with LM Studio listed (the custom Chat model worked!)
These are the options for the custom Embeddings model, without LM Studio listed (the custom Embeddings model did not work).
I tried using the "openai-format" provider with the embeddings URL from LM Studio.
I received this error when reloading:
@shwanton I noticed you have "serve on local network" on; in that case you must use the 3rd-party openai-format provider instead of lm-studio. You must also provide the correct base URL: it is not localhost when you serve on the local network; you can find the correct IP address and port in LM Studio. Also, don't include /embeddings in the custom URL; it should end at /v1 with nothing after it.
The next version will skip the API key for the custom embedding model if it's empty; in the meantime, just put any string in the API key field.
Also, some models in LM Studio may not report a proper context length, which can mess up the QA. This is the price of "customizability": you need to know what you are doing when swapping out parts.
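Putting that advice together, the working configuration looks roughly like this. The field names are illustrative, not Copilot's actual config schema, and the LAN IP is an example:

```ts
// Hypothetical shape of the working settings per the advice above;
// field names are illustrative, not Copilot's real config schema.
const embeddingSettings = {
  provider: "openai-format",              // not "lm-studio" when serving on LAN
  baseUrl: "http://192.168.1.50:1234/v1", // example LAN IP; stop at /v1, no /embeddings
  model: "nomic-ai/nomic-embed-text-v1.5-GGUF",
  apiKey: "placeholder",                  // any non-empty string for now
};
```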
Thank you, this worked!
I turned off "serve on local network" and used the local address
I re-added the embeddings model and used a single space for the API key
> Also, some models in LM Studio may not report a proper context length, which can mess up the QA.
It looks like LM Studio added the ability to change the Evaluation Batch size:
I'm assuming this should match the embedding model batch size?
@shwanton Glad it worked! Not sure about Evaluation Batch Size; is it embedding-model specific? Let me know what you find out, and I'll take a look soon as well.