ai-group-tabs
[Feature request]: Support for Ollama self-hosted LLMs
After a few tests, I found that an open-source model like Mistral 7B can also do this job well. By supporting these self-hosted models, users wouldn't need to worry about networking issues when reaching OpenAI, or about the potential costs of using its models.
I used LM Studio to start a local server successfully; you can try using http://localhost:11434 as the API URL in the extension options.
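For context, if the local server exposes an OpenAI-compatible /v1/chat/completions endpoint (LM Studio does), the request the extension would need to make looks roughly like the sketch below. The base URL, model name and prompt are placeholders for illustration, not tested values.

async function classifyTab(tabUrl: string): Promise<string> {
  // Hypothetical sketch of an OpenAI-compatible request to a local server.
  // Base URL, model name and category list are assumptions for illustration.
  const res = await fetch("http://localhost:11434/v1/chat/completions", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      model: "local-model", // whichever model the local server has loaded
      messages: [
        {
          role: "system",
          content:
            "You are a url classifier. Classify the tab as one of: Development, Utilities, Entertainment. Respond with a single word.",
        },
        { role: "user", content: tabUrl },
      ],
    }),
  });
  const data = await res.json();
  return data.choices[0].message.content.trim();
}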
Just fill the API field with the local server address and leave the model name unchanged? It looks like OpenAI and Ollama use different API paths. I don't think this will work, but I'll give it a try.
It doesn't work. Maybe that's because the LM Studio server you used exposes the same API path as OpenAI?
It has something to do with the prompt format and OpenAI API compatibility; LM Studio handles both. I don't have experience with Ollama, so you might need to figure it out.
Yep, Ollama uses a different API path (you can check it in its docs: https://github.com/jmorganca/ollama/blob/main/docs/api.md#generate-a-completion), or see the screenshot in the description of this issue. According to the docs, using Mistral's specific prompt template also gives more compelling results. That said, I understand all of this is achievable. If support for Ollama is acknowledged at the product level, I can work out the implementation details, and I can dedicate some of my spare time to the implementation itself.
Example using Mistral with the recommended prompt template:
curl -s http://localhost:11434/api/generate -d '{
"model": "mistral",
"stream": false,
"prompt":"<s>[INST] You are a url classifier, you based on the given url to classify the browser tab type as one of the following: Development, Utilities, Entertainment. Respond with only one single word (without any explaination or punctuation) from the given list. So for instance the following: https://github.com/skyf0cker/ai-group-tags will belong to: [/INST]Development</s>[INST]https://reddit.com[/INST]"
}' | jq '.response'
response: "Entertainment"
I definitely think supporting a local LLM is the ideal choice. The task is well-suited for a small local LLM. Any contributions are welcome! 👍
Right now you can use https://nitro.jan.ai/; it supports an OpenAI-compatible endpoint.
I think the task doesn't even need a local LLM; it can be done with traditional embeddings. Just run the embedding and classification inside the browser with JavaScript. It's faster and protects users' privacy.
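A rough sketch of what that could look like with transformers.js (the model choice and the category seed texts below are just assumptions for illustration):

import { pipeline } from "@xenova/transformers";

// Sketch: embed the tab's url/title and pick the closest category prototype
// by cosine similarity. Everything runs locally in the browser.
const categorySeeds: Record<string, string> = {
  Development: "software development, programming, code repositories",
  Utilities: "tools, settings, productivity utilities",
  Entertainment: "videos, games, social media, streaming",
};

// Load the embedding model once and reuse it.
const embedderPromise = pipeline("feature-extraction", "Xenova/all-MiniLM-L6-v2");

function dot(a: Float32Array, b: Float32Array): number {
  let sum = 0;
  for (let i = 0; i < a.length; i++) sum += a[i] * b[i];
  return sum; // embeddings are normalized below, so dot product == cosine similarity
}

async function embed(text: string): Promise<Float32Array> {
  const embedder = await embedderPromise;
  const output = await embedder(text, { pooling: "mean", normalize: true });
  return output.data as Float32Array;
}

async function classifyByEmbedding(tabText: string): Promise<string> {
  const tabVec = await embed(tabText);
  let best = "Utilities";
  let bestScore = -Infinity;
  for (const [name, seed] of Object.entries(categorySeeds)) {
    const score = dot(tabVec, await embed(seed));
    if (score > bestScore) {
      bestScore = score;
      best = name;
    }
  }
  return best;
}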
How about adding keywords for classification and processing them just like Filter Rules?
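Something like the sketch below, maybe: a keyword pass that runs before any model call. The rule shape here is invented for illustration, not the extension's actual Filter Rules format.

interface KeywordRule {
  group: string;
  keywords: string[];
}

// Hypothetical keyword rules; users could edit these in the options page.
const keywordRules: KeywordRule[] = [
  { group: "Development", keywords: ["github.com", "stackoverflow.com", "localhost"] },
  { group: "Entertainment", keywords: ["youtube.com", "reddit.com", "netflix.com"] },
];

function classifyByKeyword(tabUrl: string): string | null {
  const lower = tabUrl.toLowerCase();
  for (const rule of keywordRules) {
    if (rule.keywords.some((k) => lower.includes(k))) return rule.group;
  }
  return null; // no rule matched; fall back to the model-based classifier
}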
I believe Candle is an excellent choice, and I recommend considering support for it. Candle primarily focuses on serverless inference and can run models in the browser via WASM.
#77 is also talking about local-first computation support.
Yes, using an LLM is a bit of overkill; our task is relatively simple. #77 discusses the big-picture trade-offs.
I think the best solution is training or using a small model that runs in the browser.
I hold the same opinion; I tried the Microsoft Phi-2 model (very small, around 2.7 GB), but unfortunately it did not perform well on this classification task.