ai-group-tabs
[Feature request]: Support for Ollama self-hosted LLMs
After a few tests, I found that an open-source model like Mistral 7B can also do this job well. By supporting these self-hosted models, users wouldn't need to worry about networking issues when reaching OpenAI, or about the potential costs of using its models.
I used LM Studio to start a local server successfully; you can try using http://localhost:11434 as the API URL in the extension options.
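For context, if the local server exposes an OpenAI-compatible /v1/chat/completions endpoint (LM Studio does), the request the extension would need to make looks roughly like the sketch below. The base URL, model name and prompt are placeholders for illustration, not tested values.

async function classifyTab(tabUrl: string): Promise<string> {
  // Hypothetical sketch of an OpenAI-compatible request to a local server.
  // Base URL, model name and category list are assumptions for illustration.
  const res = await fetch("http://localhost:11434/v1/chat/completions", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      model: "local-model", // whichever model the local server has loaded
      messages: [
        {
          role: "system",
          content:
            "You are a url classifier. Classify the tab as one of: Development, Utilities, Entertainment. Respond with a single word.",
        },
        { role: "user", content: tabUrl },
      ],
    }),
  });
  const data = await res.json();
  return data.choices[0].message.content.trim();
}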
Just fill the API field with the local server address and leave the model name unchanged? It looks like OpenAI and Ollama use different API paths. I don't think this will work, but I'll give it a try.
It doesn't work. Maybe that's because the LM Studio server you used exposes the same API path as OpenAI?
It has something to do with the prompt format and OpenAI API compatibility; LM Studio handles both. I don't have experience with Ollama, so you might need to figure it out.
Yep, Ollama uses a different API path (you can check it in its docs: https://github.com/jmorganca/ollama/blob/main/docs/api.md#generate-a-completion), or see the screenshot in the description of this issue. According to the docs, using Mistral's specific prompt template also gives more compelling results. That said, I understand all of this is achievable. If support for Ollama is acknowledged at the product level, I can work out the implementation details, and I can dedicate some of my spare time to the implementation itself.
Example using Mistral with the recommended prompt template:
curl -s http://localhost:11434/api/generate -d '{
"model": "mistral",
"stream": false,
"prompt":"<s>[INST] You are a url classifier, you based on the given url to classify the browser tab type as one of the following: Development, Utilities, Entertainment. Respond with only one single word (without any explaination or punctuation) from the given list. So for instance the following: https://github.com/skyf0cker/ai-group-tags will belong to: [/INST]Development</s>[INST]https://reddit.com[/INST]"
}' | jq '.response'
response: "Entertainment"
I definitely think supporting a local LLM is the ideal choice. The task is well-suited for a small local LLM. Any contributions are welcome! 👍
Right now you can use https://nitro.jan.ai/; it supports an OpenAI-compatible endpoint.
I think the task doesn't even need a local LLM; it can be done with traditional embeddings. Just run the embedding and classification inside the browser with JavaScript. It's faster and protects users' privacy.
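A rough sketch of what that could look like with transformers.js (the model choice and the category seed texts below are just assumptions for illustration):

import { pipeline } from "@xenova/transformers";

// Sketch: embed the tab's url/title and pick the closest category prototype
// by cosine similarity. Everything runs locally in the browser.
const categorySeeds: Record<string, string> = {
  Development: "software development, programming, code repositories",
  Utilities: "tools, settings, productivity utilities",
  Entertainment: "videos, games, social media, streaming",
};

// Load the embedding model once and reuse it.
const embedderPromise = pipeline("feature-extraction", "Xenova/all-MiniLM-L6-v2");

function dot(a: Float32Array, b: Float32Array): number {
  let sum = 0;
  for (let i = 0; i < a.length; i++) sum += a[i] * b[i];
  return sum; // embeddings are normalized below, so dot product == cosine similarity
}

async function embed(text: string): Promise<Float32Array> {
  const embedder = await embedderPromise;
  const output = await embedder(text, { pooling: "mean", normalize: true });
  return output.data as Float32Array;
}

async function classifyByEmbedding(tabText: string): Promise<string> {
  const tabVec = await embed(tabText);
  let best = "Utilities";
  let bestScore = -Infinity;
  for (const [name, seed] of Object.entries(categorySeeds)) {
    const score = dot(tabVec, await embed(seed));
    if (score > bestScore) {
      bestScore = score;
      best = name;
    }
  }
  return best;
}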
How about adding keywords for classification and processing them just like Filter Rules?
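Something like the sketch below, maybe: a keyword pass that runs before any model call. The rule shape here is invented for illustration, not the extension's actual Filter Rules format.

interface KeywordRule {
  group: string;
  keywords: string[];
}

// Hypothetical keyword rules; users could edit these in the options page.
const keywordRules: KeywordRule[] = [
  { group: "Development", keywords: ["github.com", "stackoverflow.com", "localhost"] },
  { group: "Entertainment", keywords: ["youtube.com", "reddit.com", "netflix.com"] },
];

function classifyByKeyword(tabUrl: string): string | null {
  const lower = tabUrl.toLowerCase();
  for (const rule of keywordRules) {
    if (rule.keywords.some((k) => lower.includes(k))) return rule.group;
  }
  return null; // no rule matched; fall back to the model-based classifier
}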
I believe Candle is an excellent choice, and I recommend considering support for it. Candle primarily focuses on serverless inference and can run models in the browser via WASM.
#77 is also talking about local-first computation support.
Yes, using an LLM is a bit of overkill; our task is relatively simple. #77 discusses the big-picture trade-offs.
I think the best solution is training or using a small model that runs in the browser.
I hold the same opinion; I tried the Microsoft Phi-2 model (very small, around 2.7 GB), but unfortunately it did not perform well on this classification task.