lobe-chat
[Request] LLM adaptation
A standalone LLM settings tab, offering a rich choice of large models for different assistant roles. This would solve issues like the following:
- https://github.com/Yidadaa/ChatGPT-Next-Web/issues/2506
- https://github.com/Yidadaa/ChatGPT-Next-Web/issues/371
Checklist:
I would like to be able to call OpenAI fine-tuned models.
I would like to be able to call OpenAI fine-tuned models.
Yes, via a custom model name.
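For context, OpenAI fine-tuned models go through the same chat completions endpoint as stock models, just under the `ft:` name the API assigns, so any client that accepts a free-form model string can call them. A minimal sketch with the official Node SDK; the model id below is a made-up placeholder, not a real deployment:

```ts
import OpenAI from "openai";

const client = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });

async function main() {
  const res = await client.chat.completions.create({
    // Fine-tuned models use the "ft:" prefix plus the id OpenAI assigns;
    // this one is a placeholder.
    model: "ft:gpt-3.5-turbo-1106:my-org::abc12345",
    messages: [{ role: "user", content: "ping" }],
  });
  console.log(res.choices[0].message.content);
}

main();
```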
Is there a clear support plan for local LLMs?
Google's Gemini Pro API is also available now. Please see the official tweet.
Is there a clear support plan for local LLMs?
We plan to start on this in January.
Any plans to add support for Mistral?
Yes.
Recommended: https://github.com/xorbitsai/inference
I used the 2023-12-01-preview version of the API to call Azure's gpt-4-vision model. I can successfully get a reply, but it is truncated after just one sentence.
Console output
I used the 2023-12-01-preview version of the API to call Azure's gpt-4-vision model. I can successfully get a reply, but it is truncated after just one sentence.
Console output
This is likely caused by the built-in max_tokens default of gpt-4-vision being too small.
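For anyone who wants to verify this outside the app: per the comment above, the vision model falls back to a very small max_tokens when the parameter is omitted, so passing it explicitly avoids the truncation. A rough sketch against Azure with the OpenAI Node SDK; the resource and deployment names are placeholders:

```ts
import OpenAI from "openai";

// Placeholder resource/deployment names; adjust for your Azure setup.
const client = new OpenAI({
  apiKey: process.env.AZURE_API_KEY,
  baseURL:
    "https://my-resource.openai.azure.com/openai/deployments/gpt-4-vision",
  defaultQuery: { "api-version": "2023-12-01-preview" },
  defaultHeaders: { "api-key": process.env.AZURE_API_KEY },
});

async function main() {
  const res = await client.chat.completions.create({
    model: "gpt-4-vision",
    // Without an explicit max_tokens, replies from the vision model
    // are cut off after a handful of tokens.
    max_tokens: 2048,
    messages: [
      {
        role: "user",
        content: [
          { type: "text", text: "Describe this image." },
          { type: "image_url", image_url: { url: "https://example.com/a.png" } },
        ],
      },
    ],
  });
  console.log(res.choices[0].message.content);
}

main();
```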
Are there any plans for Gemini Pro?
Are there any plans to adapt to ollama?
Are there any plans for Gemini Pro? Thank you
A standalone LLM settings tab, offering a rich choice of large models for different assistant roles (a rough per-assistant config sketch follows the checklist below).
Checklist:
[x] Azure OpenAI [Request] Support Azure OpenAI #131
[ ] Replicate LLM https://replicate.com/
[ ] Local LLM - LocalAI
- [ ] LLaMA
- [ ] Vicuna
- [ ] ChatGLM 6B
- [ ] Qwen-7B
- [ ] ToolLLaMA
- [ ] Mistral
The full RFC is at #737.
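To make the idea concrete, here is a minimal sketch of what a per-assistant model setting behind such a tab could look like. This is purely illustrative: the provider ids, field names, and defaults are assumptions, not lobe-chat's real schema.

```ts
// Hypothetical shape for a per-assistant LLM setting; every name here
// is an assumption for illustration, not lobe-chat's actual types.
type Provider = "openai" | "azure" | "replicate" | "localai" | "ollama";

interface AgentLLMConfig {
  provider: Provider;
  model: string; // e.g. "gpt-3.5-turbo-16k", "gemini-pro", or a local model
  params?: {
    temperature?: number;
    max_tokens?: number;
  };
}

// Each assistant role would carry its own config instead of one global setting.
const translator: AgentLLMConfig = {
  provider: "openai",
  model: "gpt-3.5-turbo-16k",
  params: { temperature: 0.2 },
};
```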
Hi team! There is a very simple way of running models locally with ollama (https://github.com/ollama/ollama). Wrap it with https://github.com/BerriAI/litellm and use https://github.com/BerriAI/liteLLM-proxy, and it shouldn't be too difficult to make it work with a lot of open-source LLMs. Best regards
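For anyone trying this route: the LiteLLM proxy speaks the OpenAI wire protocol, so the stock OpenAI SDK can reach a local ollama model through it. A sketch assuming a proxy on its default local port; the port, auth, and model name below may differ on your setup:

```ts
import OpenAI from "openai";

// The LiteLLM proxy exposes an OpenAI-compatible endpoint; the port and
// model name here are assumptions for a local dev setup.
const client = new OpenAI({
  apiKey: "not-needed-locally", // local proxies are often run without auth
  baseURL: "http://localhost:4000",
});

async function main() {
  const res = await client.chat.completions.create({
    model: "ollama/mistral", // LiteLLM routes this to the local ollama server
    messages: [{ role: "user", content: "Hello from a local model!" }],
  });
  console.log(res.choices[0].message.content);
}

main();
```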
Are there any plans to adapt to ollama?
ollama wrapped in litellm should be a piece of cake; I am going to try it.
@cocobeach Thanks for your introduction! We will add ollama support as well~
Are there any plans to add a key pool, with call weights to dynamically select among the keys?
Are there any plans to add a key pool, with call weights to dynamically select among the keys?
No, there is no plan for key pooling.
Would it be possible to directly integrate the chat model support from langchain?
@zhangheli Tried it, but it didn't work: https://github.com/lobehub/lobe-chat/discussions/737#discussioncomment-8336499
@cocobeach Thanks for your introduction! We will add ollama support as well~
That's great; it would allow Mac users to use the interface with local models easily.
Also, for the best inference (though that's NVIDIA-only, so mostly Windows workstations), vLLM is quite unique in the way it manages GPU memory. For instance, it's the only setup that lets me actually take advantage of my dual RTX A4000s, and I can run Mixtral on it quite smoothly with excellent results. Now that GPT-4 got "lazy", I think there is a good opening in the market for open-source models, if they are managed properly.
I really like your platform, but I don't understand why the total number of tokens allowed for a whole chat thread is limited to the model's context window. That is, if I am using gpt-3.5-turbo-16k, I can chat up to a 16k thread size, and then it gives me an error. Can I configure that differently? Best regards
@cocobeach You can set a limit on the current topic's context; we don't set a limit by default.