lobe-chat
[Request] LLM adaptation
A standalone LLM settings tab, offering a rich choice of large models for different assistant roles. This would solve issues like the following:
- https://github.com/Yidadaa/ChatGPT-Next-Web/issues/2506
- https://github.com/Yidadaa/ChatGPT-Next-Web/issues/371
Checklist:
I would like to be able to call OpenAI fine-tuned models.
I would like to be able to call OpenAI fine-tuned models.
Yes, via a custom model name.
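For context, OpenAI fine-tuned models go through the same chat completions endpoint as stock models, just under the `ft:` name the API assigns, so any client that accepts a free-form model string can call them. A minimal sketch with the official Node SDK; the model id below is a made-up placeholder, not a real deployment:

```ts
import OpenAI from "openai";

const client = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });

async function main() {
  const res = await client.chat.completions.create({
    // Fine-tuned models use the "ft:" prefix plus the id OpenAI assigns;
    // this one is a placeholder.
    model: "ft:gpt-3.5-turbo-1106:my-org::abc12345",
    messages: [{ role: "user", content: "ping" }],
  });
  console.log(res.choices[0].message.content);
}

main();
```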
Is there a clear support plan for local LLMs?
Google's Gemini Pro API is also available now. Please see the official tweet.
Is there a clear support plan for local LLMs?
We plan to start on this in January.
Any plans to add support for Mistral?
Yes.
Recommended: https://github.com/xorbitsai/inference
I used the 2023-12-01-preview version of the API to call Azure's gpt-4-vision model. I can successfully get a reply, but it is truncated after just one sentence.
Console output
I used the 2023-12-01-preview version of the API to call Azure's gpt-4-vision model. I can successfully get a reply, but it is truncated after just one sentence.
Console output
This is likely caused by the built-in max_tokens default of gpt-4-vision being too small.
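For anyone who wants to verify this outside the app: per the comment above, the vision model falls back to a very small max_tokens when the parameter is omitted, so passing it explicitly avoids the truncation. A rough sketch against Azure with the OpenAI Node SDK; the resource and deployment names are placeholders:

```ts
import OpenAI from "openai";

// Placeholder resource/deployment names; adjust for your Azure setup.
const client = new OpenAI({
  apiKey: process.env.AZURE_API_KEY,
  baseURL:
    "https://my-resource.openai.azure.com/openai/deployments/gpt-4-vision",
  defaultQuery: { "api-version": "2023-12-01-preview" },
  defaultHeaders: { "api-key": process.env.AZURE_API_KEY },
});

async function main() {
  const res = await client.chat.completions.create({
    model: "gpt-4-vision",
    // Without an explicit max_tokens, replies from the vision model
    // are cut off after a handful of tokens.
    max_tokens: 2048,
    messages: [
      {
        role: "user",
        content: [
          { type: "text", text: "Describe this image." },
          { type: "image_url", image_url: { url: "https://example.com/a.png" } },
        ],
      },
    ],
  });
  console.log(res.choices[0].message.content);
}

main();
```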
Are there any plans for Gemini Pro?
Are there any plans to adapt to ollama?
Are there any plans for Gemini Pro? Thank you
A standalone LLM settings tab, offering a rich choice of large models for different assistant roles (a rough per-assistant config sketch follows the checklist below).
Checklist:
[x] Azure OpenAI [Request] Support Azure OpenAI #131
[ ] Replicate LLM https://replicate.com/
[ ] Local LLM - LocalAI
- [ ] LLaMA
- [ ] Vicuna
- [ ] ChatGLM 6B
- [ ] Qwen-7B
- [ ] ToolLLaMA
- [ ] Mistral
The full RFC is at #737.
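To make the idea concrete, here is a minimal sketch of what a per-assistant model setting behind such a tab could look like. This is purely illustrative: the provider ids, field names, and defaults are assumptions, not lobe-chat's real schema.

```ts
// Hypothetical shape for a per-assistant LLM setting; every name here
// is an assumption for illustration, not lobe-chat's actual types.
type Provider = "openai" | "azure" | "replicate" | "localai" | "ollama";

interface AgentLLMConfig {
  provider: Provider;
  model: string; // e.g. "gpt-3.5-turbo-16k", "gemini-pro", or a local model
  params?: {
    temperature?: number;
    max_tokens?: number;
  };
}

// Each assistant role would carry its own config instead of one global setting.
const translator: AgentLLMConfig = {
  provider: "openai",
  model: "gpt-3.5-turbo-16k",
  params: { temperature: 0.2 },
};
```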
Hi team! There is a very simple way of running models locally with ollama (https://github.com/ollama/ollama). Wrap it with https://github.com/BerriAI/litellm and use https://github.com/BerriAI/liteLLM-proxy, and it shouldn't be too difficult to make it work with a lot of open-source LLMs. Best regards
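For anyone trying this route: the LiteLLM proxy speaks the OpenAI wire protocol, so the stock OpenAI SDK can reach a local ollama model through it. A sketch assuming a proxy on its default local port; the port, auth, and model name below may differ on your setup:

```ts
import OpenAI from "openai";

// The LiteLLM proxy exposes an OpenAI-compatible endpoint; the port and
// model name here are assumptions for a local dev setup.
const client = new OpenAI({
  apiKey: "not-needed-locally", // local proxies are often run without auth
  baseURL: "http://localhost:4000",
});

async function main() {
  const res = await client.chat.completions.create({
    model: "ollama/mistral", // LiteLLM routes this to the local ollama server
    messages: [{ role: "user", content: "Hello from a local model!" }],
  });
  console.log(res.choices[0].message.content);
}

main();
```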
Are there any plans to adapt to ollama?
ollama wrapped in litellm should be a piece of cake; I am going to try it.
@cocobeach Thanks for your introduction! We will add ollama support as well~
Are there any plans to add a key pool, with call weights to dynamically select among the keys?
Are there any plans to add a key pool, with call weights to dynamically select among the keys?
No, there is no plan for key pooling.
Would it be possible to directly integrate the chat model support from langchain?
@zhangheli Tried it, but it didn't work: https://github.com/lobehub/lobe-chat/discussions/737#discussioncomment-8336499
@cocobeach Thanks for your introduction! We will add ollama support as well~
That's great; it would allow Mac users to use the interface with local models easily.
Also, for the best inference (though that's NVIDIA-only, so mostly Windows workstations), vLLM is quite unique in the way it manages GPU memory. For instance, it's the only setup that lets me actually take advantage of my dual RTX A4000s, and I can run Mixtral on it quite smoothly with excellent results. Now that GPT-4 got "lazy", I think there is a good opening in the market for open-source models, if they are managed properly.
I really like your platform, but I don't understand why the total number of tokens allowed for a whole chat thread is limited to the model's context window. That is, if I am using gpt-3.5-turbo-16k, I can chat up to a 16k thread size, and then it gives me an error. Can I configure that differently? Best regards
@cocobeach You can set a limit on the current topic's context; we don't set a limit by default.