
[Request] LLM Adaptation

Open arvinxx opened this issue 1 year ago โ€ข 14 comments

A dedicated LLM settings tab, providing rich large-model options for the upper-layer assistant roles. This addresses issues like the following:

  • https://github.com/Yidadaa/ChatGPT-Next-Web/issues/2506
  • https://github.com/Yidadaa/ChatGPT-Next-Web/issues/371

ๆธ…ๅ•๏ผš

  • [x] Azure OpenAI https://github.com/lobehub/lobe-chat/issues/131
  • [ ] Cluade https://github.com/lobehub/lobe-chat/issues/83
  • [ ] Replacaite LLM https://replicate.com/
  • [ ] ๆœฌๅœฐ LLM - LocalAI

arvinxx avatar Sep 06 '23 12:09 arvinxx

Would like to add OpenAI's fine-tuned model call

ddwinhzy avatar Oct 11 '23 11:10 ddwinhzy

Would like to add OpenAI's fine-tuned model call

Yes, with custom model name

arvinxx avatar Oct 11 '23 11:10 arvinxx

Is there a clear plan to support local LLMs?

gavinliu avatar Dec 13 '23 15:12 gavinliu

Google's Gemini Pro API is also available now. Please see the official tweet.

jyboy avatar Dec 13 '23 16:12 jyboy

Is there a clear plan to support local LLMs?

We plan to start in January.

arvinxx avatar Dec 14 '23 01:12 arvinxx

Any plans to add support for Mistral?

hazelnutcloud avatar Dec 18 '23 05:12 hazelnutcloud

yes

arvinxx avatar Dec 18 '23 05:12 arvinxx

Recommended: https://github.com/xorbitsai/inference

structure-charger avatar Dec 27 '23 15:12 structure-charger

I used the 2023-12-01-preview version of the API to call Azure's gpt-4-vision model. I can successfully get a reply, but the reply is truncated right after one sentence. image Console output: image

WayneShao avatar Jan 03 '24 11:01 WayneShao

I used the 2023-12-01-preview version of the API to call Azure's gpt-4-vision model. I can successfully get a reply, but the reply is truncated right after one sentence.

image

Console output:

image

This is probably caused by gpt-4-vision's built-in max_tokens default being too small.

arvinxx avatar Jan 03 '24 11:01 arvinxx
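The truncation described above would follow from a small default max_tokens. A minimal sketch of working around it by passing max_tokens explicitly, assuming the OpenAI-style chat completions request shape; the deployment name, helper function, and budget value are illustrative assumptions, not lobe-chat code:

```python
# Hypothetical sketch: build an Azure OpenAI chat request that sets
# max_tokens explicitly, so the vision reply is not cut off by a small
# server-side default. The deployment name "gpt-4-vision" is an assumption.

def build_vision_request(prompt: str, image_url: str, max_tokens: int = 4096) -> dict:
    """Return keyword arguments for a chat.completions.create() call."""
    return {
        "model": "gpt-4-vision",   # Azure deployment name (assumption)
        "max_tokens": max_tokens,  # raise this to avoid truncated replies
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": prompt},
                    {"type": "image_url", "image_url": {"url": image_url}},
                ],
            }
        ],
    }

request = build_vision_request("Describe this image.", "https://example.com/cat.png")
print(request["max_tokens"])  # → 4096
```

With a configured Azure client, these kwargs would then be passed to the SDK's chat completion call.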

ๆ˜ฏๅฆๆœ‰Gemini Pro็š„่ฎกๅˆ’

toneywu avatar Jan 22 '24 09:01 toneywu

ๆ˜ฏๅฆๆœ‰่ฎกๅˆ’้€‚้… ollama

gavinliu avatar Jan 23 '24 16:01 gavinliu

Are there any plans for Gemini Pro? Thank you

Met-Du avatar Jan 27 '24 00:01 Met-Du

A dedicated LLM settings tab, providing rich large-model options for the upper-layer assistant roles.

Checklist:

The full RFC is at: #737

Hi Team! There is a very simple way of running models locally with ollama (https://github.com/ollama/ollama). Wrap it with https://github.com/BerriAI/litellm, and using https://github.com/BerriAI/liteLLM-proxy it shouldn't be too difficult to make it work with a lot of open-source LLMs. Best Regards

cocobeach avatar Jan 30 '24 10:01 cocobeach
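The route suggested above can be sketched briefly. litellm exposes local ollama models through an OpenAI-style interface using "ollama/<model>" identifiers plus an api_base pointing at the local server; the model name and port here are illustrative assumptions:

```python
# Hypothetical sketch of routing a local ollama model through litellm.
# litellm addresses ollama models as "ollama/<model>"; 11434 is ollama's
# default port. The model name "mistral" is just an example.

def ollama_completion_kwargs(model: str, prompt: str,
                             api_base: str = "http://localhost:11434") -> dict:
    """Build the arguments for a litellm.completion() call against local ollama."""
    return {
        "model": f"ollama/{model}",
        "api_base": api_base,
        "messages": [{"role": "user", "content": prompt}],
    }

kwargs = ollama_completion_kwargs("mistral", "Hello!")
print(kwargs["model"])  # → ollama/mistral

# With an ollama server running, you would then call:
#   import litellm
#   response = litellm.completion(**kwargs)
```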

Are there any plans to adapt to ollama?

ollama wrapped in litellm should be a piece of cake, I am going to try it.

cocobeach avatar Jan 30 '24 10:01 cocobeach

@cocobeach Thanks for your introduction๏ผ We will add ollama support also~ ๐Ÿ˜

arvinxx avatar Jan 30 '24 11:01 arvinxx

Are there any plans to add a key pool with calling weights, to dynamically dispatch calls across those keys?

j0ole avatar Feb 01 '24 07:02 j0ole

Are there any plans to add a key pool with calling weights, to dynamically dispatch calls across those keys?

No, there is no plan for key polling.

arvinxx avatar Feb 02 '24 01:02 arvinxx
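Although declined here, the weighted key-pool idea asked about above is straightforward to sketch with the standard library; the key names and weights are made-up examples:

```python
import random

# Hypothetical sketch of a weighted API-key pool: each key carries a
# calling weight, and random.choices picks keys in proportion to weight.
# Key names and weights are illustrative only.

KEY_POOL = {
    "sk-key-a": 3,  # picked ~3x as often as key-c
    "sk-key-b": 2,
    "sk-key-c": 1,
}

def pick_key(pool: dict, rng: random.Random) -> str:
    keys, weights = zip(*pool.items())
    return rng.choices(keys, weights=weights, k=1)[0]

rng = random.Random(42)  # fixed seed for reproducibility
picks = [pick_key(KEY_POOL, rng) for _ in range(1000)]
# Heavier keys are selected proportionally more often.
print(picks.count("sk-key-a") > picks.count("sk-key-c"))  # → True
```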

Is it possible to directly integrate support for the chat models in langchain?

zhangheli avatar Feb 05 '24 07:02 zhangheli

@zhangheli Tried it; it doesn't work: https://github.com/lobehub/lobe-chat/discussions/737#discussioncomment-8336499

arvinxx avatar Feb 05 '24 07:02 arvinxx

@cocobeach Thanks for your introduction๏ผ We will add ollama support also~ ๐Ÿ˜

That's great, it would allow Mac users to use the interface with local models easily.

Also, for the best inference (though that's NVIDIA-only, therefore mostly Windows/Linux workstations), vLLM is quite unique in the way it manages GPU memory. For instance, it's the only setup that lets me actually take advantage of my dual RTX A4000s, and I can run Mixtral on it quite smoothly and with excellent results. Now that GPT-4 has gotten "lazy", I think there is a good opening in the market for open-source models, if they are managed properly.

I really like your platform, but I don't understand why the total number of tokens allowed for a whole chat thread is limited to the model's context window. Meaning if I am using gpt-3.5-turbo-16k, I get to chat up to a 16k thread size, and then it gives me an error? Can I configure that differently? Best Regards

cocobeach avatar Feb 05 '24 07:02 cocobeach

@cocobeach You can set a limit on the current topic's context; we don't set a limit by default. image

arvinxx avatar Feb 05 '24 09:02 arvinxx
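Keeping a long thread within a model's context window, as discussed above, is typically done by trimming the oldest messages. A minimal sketch using a crude characters-per-token estimate (real code would use a tokenizer such as tiktoken); the budget numbers are examples, and nothing here reflects lobe-chat's actual internals:

```python
# Hypothetical sketch: trim chat history so its estimated token count
# fits a context-window budget, dropping the oldest messages first.
# Uses a rough ~4-characters-per-token heuristic, not a real tokenizer.

def estimate_tokens(text: str) -> int:
    return max(1, len(text) // 4)

def trim_history(messages: list[dict], budget: int) -> list[dict]:
    """Keep the most recent messages whose estimated tokens fit in budget."""
    kept, used = [], 0
    for msg in reversed(messages):          # walk newest-first
        cost = estimate_tokens(msg["content"])
        if used + cost > budget:
            break                           # oldest remainder is dropped
        kept.append(msg)
        used += cost
    return list(reversed(kept))             # restore chronological order

# 10 messages of ~100 estimated tokens each; only the newest 3 fit in 300.
history = [{"role": "user", "content": "x" * 400} for _ in range(10)]
print(len(trim_history(history, budget=300)))  # → 3
```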