jan How to set up to enable multimodal

How to set up to enable multimodal

Open zlzzzlll opened this issue 1 year ago • 3 comments

Can you tell me how to set it up so that I can drag in a picture or word document and have the model recognize it and work? Thank you for your help.

Apr 24 '24 14:04 zlzzzlll

i think you have to use a VLM like Llava which support vision. Or find a model that support image.

Apr 24 '24 16:04 iamhenry

我认为你必须使用像 Llava 这样支持视觉的 VLM。或者找一个支持的型号。 There is no option to add a document, you said Llava, can not add a document!

Apr 25 '24 12:04 zlzzzlll

@zlzzzlll, please turn on the Experimental Feature from the settings page then try Llava.

Apr 25 '24 12:04 louis-jan

Feel free to reopen if any concerns 🙏 Note: the feature only works with Image / PDF

May 07 '24 08:05 Van-QA