Flowise FEATURE: Add Multi Modal Capabilities to Flowise

Dec 21 '23 04:12 vinodkiran

Update after touchup:

Jan 17 '24 00:01 HenryHengZJ

Hi, it seems like only GPT-4 Vision will be added in this pull request. Will Ollama+LLaVA be added in the future? Ollama itself already supports LLaVA, but the chain does not. I hope this can be added as well.

Jan 23 '24 07:01 treesheeptw

Hi, it seems like only GPT-4 Vision will be added in this pull request. Will Ollama+LLaVA be added in the future? Ollama itself already supports LLaVA, but the chain does not. I hope this can be added as well.

We'll roll out ChatOpenAI first, then move on to Ollama.

Jan 24 '24 15:01 HenryHengZJ

@vinodkiran pushed a couple of UI fixes:

error message when audio recording is not supported
messages not autoscrolling to the bottom
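As an illustration of the first fix, a minimal sketch of how a frontend might decide whether to show an "audio recording not supported" message. This is not the code from this PR: the `RecordingEnv` shape, the function name, and the message text are assumptions made for the example. It relies only on the standard browser features `navigator.mediaDevices.getUserMedia` and `MediaRecorder`.

```typescript
// Hypothetical view of the browser globals this check needs; in a real page
// the values would come from `navigator.mediaDevices` and `window.MediaRecorder`.
interface RecordingEnv {
    mediaDevices?: { getUserMedia?: unknown }
    MediaRecorder?: unknown
}

// Assumed message text, not taken from the PR.
const UNSUPPORTED_MESSAGE = 'Audio recording is not supported in this browser.'

// Returns an error message when recording is unavailable, or null when it is supported.
function audioRecordingError(env: RecordingEnv): string | null {
    const canCapture = typeof env.mediaDevices?.getUserMedia === 'function'
    const canRecord = typeof env.MediaRecorder === 'function'
    return canCapture && canRecord ? null : UNSUPPORTED_MESSAGE
}
```

In a browser, the function could be called with `{ mediaDevices: navigator.mediaDevices, MediaRecorder: window.MediaRecorder }`, and the returned message shown in the chat window when it is not null.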

Jan 30 '24 10:01 0xi4o

@vinodkiran @HenryHengZJ Made a couple of changes:

Change the UI for Speech to text configuration
Made messages in view message dialog consistent with internal chat

Feb 19 '24 10:02 0xi4o

Couple more updates:

Removed the status indicator in speech to text dialog
When audio input is submitted, the user message is now updated in the frontend with the question transcribed by the selected speech-to-text provider. This already worked for audio-only input, but the transcription appeared only after a refresh or after closing and reopening the chat window. It now appears as soon as the backend responds, and it also works with multiple uploads (such as images with audio).
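The transcription update described above can be sketched roughly as follows. This is an assumption-laden illustration, not the PR's implementation: the `ChatMessage` shape and the `applyTranscription` helper are hypothetical, and the real Flowise message types may differ.

```typescript
// Hypothetical message shape used only for this sketch.
interface ChatMessage {
    type: 'userMessage' | 'apiMessage'
    message: string
}

// Returns a new list in which the most recent user message shows the
// transcribed text. The input array is left untouched so a state setter
// (for example React's) sees a new reference and re-renders immediately.
function applyTranscription(messages: ChatMessage[], transcript: string): ChatMessage[] {
    let lastUserIndex = -1
    for (let i = messages.length - 1; i >= 0; i--) {
        if (messages[i].type === 'userMessage') {
            lastUserIndex = i
            break
        }
    }
    // Nothing to update if there is no user message or the transcript is empty.
    if (lastUserIndex === -1 || transcript.trim() === '') return messages
    return messages.map((m, i) => (i === lastUserIndex ? { ...m, message: transcript } : m))
}
```

Calling this as soon as the backend response arrives, rather than waiting for the chat history to be reloaded, is what would make the transcribed question visible without a refresh.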

Feb 19 '24 14:02 0xi4o

related chat embedded PR

Feb 21 '24 18:02 HenryHengZJ

@HenryHengZJ @chungyau97 Issues have all been fixed. We can review and merge this. I think everything's good to go on @vinodkiran's end too.

Feb 22 '24 10:02 0xi4o

@0xi4o, thanks for the fix for invalid characters in the file name.

Feb 24 '24 04:02 chungyau97

Another error: importing a flow and then turning on Speech To Text removes the file and mic icons.

Steps to reproduce error:

Click Add New
Import MultiModal chatflow
Save chatflow
Turn on Speech To Text

Feb 24 '24 04:02 chungyau97

Another issue:

1.) Open OpenAI Whisper and put in a credential:

2.) Switch to AssemblyAI; the OpenAI credential is still shown there:

I think that's because of the credentialNames in the useEffect. I've removed that because it caused an infinite loop, but you were trying to put it there to prevent this scenario, right?
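One possible way to keep a credential from carrying over between providers, without depending on an effect that re-runs on every change, is to store the selected credential per provider. The sketch below is only an illustration under that assumption; the provider names, the state shape, and both helper functions are hypothetical and are not taken from the PR.

```typescript
// Hypothetical provider identifiers, for illustration only.
type ProviderName = 'openAIWhisper' | 'assemblyAiTranscribe'

// Selected credential id for each provider; a missing key means none selected.
type CredentialByProvider = Partial<Record<ProviderName, string>>

// Records a credential for exactly one provider and returns a new state object.
function setCredential(state: CredentialByProvider, provider: ProviderName, credentialId: string): CredentialByProvider {
    return { ...state, [provider]: credentialId }
}

// Looks up the credential for the provider currently shown in the dialog.
function credentialFor(state: CredentialByProvider, provider: ProviderName): string {
    return state[provider] ?? ''
}
```

With this shape, switching the dialog from one provider to another reads a different key, so the second provider starts empty unless a credential was explicitly saved for it.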

Feb 24 '24 05:02 HenryHengZJ

Another error: importing a flow and then turning on Speech To Text removes the file and mic icons.

Steps to reproduce error:

Click Add New

Import MultiModal chatflow

Save chatflow

Turn on Speech To Text

solved

Feb 24 '24 07:02 HenryHengZJ

Another issue:

1.) Open OpenAI Whisper and put in a credential:

2.) Switch to AssemblyAI; the OpenAI credential is still shown there:

I think that's because of the credentialNames in the useEffect. I've removed that because it caused an infinite loop, but you were trying to put it there to prevent this scenario, right?

solved

Feb 24 '24 07:02 HenryHengZJ

Could there be an option to configure another audio or image model (selfhosted)?

Feb 28 '24 17:02 HermesMacedo

Awesome feature. It would be great to add it to the chat embed as well.

Mar 06 '24 14:03 nitromir
