chainlit
multimodal conversation support
Is your feature request related to a problem? Please describe.
A real assistant would converse not only by text but also through speech, video, and images.
Describe the solution you'd like
Support text, images, audio, and video.
Describe alternatives you've considered
Use OpenAI's ChatGPT.
Additional context
I know that multimodal AI models are still a challenge for FOSS tooling.
See https://huggingface.co/vonjack/Hermes-2-Pro-BakLLaVA-Mistral-7B
Did you check https://github.com/Chainlit/cookbook/tree/main/audio-assistant ?
Hello, check the Multi-modality section of the documentation to see how to include sound and files.