Verba
Support Indexify as Retriever
Hi folks! Love Verba. Does the project support, or plan to support, pluggable retrievers? We are building an open-source, reliable extraction and embedding engine: https://getindexify.ai. We are planning to support Weaviate as a storage backend very soon.
Indexify has a retriever API that supports semantic search over embedding indexes, as well as SQL queries over structured data extracted from unstructured content.
If we integrate Indexify, Verba will be able to:
- Answer questions not only from PDFs and documents, but also from images, videos, and audio.
- Ingest documents, videos, audio, etc. at any scale (throughput and data volume).
- Offload extraction of embeddings and structured data from videos, docs, and images to workers (distributed in production), so retrieval will always return fresh data.
- Let users monitor the state of indexes and extraction status, and delete or update ingested content and extracted embeddings/metadata.
- Support all major hardware accelerators and any model for extraction.
Here is an example pipeline for PDF extraction - https://getindexify.ai/usecases/pdf_extraction/ and for videos - https://getindexify.ai/usecases/video_rag/
I think the integration could be fairly seamless with some extensions in Verba, once we support Weaviate in Indexify (which should be straightforward).
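To make the idea of a pluggable retriever concrete, here is a rough sketch of what such a hook could look like. Everything in it (`Retriever`, `Chunk`, `KeywordRetriever`, `answer_context`) is hypothetical naming for illustration only and is not Verba's or Indexify's actual API. The toy keyword backend just stands in for a real one so the example runs on its own.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Protocol


@dataclass
class Chunk:
    """A retrieved piece of text with a relevance score (hypothetical type)."""
    text: str
    score: float


class Retriever(Protocol):
    """Hypothetical plug-in interface; any backend with this method would fit."""

    def retrieve(self, query: str, top_k: int) -> list[Chunk]: ...


class KeywordRetriever:
    """Toy stand-in backend: scores each document by words shared with the query."""

    def __init__(self, docs: list[str]) -> None:
        self.docs = docs

    def retrieve(self, query: str, top_k: int) -> list[Chunk]:
        terms = set(query.lower().split())
        scored = [
            Chunk(doc, float(len(terms & set(doc.lower().split()))))
            for doc in self.docs
        ]
        # Stable sort: ties keep their original document order.
        scored.sort(key=lambda chunk: chunk.score, reverse=True)
        return scored[:top_k]


def answer_context(retriever: Retriever, query: str, top_k: int = 2) -> list[str]:
    """Caller code depends only on the interface, not on a specific backend."""
    return [chunk.text for chunk in retriever.retrieve(query, top_k)]
```

With an interface along these lines, an Indexify-backed retriever would be one more class implementing `retrieve`, and the calling code would stay unchanged.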
Thoughts?
Interesting and great idea! I'll have a look 🚀
@thomashacker I would love to chat more on Discord or on a call! My email - [email protected] :)
Hey @diptanu if this is still open, feel free to create a PR