
Support Indexify as Retriever

Open diptanu opened this issue 1 year ago • 3 comments

Hi folks! Love Verba. Does the project support, or plan to support, pluggable retrievers? We are building an open-source, reliable extraction and embedding engine - https://getindexify.ai. We plan to support Weaviate as a storage backend very soon.

Indexify has a retriever API that supports semantic search on embedding indexes, and SQL queries over structured data extracted from unstructured data.

If we integrate Indexify, Verba will be able to:

  1. Answer questions not only from PDFs and documents, but also from images, videos, and audio.
  2. Ingest any amount of documents, videos, audio, etc. at any scale (throughput, data volume).
  3. Offload extraction of embeddings and structured data from videos, documents, and images to workers (distributed in production), so retrieval always returns fresh data.
  4. Let users monitor the state of indexes and extraction status, and delete or update ingested content and extracted embeddings/metadata.
  5. Support all major hardware accelerators and any model for extraction.

Here is an example pipeline for PDF extraction - https://getindexify.ai/usecases/pdf_extraction/ and for videos - https://getindexify.ai/usecases/video_rag/

I think the integration could be fairly seamless, given some extensions in Verba and once we support Weaviate in Indexify (which should be straightforward).
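
To make "pluggable retrievers" concrete, here is a rough sketch of the kind of extension point I have in mind. Every name in it (`Chunk`, `Retriever`, `KeywordRetriever`, `register_retriever`) is hypothetical and is not Verba's or Indexify's actual API; the keyword matcher is only a toy stand-in for a real backend.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass
class Chunk:
    """A retrieved piece of text with a relevance score (hypothetical type)."""
    text: str
    score: float


class Retriever(Protocol):
    """Hypothetical plug-in interface; an assumption, not Verba's real API."""

    def retrieve(self, query: str, top_k: int = 3) -> list[Chunk]: ...


class KeywordRetriever:
    """Toy stand-in for a real backend (e.g. one that calls an external engine)."""

    def __init__(self, documents: list[str]) -> None:
        self.documents = documents

    def retrieve(self, query: str, top_k: int = 3) -> list[Chunk]:
        terms = set(query.lower().split())
        # Score each document by the number of distinct query terms it shares.
        scored = [
            Chunk(text=doc, score=float(len(terms & set(doc.lower().split()))))
            for doc in self.documents
        ]
        # Keep only documents sharing at least one term, best match first.
        hits = [c for c in scored if c.score > 0]
        return sorted(hits, key=lambda c: c.score, reverse=True)[:top_k]


# Registry the host application could use to look up retrievers by name.
RETRIEVERS: dict[str, Retriever] = {}


def register_retriever(name: str, retriever: Retriever) -> None:
    RETRIEVERS[name] = retriever


register_retriever(
    "keyword",
    KeywordRetriever([
        "verba answers questions over documents",
        "indexify extracts embeddings from video",
    ]),
)
results = RETRIEVERS["keyword"].retrieve("embeddings from video", top_k=1)
```

With an interface like this, an Indexify-backed retriever would just be another class registered under its own name, and the rest of the pipeline would not need to know which backend answered the query.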

Thoughts?

diptanu avatar Apr 14 '24 06:04 diptanu

Interesting and great idea! I'll have a look 🚀

thomashacker avatar Apr 15 '24 07:04 thomashacker

@thomashacker I would love to chat more on Discord or on a call! My email - [email protected] :)

diptanu avatar Apr 15 '24 18:04 diptanu

Hey @diptanu if this is still open, feel free to create a PR

thomashacker avatar Sep 03 '24 12:09 thomashacker