chat-ui
Generalize RAG + PDF Chat feature
TLDR: implement PDF-chat feature
Closes #609
When a user uploads a PDF:
- Parse the PDF text, create embeddings, and save the embeddings in the files bucket (which is also used for saving images for multimodal models)
- For subsequent messages in that conversation, use the PDF embeddings for RAG
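The parse-and-embed step above can be sketched roughly as below. This is a minimal illustration, not the PR's actual code: `chunkText`, the chunk size, and the overlap are all assumptions; in the PR the text comes from `pdf-parse` and the chunks are embedded through TEI.

```typescript
// Hypothetical sketch of the chunking step: split extracted PDF text into
// overlapping chunks before embedding each one. Sizes are illustrative.
function chunkText(text: string, chunkSize = 512, overlap = 64): string[] {
	const chunks: string[] = [];
	const step = chunkSize - overlap; // advance less than chunkSize so chunks overlap
	for (let start = 0; start < text.length; start += step) {
		chunks.push(text.slice(start, start + chunkSize));
		if (start + chunkSize >= text.length) break; // last chunk reached the end
	}
	return chunks;
}
```

Each returned chunk would then be sent to the embedding endpoint and stored alongside its text.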
Limitations
- Only the first 20 pages of the PDF are parsed (this limit can be increased or decreased)
- A conversation can currently have only one uploaded PDF; uploading a new PDF overwrites the existing one, if there was any
- When the user enables websearch, websearch RAG is used and PDF RAG is not
- Just like websearch RAG, when PDF RAG is enabled, every message in that conversation uses PDF RAG. (In a subsequent PR, we should use prompting and other techniques so the tool is used only when it makes sense)
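The precedence in the last two limitations can be sketched as follows (the names are hypothetical, not the PR's actual code): websearch RAG wins when enabled, otherwise PDF RAG applies to every message once a PDF has been uploaded.

```typescript
// Illustrative RAG-source selection matching the limitations above.
type RagSource = "websearch" | "pdf" | "none";

function pickRagSource(webSearchEnabled: boolean, hasPdf: boolean): RagSource {
	if (webSearchEnabled) return "websearch"; // websearch RAG takes priority
	if (hasPdf) return "pdf"; // otherwise every message uses PDF RAG
	return "none";
}
```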
Testing locally
- Install the new `pdf-parse` dependency with `npm ci`:
npm ci
npm run dev -- --open
Screen recording
Testing by uploading Mamba paper
https://github.com/huggingface/chat-ui/assets/11827707/acae74fc-e70e-485f-9206-e8d539d5d90a
:fire: So cool, will test it locally later, but just from looking at the demo, do you think there's an easy way to show an indicator of when a PDF is already uploaded and will be sent with the message?
At the moment, there is a websearch-like box that indicates PDF RAG was used.
I meant more when the file is loaded and before the conversation is started, like for images:
I guess it would look a bit different since you can only have one PDF per conversation, but it would be nice to have an indication that a PDF will be used to answer the query :eyes:
I see. Let me think about it
Small note: if I drag and drop a non-PDF file, I get the following weird output.
Updates:
- Fixed the non-image file upload bug here
- Provided better UI/UX for the uploaded file (see attached video). Specifically: 1. the name of the uploaded PDF appears with a PDF icon; 2. the PDF file name and icon do a "blinking" animation while the PDF is being uploaded and the embeddings are being created; 3. on hover, an `x` button appears that lets you delete the uploaded PDF file
- Added an env var (config) for enabling the pdf-chat feature, as here
https://github.com/huggingface/chat-ui/assets/11827707/723cd355-edea-4204-98db-0632546b3cf4
What I'm working on now:
Thanks for working on this! One question I had is: what was your thought process for adding a new upload button for PDFs, versus using the existing drag-and-drop functionality that already exists for images?
@wdhorton currently the UI might still evolve. For now, the reason for reusing the same upload button (instead of adding a new one) is that having two different upload buttons makes the UI look cluttered, especially on smaller screens.
I have a few questions:
- Why are you limiting this feature to PDF and not csv, txt, etc.?
- I'm not sure you have to use embeddings for the PDF (or at least make them optional, not mandatory)
- Why use Mongo as a vector DB and calculate the vector similarity on the client side, instead of using a real vector DB that can do it way more efficiently and faster than running it in JS? That could make the code cleaner and remove any limitation on content size
- Is storing all the embeddings a good idea? This can make the DB grow relatively fast.
@itaybar, thanks a lot for your questions
Why are you limiting this feature to PDF and not csv, txt, etc.?
Yes, we will add support for other text files. Once this PR is done, supporting other text files will be trivial. (We might even include it as part of this PR)
I'm not sure you have to use embeddings for the PDF (or at least make them optional, not mandatory)
Could you elaborate on that? And what would be the alternatives?
Why use Mongo as a vector DB and calculate the vector similarity on the client side, instead of using a real vector DB that can do it way more efficiently and faster than running it in JS? That could make the code cleaner and remove any limitation on content size. Is storing all the embeddings a good idea? This can make the DB grow relatively fast.
Indeed, this is a good point. I/we will add support for a vector DB (likely as part of this PR)
Update: this PR is getting big. Unfortunately, there is no other option (I think). The specific points are:
- In #689, I've generalized RAG. What does that mean? Two things: 1. server side; 2. frontend side. On the server side, RAG applications have to implement a RAG interface, for consistency and better organization of the codebase (you can check out the directory `src/lib/server/rag`). On the frontend side, the OpenWebSearchResults svelte component is generalized into an OpenRAGResults svelte component that shows up on RAG-augmented messages (as suggested here).
- Besides creating PDF embeddings through TEI for pdf-chat, we need vector DB support for multiple reasons:
  - without a vector DB, the pdf-chat session loses the PDF embeddings (for instance, when you close your browser and re-open the same chat-ui conversation, the PDF embeddings are no longer available)
  - storing PDF embeddings in Mongo GridFS slows down performance (as questioned here), and a large number of embeddings can add a lot of latency on the server, since findSimilarSentences runs locally on the server. Therefore, we need vector DB support.
  - we need a vector DB for other features as well. For instance, we would need a vector DB + PDF RAG for #639. There was also an internal slack discussion here.
  - vector DB support is a more general feature that can have multiple applications. For instance, PDF-chat is just a special case of vector DB chat: in PDF-chat, one uses the PDF to populate the vector DB, and afterwards it becomes just chat with the vector DB
  - FYI, I'm checking openai/chatgpt-retrieval-plugin to see if we should follow a commonly used API for vector DBs
Should I open a PR for vectorDB support against this branch? wdyt @nsarrazin @gary149
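For context on the latency concern: without a vector DB, similarity search runs in plain JS on the server, scanning every stored embedding per query. A minimal sketch of what a findSimilarSentences-style search does (names and shapes are assumptions, not the actual chat-ui implementation):

```typescript
// Illustrative top-k similarity search over stored embeddings, done in plain
// JS. This is O(n * d) per query, which is why a dedicated vector DB helps
// once the number of embeddings grows large.
type Embedded = { text: string; vector: number[] };

function dot(a: number[], b: number[]): number {
	return a.reduce((sum, x, i) => sum + x * b[i], 0);
}

function cosineSimilarity(a: number[], b: number[]): number {
	const na = Math.sqrt(dot(a, a));
	const nb = Math.sqrt(dot(b, b));
	return na && nb ? dot(a, b) / (na * nb) : 0;
}

function findSimilarSentencesSketch(query: number[], store: Embedded[], k = 3): Embedded[] {
	// Sort a copy by descending similarity to the query, keep the top k.
	return [...store]
		.sort((x, y) => cosineSimilarity(query, y.vector) - cosineSimilarity(query, x.vector))
		.slice(0, k);
}
```

A vector DB replaces this linear scan with an approximate index, so retrieval latency no longer grows with the number of stored chunks.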
Thanks for the quick response guys! About the optional embeddings for the PDF: correct me if I'm wrong, but you could just read the text from the PDF without using embeddings (for the images, etc.), and by that remove the hard limits on file content size.
@mishig25 What is your estimation for merging this in your opinion?
Can't give exact date. But will do my best to merge it soon :)
Merge. hire 10 more people to help.
Merge
Will merge soon
hire 10 more people to help.
Hiring 10 more people rarely results in 2x productivity (let alone 10x)
It's a tough one to implement. I appreciate the work on this one.
I had a couple of thoughts on this I'd like to share.
First, I thought of a plugin-like system that is registered based on the file type, with a handler for processing/storing and a handler for retrieval. That would allow the community to scale the feature without choking the HF developers, who have me beyond impressed with their ability to deliver such a variety of products.
Alternatively, it could also be a third-party API, much like OpenAI function calling works, so that the responsibility is NOT on you but on the end user who chooses to deploy. That's great for developers but would probably be a pain for those who just want to feel the power of deploying and have no use beyond that (inspired by the shutdown story of banana.dev).
I believe these would inherently fit better, as the nature of open-source deployments is to customize. As such, there is an infinite number of use cases and solutions: PDFs, images, audio files, web search as input; ChromaDB, Pinecone, Qdrant, PGVector, Meilisearch as storage/retrieval, to name a few.
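The plugin idea above could be sketched as a registry keyed by file type. All names here are assumptions for illustration, not an existing chat-ui API:

```typescript
// Hypothetical plugin registry keyed by MIME type: each plugin supplies an
// ingest handler (process/store) and a retrieve handler, so csv/txt/audio
// support could be added without touching core code.
type RagPlugin = {
	ingest: (file: Uint8Array, conversationId: string) => Promise<void>;
	retrieve: (query: string, conversationId: string) => Promise<string[]>;
};

const plugins = new Map<string, RagPlugin>();

function registerPlugin(mimeType: string, plugin: RagPlugin): void {
	plugins.set(mimeType, plugin);
}

function getPlugin(mimeType: string): RagPlugin | undefined {
	return plugins.get(mimeType);
}
```

On upload, the server would look up the plugin for the file's MIME type and call its `ingest` handler; at message time, it would call `retrieve` to fetch context for the prompt.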
Hello. I have cloned the "chatPDF" branch to use the PDF upload feature. It works locally on my machine when I run `npm run dev`. However, when I do the same thing on my apache2 server, it shows a PDF upload error. I'm attaching a screenshot.
I have tried copying the exact .env.local file from my local directory to my server's directory, but it still shows the same error. Should I open a new issue about it? What else can I provide to help you understand the error?
Can you ideally skip apache2? Or check its error logs to see if anything is being logged? It could be that some headers are lost, or the file is too big and gets blocked. What does your network devtools say about this upload response?
@zubu007
The error in the console shows 403 Forbidden, meaning it has something to do with authentication. I am using custom models with a separate endpoint rather than the default. However, the chat function with the custom model works as expected; it's when the PDF is uploaded that it throws the 403. My question is: does the PDF fetch function use a different authentication?
Let me add further console logs on both my local machine and the server to see the difference. I will let you know the results here.
I have the same problem as https://github.com/huggingface/chat-ui/issues/693: it retrieves correctly when looking at the prompt and parameters, but the model answers as if it had gotten only the question.
Let me state the problem I am facing simply.
Using chatPDF with the default HF model ("name": "mistralai/Mistral-7B-Instruct-v0.1") works. But when I add a model to the .env.local file as an OpenAI endpoint and use that model for chatPDF, the chat function works with both models (default and ours), but pdf-chat only works with the default one, with the same prompt, same PDF, and same parameters.
I am adding pictures for better understanding.
For the HF default model, this was the prompt
And this was the response
For our own endpoint,
And this was the response
Is there something I am missing? It must be something simple, because just by changing the model in the app, one works and the other doesn't. If you need more information about the error, let me know.
hey, what's the current status of the PR?
Also very interested in this feature. Is there help needed? I wasn't sure what the blocker is or how to help.
@mishig25 Thanks for working on this! Very interested in this feature as well, so I'm curious if there are any updates on it?
Hi @mishig25 and all, from my side I would also offer to help in my free time if I can, as this is quite an important feature. Reading the above, I think an overview of the required steps is important, or someone to coordinate tasks. E.g., is something missing in PR https://github.com/huggingface/chat-ui/pull/745? Or is a redesign of the entire RAG logic, including externalization of the API, required? Or is a meeting maybe needed for some overarching architecture, etc.?