Spending too much time waiting for the answer. Is there any way to speed it up?
Each time I ask the program a question, it takes nearly 5-8 minutes to give me a simple answer, and the text I'm using is just the demo. Why does such a simple demo need so much time? Is there any way to speed up the program?
You can try changing the embedding and LLM models.
You can use Qdrant as a vector store instead. It's way faster because it supports maximal marginal relevance. Also change your model; my repo supports models outside the GPT4All scope.
Like this
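(A minimal sketch of the Qdrant setup through the LangChain wrapper; the source file, embedding model, collection name, and storage path below are placeholders rather than the exact code from the repo.)

```python
from langchain.document_loaders import TextLoader
from langchain.embeddings import HuggingFaceEmbeddings
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.vectorstores import Qdrant

# Placeholder source file and embedding model -- swap in whatever you ingested with
documents = TextLoader("source_documents/state_of_the_union.txt").load()
chunks = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=50).split_documents(documents)
embeddings = HuggingFaceEmbeddings(model_name="all-MiniLM-L6-v2")

# Local on-disk Qdrant collection (requires the qdrant-client package)
db = Qdrant.from_documents(
    chunks,
    embeddings,
    path="db",
    collection_name="private_docs",
)

# MMR retrieval is the part that differs from a plain similarity search
retriever = db.as_retriever(search_type="mmr", search_kwargs={"k": 4})
```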
If you are talking about the ingest file, try editing the text document to something like "My name is Nick". Ingestion should be pretty quick on it.
Made a version of the ingest script that only runs ingestion if the source files have changed. Also note that my privateGPT file calls the ingest script on each run and checks whether the db needs updating.
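A minimal sketch of that kind of change check, assuming a hash-based fingerprint stored next to the db (the state-file location and scheme are assumptions, not the actual implementation):

```python
import hashlib
import json
from pathlib import Path

STATE_FILE = Path("db/.ingest_state.json")   # hypothetical place to remember the last ingest

def fingerprint(source_dir: str) -> dict:
    """Map every file under source_dir to an MD5 hash of its contents."""
    return {
        str(p): hashlib.md5(p.read_bytes()).hexdigest()
        for p in sorted(Path(source_dir).rglob("*"))
        if p.is_file()
    }

def needs_ingest(source_dir: str = "source_documents") -> bool:
    """True if any source file was added, removed, or modified since the last run."""
    current = fingerprint(source_dir)
    previous = json.loads(STATE_FILE.read_text()) if STATE_FILE.exists() else None
    if current == previous:
        return False
    STATE_FILE.parent.mkdir(parents=True, exist_ok=True)
    STATE_FILE.write_text(json.dumps(current))
    return True

if __name__ == "__main__":
    if needs_ingest():
        print("Sources changed, running ingest")   # call your existing ingest here
    else:
        print("DB is up to date, skipping ingest")
```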
P.S. I ran a couple of giant survival-guide PDFs through the ingest and waited about 12 hours; it still wasn't done, so I cancelled it to free up my RAM. Would like to see a loading bar for ingestion one day. (PDFs need extra setup; hopefully someone comes up with an easier way for Windows.)
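A loading bar is pretty easy to bolt onto the ingest loop with tqdm. A minimal sketch, assuming the ingest just walks a directory of txt files with TextLoader:

```python
from pathlib import Path
from tqdm import tqdm
from langchain.document_loaders import TextLoader

source_files = list(Path("source_documents").rglob("*.txt"))

docs = []
# tqdm wraps the iterable and prints a live progress bar with a count and ETA
for path in tqdm(source_files, desc="Ingesting", unit="file"):
    docs.extend(TextLoader(str(path)).load())
```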
@alxspiker feel free to PR your ingest.py to my repo. I'm trying to speed up both by using Qdrant and native llama models. Already getting to <50 ms per token with an i5-9600K.
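For anyone wanting to try the native llama route, a rough sketch of the LLM swap via LangChain's LlamaCpp wrapper; the model path and thread count below are assumptions, so point them at whatever quantized model and CPU you actually have:

```python
from langchain.llms import LlamaCpp

# Placeholder path -- use whatever quantized GGML model you have locally
llm = LlamaCpp(
    model_path="models/ggml-vicuna-7b-q4_0.bin",
    n_ctx=2048,      # context window size
    n_threads=6,     # match your physical core count (e.g. 6 on an i5-9600K)
    n_batch=512,     # larger batches speed up prompt ingestion
    temperature=0.2,
)

# llm("What is in my documents?")  # plug this llm into your existing chain instead
```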
> @alxspiker feel free to PR your ingest.py to my repo. I'm trying to speed up both by using Qdrant and native llama models. Already getting to <50 ms per token with an i5-9600K.
I am going to work on it right now and send it over. Mine currently supports PDFs and images, but the setup is a nightmare on Windows IMO. I'll send over a version that just supports the current TextLoader.
Why is it a nightmare? Your env, or handling the docs?
Sounds good, see you over there. Currently wrapping up my ingest.py that iterates through the document dir and runs ingest for each file type, so PDF, JSON, and CSV should be fine. Everything else we'll convert to txt.
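Roughly the dispatch idea, as a sketch rather than the actual ingest.py; the per-extension loader choices are assumptions (PyPDFLoader needs pypdf, and a JSON loader would slot into the same map):

```python
from pathlib import Path
from langchain.document_loaders import CSVLoader, PyPDFLoader, TextLoader

# Hypothetical extension -> loader map; anything unknown falls back to plain text
LOADER_FOR_EXT = {
    ".pdf": PyPDFLoader,
    ".csv": CSVLoader,
    ".txt": TextLoader,
}

def load_documents(source_dir: str = "source_documents"):
    docs = []
    for path in sorted(Path(source_dir).rglob("*")):
        if not path.is_file():
            continue
        loader_cls = LOADER_FOR_EXT.get(path.suffix.lower(), TextLoader)
        docs.extend(loader_cls(str(path)).load())
    return docs
```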
Great work @su77ungr @alxspiker! Feel free to PR your changes to this repo if you feel like it, or just share your results here.
My initial tests with Qdrant didn't really improve performance, but I'll do more testing.
> Why is it a nightmare? Your env, or handling the docs?
> Sounds good, see you over there. Currently wrapping up my ingest.py that iterates through the document dir and runs ingest for each file type, so PDF, JSON, and CSV should be fine. Everything else we'll convert to txt.
I had to install poppler manually and define its path (the documentation was hard for me to find) just to get page numbers, plus tesseract-ocr.exe on Windows to read the images.
I used DirectoryLoader and ran into all these other required packages: unstructured, pdf2image, pytesseract, tesseract, tesseract-ocr, detectron2 (git+https://github.com/facebookresearch/detectron2.git), and unstructured[local-inference].
Got it working but I didn't like the setup.
> Great work @su77ungr @alxspiker! Feel free to PR your changes to this repo if you feel like it, or just share your results here.
> My initial tests with Qdrant didn't really improve performance, but I'll do more testing.
Seems to be a little quicker on ingestion for me at 471.16 ms per token. Not sure if I notice a difference on the LLM side of things at 4533.13 ms per token.
I'll do some more personal tests later.
I am also facing the same issue; it has been running for more than 24 hours.
> Why is it a nightmare? Your env, or handling the docs?
> Sounds good, see you over there. Currently wrapping up my ingest.py that iterates through the document dir and runs ingest for each file type, so PDF, JSON, and CSV should be fine. Everything else we'll convert to txt.
JSON by default is not in the supported extensions. You manually had to add support for it??
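If it helps, newer LangChain versions ship a JSONLoader that could be slotted into the extension map; it needs the jq package and a jq_schema saying which fields to pull. The path and schema below are just assumptions:

```python
from langchain.document_loaders import JSONLoader

# jq_schema "." loads the whole document; adjust it to pick out specific fields
loader = JSONLoader(
    file_path="source_documents/example.json",  # placeholder path
    jq_schema=".",
    text_content=False,   # allow non-string values to be serialized
)
docs = loader.load()
```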