Chat with local files (Retrieval Augmented Generation?)
This is a cool project and works on my old Intel Mac and a newer M1 Air. It's slow, of course, due to my hardware, but it's so easy to get installed and running compared to all of the other ones I have tried ... thanks for that.
There is a similar project called BionicGPT (https://github.com/bionic-gpt/bionic-gpt) which is Docker-based and a bit more work (not too much) to get installed and running, but it allows you to upload or ingest local documents, which can then be chatted with, summarized, and searched. Personally, I would be happy to just use plain text files; no need for .pdf, .docx, .html, etc.
Is this possible in the future for llamafile?
Very nice idea; I had something very similar in mind: #137
llamafile provides the lower-level fundamentals that are needed for file/URL summarization. Please check out a blog post I recently wrote, which explains how to do this on the command line: https://justine.lol/oneliners/. For example, if you wanted to summarize a text file named ESSAY.TXT on your computer using the Mistral main llamafile, you could say:
(
echo '[INST]Summarize the following text:'   # open a Mistral [INST] instruction block
cat ESSAY.TXT                                # the document to summarize
echo '[/INST]'                               # close the instruction block
) | ./mistral-7b-instruct-v0.1-Q4_K_M-main.llamafile \
-c 6700 \
-f /dev/stdin \
--temp 0 \
-n 500 \
--silent-prompt 2>/dev/null
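Here -c sets the context window size, -f reads the prompt from a file (in this case stdin), --temp 0 makes sampling greedy and deterministic, -n caps the number of tokens generated, and --silent-prompt keeps the echoed prompt out of the output.

The same pattern works for URL summarization, which the blog post also covers. A minimal sketch, assuming curl is installed; the URL here is illustrative, and the sed substitution is only a crude way to strip HTML tags (a proper HTML-to-text converter would do better):

(
echo '[INST]Summarize the following text:'
curl -s https://example.com/article.html | sed 's/<[^>]*>//g'   # crude tag stripping
echo '[/INST]'
) | ./mistral-7b-instruct-v0.1-Q4_K_M-main.llamafile \
-c 6700 \
-f /dev/stdin \
--temp 0 \
-n 500 \
--silent-prompt 2>/dev/null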
@stlhood I'll leave this open for you to comment further. We could, for instance, add an upload feature to the web GUI.
This would be very useful.
Another option is something similar to what is being achieved at https://github.com/imartinez/privateGPT or https://github.com/PromtEngineer/localGPT
The idea is the ability to ingest files and then chat with them via the model, or maybe to point at a folder where all the files are stored.
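Until something like that exists, a naive way to approximate "point at a folder" is to stuff every plain text file in the folder into the prompt, reusing the one-liner above. This is only a sketch (the folder path and question are made up), and it only works while the concatenated files fit in the model's context window:

(
echo '[INST]Answer the question using only the documents below.'
cat ~/notes/*.txt   # hypothetical folder of plain text files
echo 'Question: what were the key decisions from the March meeting?[/INST]'
) | ./mistral-7b-instruct-v0.1-Q4_K_M-main.llamafile \
-c 6700 \
-f /dev/stdin \
--temp 0 \
-n 500 \
--silent-prompt 2>/dev/null

Proper RAG, as in the projects above, avoids that context limit by retrieving only the chunks relevant to the question.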
I'm not familiar with the trick they use to prompt/tune/LoRA/etc. that behavior, but it'd potentially be a fun feature to explore. I'd be interested in hearing what @stlhood thinks.
Also, maybe the popular AI chat tools are already doing this and I simply lack the prompting skills, but it would be nice to have a semantic (meaning-based) search for local documents ... similar to https://github.com/freedmand/semantra.
I suppose an all-in-one search solution is still some way off, and being able to flip between search modes (keyword, pattern, and semantic) would be ideal. Summarization is nice and useful, but often a bit more is needed for a deep dive.
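For what it's worth, the usual building block for semantic search is text embeddings, and llamafile inherits llama.cpp's server. A minimal sketch, assuming the server variant of the llamafile exposes llama.cpp's /embedding endpoint when started with --embedding, and that curl and jq are installed; the port is illustrative:

# Start the server variant with embeddings enabled (runs in the background).
./mistral-7b-instruct-v0.1-Q4_K_M-server.llamafile --embedding --port 8080 &

# Embed a query; documents would be embedded the same way, then ranked by
# cosine similarity against the query vector to get a semantic search.
curl -s http://localhost:8080/embedding \
  -H 'Content-Type: application/json' \
  -d '{"content": "flip between keyword, pattern, and semantic search"}' | jq '.embedding | length'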
The command-line summarization approach above was helpful, thank you.
RAG is a powerful use case for LLMs, no doubt about it, but we are not currently considering adding RAG to llamafile. Our reasoning follows:
Adding RAG functionality to llamafile would rather dramatically change the scope and intent of the project. Doing RAG well is not straightforward, as evidenced by the many excellent open source projects (like privateGPT) and startups (like LlamaIndex) that have moved into this space over the last few months. Instead of trying to cram some sort of minimal/naive RAG implementation into llamafile, we'd rather see llamafile integrated into existing RAG projects and products. This way, llamafile could help empower these projects to leverage local inference with open models (instead of relying on OpenAI).
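One concrete way that integration can work: llamafile's server speaks an OpenAI-style chat completions API (assuming your build includes llama.cpp's /v1/chat/completions endpoint), so many existing RAG stacks can target a local model just by changing their OpenAI base URL. A sketch, with an illustrative port:

# Run the server variant locally.
./mistral-7b-instruct-v0.1-Q4_K_M-server.llamafile --port 8080 &

# Any OpenAI-compatible client can then point at http://localhost:8080/v1
curl -s http://localhost:8080/v1/chat/completions \
  -H 'Content-Type: application/json' \
  -d '{"messages": [{"role": "user", "content": "Hello from a local model"}]}'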
If anyone reading this is involved in any existing RAG projects out there, please feel free to mention llamafile to the maintainer(s) -- we'd love to work with them...
I think @stlhood is right. There are a lot of people doing great work on RAG, and I think the best way we can support them is by keeping llamafile focused on what it does best, which is making LLM software easily distributable and accessible. I'm going to close this issue now. Thank you for reaching out and asking!