
Future ideas

Open walking-octopus opened this issue 2 years ago • 2 comments

Introduction

I think this is a wonderful little project! We can use this issue to discuss ways to turn it from a promising demo into a fully polished tool.

Most of these points are heavily inspired by the Tailwind demo presented at Google I/O. Perhaps if we're fast enough, we might even outpace Google, given that they seem to be quite partial to overdue betas and interminable wait-lists.

Suggestions:

  • Web, desktop, and mobile interfaces: The overall idea can be quite similar to Tailwind. Users could write or upload text files, perhaps with the option of auto-indexing arbitrary directories on the server, including virtual file systems or cloud storage. They could sort files into folders or tags (perhaps with optional automatic LLM-based grouping) and ask questions within a folder or across the entire account. Other features, such as quizzes, to-do lists, and so on, could also be considered.

  • Client-server architecture: The app can be split into two parts: a server running on a cloud instance, a powerful computer, or the same device as the interface, and a client communicating with it over a REST API or a WebSocket. Clients can be platform-specific, web-based, or command-line, and stay small and easy to maintain, while the server is free to perform background tasks and focus on performance.

  • Multi-user: A simple user account system could let multiple users work on the same server, perhaps with support for shared folders. Additionally, admin tools to monitor the amount of space used, per-user disk and rate-limit/token quotas, indexing jobs, and other such things would be welcome. Such features would also allow this app to act as an enterprise knowledge base, or just make it easier to share the same server with friends.

  • Arbitrary document ingestion: Plugins that capture more types of information could be promising, for example transcribing audio and splitting it into paragraphs, or parsing link lists into well-formatted, summarized documents.

  • Topic detection, clustering: Perhaps a visual representation of each document in a vector space could be an intuitive way to communicate how they are searched. It could also be a useful alternative to the manual tagging or links that regular note-taking apps use to organize large knowledge bases.

  • Performance: For this type of project, performance remains critical, so there are quite a few considerations in this area. Perhaps the server could be written in C++ to avoid dealing with bindings and to ship as one neat binary. LLaMA embeddings are horribly inefficient and poor in general, so a fine-tuned BERT or instructor-xl could be used to search for text. Maybe there is also some model smaller than LLaMA that can do the job of summarizing the found documents; RWKV-LM looks quite promising and could perhaps be fine-tuned for this task specifically. It is also worth looking into the actual text being embedded, perhaps splitting it by paragraphs or sentences so that less data is thrown into the prompt of a performance-heavy LLM.
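
To make the last point about splitting text a bit more concrete, here is a minimal sketch in Python. It is only an illustration and not taken from the project's code: the function name `split_into_chunks` and the `max_chars` limit are made up for this example, and the sentence splitting is a naive regex, not a proper tokenizer. Each paragraph becomes one chunk, and paragraphs that are too long are broken up by packing whole sentences together.

```python
import re


def split_into_chunks(text: str, max_chars: int = 500) -> list[str]:
    """Split text into paragraph-sized chunks for embedding.

    Hypothetical helper: paragraphs longer than max_chars are split
    further by greedily packing whole sentences into each chunk.
    """
    chunks = []
    # Paragraphs are assumed to be separated by one or more blank lines.
    for paragraph in re.split(r"\n\s*\n", text):
        paragraph = " ".join(paragraph.split())  # collapse whitespace
        if not paragraph:
            continue
        if len(paragraph) <= max_chars:
            chunks.append(paragraph)
            continue
        # Naive sentence split: break after '.', '!' or '?' plus whitespace.
        current = ""
        for sentence in re.split(r"(?<=[.!?])\s+", paragraph):
            if current and len(current) + 1 + len(sentence) > max_chars:
                chunks.append(current)
                current = sentence
            else:
                current = f"{current} {sentence}".strip()
        if current:
            chunks.append(current)
    return chunks


if __name__ == "__main__":
    sample = "First paragraph.\n\nSecond one. It has two sentences."
    for chunk in split_into_chunks(sample, max_chars=20):
        print(repr(chunk))
```

Each chunk would then be embedded and searched on its own, so only the few best-matching chunks end up in the LLM prompt instead of whole documents. Note that a single sentence longer than `max_chars` is kept whole in this sketch.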

walking-octopus avatar May 11 '23 20:05 walking-octopus

I'm currently working on a similar project to realize at least the basics of this vision. For now, I'm getting the UI, DB, and soon embeddings + LLaMA to work. Here's a screenshot of the Streamlit interface.

Project Tailwind (Community)

walking-octopus avatar May 13 '23 16:05 walking-octopus

Taking @walking-octopus's suggestion of arbitrary document ingestion one step further, it would be lovely to pull documents from arbitrary locations. One can easily imagine connecting a mail account, or a Google Drive or Dropbox account.

yoiang avatar May 30 '23 05:05 yoiang