johnbrisbin
The values for chunk size and overlap are in both ingest.py and privategpt.py:

```
chunk_size = 500
chunk_overlap = 50
```

Those values are in bytes, not tokens. In privategpt.py you...
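For anyone wondering where those numbers end up, here is a minimal sketch, assuming the langchain RecursiveCharacterTextSplitter (which counts characters, not model tokens); the file name is just an example:

```
from langchain.text_splitter import RecursiveCharacterTextSplitter

# chunk_size and chunk_overlap are measured in characters, not model tokens
splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=50)
chunks = splitter.split_text(open("some_document.txt").read())  # example file name
```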
MultiRetrievalQAChain imports ChatOpenAI for no apparent reason on line 16: `from langchain.chat_models import ChatOpenAI`. This errant import creates a chain of errors at runtime if no OpenAI modules are...
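For what it's worth, one way to defuse it (a sketch of the idea only, not how langchain is written today) is to make the import optional, so it only bites when ChatOpenAI is actually requested:

```
# Sketch only: defer the OpenAI dependency so a missing openai package
# only matters if ChatOpenAI is actually chosen as the default LLM.
try:
    from langchain.chat_models import ChatOpenAI
except ImportError:
    ChatOpenAI = None  # caller must supply its own default LLM
```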
Has there been a high-level decision that all generated Python scripts shall run in a Docker environment? If not, this is just documenting an introduced bug (assuming Docker is installed). If...
I encountered the same issue (too many tokens) in a short Arabic passage in the PaLM 2 Technical Report PDF, recently published by Google, where they extol how good it...
Once adding new documents without having to reload everything is working reliably, periodic persistence of the db would become an effective way of avoiding a massive loss of effort when a...
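Roughly what I have in mind, as a sketch: it assumes the langchain Chroma wrapper with a persist_directory (as in ingest.py), and load_documents() stands in for the existing loader:

```
from langchain.vectorstores import Chroma
from langchain.embeddings import HuggingFaceEmbeddings

CHECKPOINT_EVERY = 100  # flush the index to disk every N documents

db = Chroma(persist_directory="db", embedding_function=HuggingFaceEmbeddings())
for i, doc in enumerate(load_documents()):   # load_documents() is a placeholder
    db.add_documents([doc])
    if (i + 1) % CHECKPOINT_EVERY == 0:
        db.persist()                          # a crash now loses at most N docs
db.persist()
```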
You may find that any performance improvement you see will be very dependent on the type of media the document files are stored on. That is, attempts to read multiple...
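To make that concrete, a sketch with a configurable worker count (max_workers is a hypothetical knob): several readers usually help on an SSD, while a single spinning disk mostly just seek-thrashes, so max_workers=1 would be the safe default there.

```
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path

def load_file(path: Path) -> str:
    return path.read_text(errors="ignore")

def load_all(folder: str, max_workers: int = 4) -> list:
    # Parallel reads pay off on SSD/NVMe; on a spinning disk use max_workers=1.
    paths = list(Path(folder).rglob("*.txt"))
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(load_file, paths))
```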
I tend to like the idea of some kind of installer, not because it is difficult right now (pretty close, though) but because it will only get more complex. Keeping...
Modal text interfaces are a pain, but I would take the traditional ^C over starting to pick commands out of innocent text. Soon enough there will be a proper UI to...
> chunk_size 500 requires too much memory, even 32GB can't fit in it, change to 200, which works fine on 16GB macbook m1

As you ingest more data you will...
That looks great. I have a few questions, though.

1. Rather than scraping nvidia-smi, have you considered using pycuda? It is simple to get free memory as a plain number...
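On point 1, this is roughly all it takes; pycuda.driver.mem_get_info() returns free and total bytes for the device behind the current context:

```
import pycuda.autoinit  # creates a context on the default CUDA device
import pycuda.driver as cuda

# (free_bytes, total_bytes) for the device behind the current context
free_bytes, total_bytes = cuda.mem_get_info()
print(f"free: {free_bytes / 2**20:.0f} MiB of {total_bytes / 2**20:.0f} MiB")
```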