CASALIOY
♾️ toolkit for air-gapped LLMs on consumer-grade hardware
@hippalectryon-0 introduced HF text embeddings with #45. Could you, if it suits you, elaborate on how [this](https://github.com/marella/ctransformers#langchain) performs? Edit: the embeddings port is missing.
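For context, the LangChain integration described in the linked ctransformers README looks roughly like the sketch below (import path as documented there at the time; it may have moved since). As far as I can tell it only wraps the LLM, not embeddings, which seems to be the missing port mentioned above:

```
# Rough sketch following the ctransformers README's LangChain section.
# The model id is the README's example; any local ggml model path should also work.
from ctransformers.langchain import CTransformers

llm = CTransformers(model="marella/gpt-2-ggml")
print(llm("AI is going to"))
```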
### Feature request
Using either [guidance](https://github.com/microsoft/guidance) or [lmql](https://github.com/eth-sri/lmql), use a better prompt template. NOTE: they don't support ggml yet, see https://github.com/microsoft/guidance/issues/58 and https://github.com/eth-sri/lmql/pull/18. I'm just opening the issue to avoid...
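To illustrate the kind of template this would enable, here is a rough sketch in guidance's handlebars-style syntax (based on the 0.0.x API as I remember it; exact calls may differ, and it would have to run on a non-ggml backend for now, per the linked issues):

```
import guidance

# Hypothetical: ggml backends are not supported yet, so a small HF model is
# used here purely to illustrate the template style.
guidance.llm = guidance.llms.Transformers("gpt2")

qa_template = guidance("""Use the following context to answer the question.
Context: {{context}}
Question: {{question}}
Answer: {{gen 'answer' max_tokens=64}}""")

result = qa_template(
    context="CASALIOY runs local LLMs against your own documents.",
    question="What does CASALIOY do?",
)
print(result["answer"])
```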
### Issue you'd like to raise.
I'm utilizing a Portuguese PDF file and presenting questions in Portuguese. However, there are instances when the answer is accurate but in English. Is...
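I'm not certain how the chain is wired internally, but if it goes through LangChain's RetrievalQA, one option is a custom prompt that pins the answer language. A minimal sketch (template text and function name are hypothetical):

```
from langchain.prompts import PromptTemplate
from langchain.chains import RetrievalQA

# Hypothetical prompt that instructs the model to always answer in Portuguese.
PT_TEMPLATE = """Use os trechos de contexto abaixo para responder à pergunta no final.
Responda sempre em português.

{context}

Pergunta: {question}
Resposta em português:"""


def build_qa_chain(llm, retriever):
    """Build a RetrievalQA chain that forces Portuguese answers.
    `llm` and `retriever` stand for whatever the pipeline already creates
    (e.g. a LlamaCpp model and the vector-store retriever)."""
    prompt = PromptTemplate(template=PT_TEMPLATE, input_variables=["context", "question"])
    return RetrievalQA.from_chain_type(
        llm=llm,
        chain_type="stuff",
        retriever=retriever,
        chain_type_kwargs={"prompt": prompt},
    )
```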
Hi, Thanks for the contribution. I have been using your repository to train a model on a collection of books. My goal is to generate answers that are specific to...
### Feature request
Ability to set the output to [Sources, Question, Answer] instead of [Question, Answer, Sources].
### Motivation
In my use, it is easier to see the generated response...
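A minimal sketch of the presentation-layer change, assuming the chain returns a dict with "result" and "source_documents" (as LangChain's RetrievalQA does when called with return_source_documents=True); the function name is hypothetical:

```
def print_sources_first(query: str, res: dict) -> None:
    """Print in [Sources, Question, Answer] order.
    `res` is assumed to be the dict returned by a RetrievalQA chain
    with return_source_documents=True."""
    print("Sources:")
    for doc in res["source_documents"]:
        print(f"- {doc.metadata.get('source', 'unknown')}")
    print(f"\nQuestion: {query}")
    print(f"Answer: {res['result']}")
```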
### .env
```
# Generic
TEXT_EMBEDDINGS_MODEL=sentence-transformers/all-MiniLM-L6-v2
TEXT_EMBEDDINGS_MODEL_TYPE=HF # LlamaCpp or HF
USE_MLOCK=true

# Ingestion
PERSIST_DIRECTORY=db
DOCUMENTS_DIRECTORY=source_documents
INGEST_CHUNK_SIZE=500
INGEST_CHUNK_OVERLAP=50

# Generation
MODEL_TYPE=LlamaCpp # GPT4All or LlamaCpp
MODEL_PATH=eachadea/ggml-vicuna-7b-1.1/ggml-vic7b-q5_1.bin
MODEL_TEMP=0.8
MODEL_N_CTX=1024 # Max...
```
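I'm not sure this matches the actual loader in the repo, but a minimal sketch of consuming these values with python-dotenv (variable names taken from the .env above; the defaults are assumptions):

```
import os
from dotenv import load_dotenv  # python-dotenv

load_dotenv()  # reads the .env file from the current directory

model_type = os.environ.get("MODEL_TYPE", "LlamaCpp")
model_path = os.environ.get("MODEL_PATH")
model_temp = float(os.environ.get("MODEL_TEMP", 0.8))
model_n_ctx = int(os.environ.get("MODEL_N_CTX", 1024))
chunk_size = int(os.environ.get("INGEST_CHUNK_SIZE", 500))
chunk_overlap = int(os.environ.get("INGEST_CHUNK_OVERLAP", 50))
use_mlock = os.environ.get("USE_MLOCK", "true").lower() == "true"
```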
# Max Threads = Poor Performance on 8-thread processor and GGJT model after convert.py
`TL;DR - Try setting n_threads to 6 instead of 8 if you have an 8...
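For illustration, if the model is loaded through LangChain's LlamaCpp wrapper (which exposes an n_threads parameter), capping the thread count could look like the sketch below; the model path here is hypothetical:

```
from langchain.llms import LlamaCpp

# Leave a couple of cores free: on an 8-thread CPU, n_threads=6 avoided the
# slowdown described above.
llm = LlamaCpp(
    model_path="models/ggml-vic7b-q5_1.bin",  # hypothetical local path
    n_ctx=1024,
    n_threads=6,
    temperature=0.8,
)
```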
For the MosaicML model: I haven't tried it yet; feel free to create another issue so that we don't forget after closing this one. Update: mpt-7b-q4_0.bin doesn't work "out of the box", it...