gpt4all
How can I implement a custom LangChain LLM wrapper class for the GPT4All model?
Is it possible to do what is described here
https://gpt-index.readthedocs.io/en/latest/how_to/customization/custom_llms.html#example-using-a-custom-llm-model
or here
https://python.langchain.com/en/latest/modules/models/llms/examples/custom_llm.html#how-to-write-a-custom-llm-wrapper
with the https://github.com/nomic-ai/gpt4all model?
I'd appreciate any help / hints!
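While waiting for official support, the custom-wrapper pattern from the LangChain docs linked above can be sketched roughly like this. The class name `GPT4AllLLM`, the `weights_path` field, and the `_generate_text` stub are my own inventions; the actual generation call (e.g. via pyllamacpp) is stubbed out, and the try/except lets the sketch run even where langchain isn't installed:

```python
# Sketch of a custom LangChain LLM wrapper around a local GPT4All model.
# The real model call is stubbed so this file is self-contained; swap
# _generate_text for an actual binding (e.g. pyllamacpp) in practice.
from typing import List, Optional

try:
    from langchain.llms.base import LLM
except ImportError:
    class LLM:  # stand-in base class so the sketch works without langchain
        pass


class GPT4AllLLM(LLM):
    """Exposes a local GPT4All model through LangChain's LLM interface."""

    weights_path: str = "./gpt4all-converted.bin"  # example path

    @property
    def _llm_type(self) -> str:
        return "gpt4all"

    def _call(self, prompt: str, stop: Optional[List[str]] = None) -> str:
        # LangChain calls this for every completion; delegate to the model.
        return self._generate_text(prompt)

    def _generate_text(self, prompt: str) -> str:
        # Placeholder generation so the sketch runs without model weights.
        return "[gpt4all output for: " + prompt + "]"
```

Once the stub is replaced with a real binding, the instance can be dropped into any chain that expects an `LLM`.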
this is being built as we speak
Lovely! I'd love to test that on my 50M collection of Q&A articles, so much potential!
Really excited for this !
seems like it is released. Version 0.0.131 👀
https://github.com/hwchase17/langchain/releases/tag/v0.0.131
Any chance there's an example how to use it? I'm looking to swap OpenAI with Gpt4all in code https://colab.research.google.com/drive/1JYTczk-4D86XNn0GTaXux5yi2-LfoIPd?usp=sharing
I'm looking at the exact same thing. I don't know how to do it yet myself. Will spend some time on it tomorrow. If I get something out, I will share it here. Meanwhile, if you come across something, do share, thanks! :)
It is now released. https://twitter.com/LangChainAI/status/1643261943803957249?t=N5nC6IQgfeJo6Kda2ri49w&s=19
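For anyone landing here later: the release above (LangChain 0.0.131) added a `GPT4All` LLM class. A hedged sketch of its usage follows; the `model` keyword and the path are assumptions based on the docs of that era and may differ in later versions, so the helper only constructs the LLM when langchain and the model file are actually present:

```python
# Sketch: constructing LangChain's GPT4All LLM class, guarded so it
# degrades gracefully when langchain or the model weights are missing.
import os

MODEL_PATH = "./gpt4all-converted.bin"  # example path, adjust to your model


def make_gpt4all_llm(model_path: str = MODEL_PATH):
    """Return a LangChain GPT4All LLM, or None if prerequisites are missing
    (langchain not installed, or no model file at model_path)."""
    try:
        from langchain.llms import GPT4All
    except ImportError:
        return None
    if not os.path.exists(model_path):
        return None
    return GPT4All(model=model_path)


llm = make_gpt4all_llm()
if llm is not None:
    print(llm("Tell me a joke: "))
```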
@fadnavismehul these seem to be the closest thing: https://blog.ouseful.info/2023/04/04/running-gpt4all-on-a-mac-using-python-langchain-in-a-jupyter-notebook/ https://blog.ouseful.info/2023/04/04/langchain-query-gpt4all-against-knowledge-source/
Does anyone have a working example? I am struggling with an exception saying ctx is not properly initialized when I try
docsearch = Chroma.from_documents(documents = texts, embedding = embeddings)
I load my model like this:
embeddings = LlamaCppEmbeddings(model_path=GPT4ALL_MODEL_PATH)
@sime2408, here's my tiny working test (WSL2/Ubuntu)
# https://python.langchain.com/en/latest/ecosystem/llamacpp.html
# pip uninstall -y langchain
# pip install --upgrade git+https://github.com/hwchase17/langchain.git
#
# https://abetlen.github.io/llama-cpp-python/
# pip uninstall -y llama-cpp-python
# pip install --upgrade llama-cpp-python
# pip install chromadb
#
# how to create one https://github.com/nomic-ai/pyllamacpp
import os
from langchain.chains import RetrievalQA
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.vectorstores import Chroma
from langchain.document_loaders import TextLoader
from langchain.llms import LlamaCpp
from langchain.embeddings import LlamaCppEmbeddings

GPT4ALL_MODEL_PATH = "./gpt4all-converted.bin"


def ask(question, qa):
    print('\n' + question)
    print(qa.run(question) + '\n\n')


persist_directory = './.chroma'
collection_name = 'data'
document_name = './test_import.txt'

llama_embeddings = LlamaCppEmbeddings(model_path=GPT4ALL_MODEL_PATH)

if not os.path.isdir(persist_directory):
    print('Parsing ' + document_name)
    loader = TextLoader(document_name)
    documents = loader.load()
    text_splitter = RecursiveCharacterTextSplitter(
        chunk_size=500, chunk_overlap=0)
    texts = text_splitter.split_documents(documents)
    vectordb = Chroma.from_documents(
        documents=texts, embedding=llama_embeddings,
        collection_name=collection_name, persist_directory=persist_directory)
    vectordb.persist()
    print(vectordb)
    print('Saved to ' + persist_directory)
else:
    print('Loading ' + persist_directory)
    vectordb = Chroma(persist_directory=persist_directory,
                      embedding_function=llama_embeddings,
                      collection_name=collection_name)
    print(vectordb)

llm = LlamaCpp(model_path=GPT4ALL_MODEL_PATH)
qa = RetrievalQA.from_chain_type(
    llm=llm, chain_type="stuff",
    retriever=vectordb.as_retriever(search_kwargs={"k": 1}))

ask("Question1", qa)
ask("Question2", qa)
ask("Question3", qa)
Thanks @traverse-in-reverse. At the end I was trying to create a custom wrapper, and also tried the Vicuna model; here are some trials, if they're of any value: vicuna colab
@traverse-in-reverse found out why I couldn't run my examples: my model couldn't be initialized. I switched to the GGML model from the gpt4all repo and now it works. Not sure why.
@traverse-in-reverse any idea why I am getting the error below with the same code block you provided?
Error:
NoIndexException: Index not found, please create an instance before querying
The NoIndexException can be fixed as described here: https://github.com/hwchase17/langchain/issues/2491#issuecomment-1499082189
I need to train gpt4all on the BWB dataset (a large-scale document-level Chinese-English parallel dataset for machine translation). Is there any guide on how to do this?
Stale, please open a new issue if this still occurs