gpt4all-j memory issue
System Info
langchain 0.0.166
Python 3.8.10
pygpt4all 1.1.0
Who can help?
@Vowe
Information
- [ ] The official example notebooks/scripts
- [X] My own modified scripts
Related Components
- [X] LLMs/Chat Models
- [ ] Embedding Models
- [X] Prompts / Prompt Templates / Prompt Selectors
- [ ] Output Parsers
- [ ] Document Loaders
- [ ] Vector Stores / Retrievers
- [X] Memory
- [ ] Agents / Agent Executors
- [ ] Tools / Toolkits
- [ ] Chains
- [ ] Callbacks/Tracing
- [ ] Async
Reproduction
```python
from langchain import PromptTemplate, LLMChain
from langchain.llms import GPT4All
from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler

template = """Question: {question}

Answer:"""
prompt = PromptTemplate(template=template, input_variables=["question"])

callbacks = [StreamingStdOutCallbackHandler()]
llm = GPT4All(model='ggml-gpt4all-j-v1.3-groovy.bin', backend='gptj', callbacks=callbacks, verbose=True)
llm_chain = LLMChain(prompt=prompt, llm=llm)

question = "What is Walmart?"
print(llm_chain.run(question=question))

question = "Summarize the previous response so a child can understand it"
print(llm_chain.run(question=question))
```
Expected behavior
The above code snippet asks two questions of the gpt4all-j model. No memory is configured in langchain. However, the response to the second question exhibits memory-like behavior, which is not expected.
The response to the first question was " Walmart is a retail company that sells a variety of products, including clothing, electronics, and food. It is one of the largest retailers in the world, with over 2,000 stores in the United States alone. Walmart is a retail company that sells a variety of products, including clothing, electronics, and food. It is one of the largest retailers in the world, with over 2,000 stores in the United States alone."
The response to the second question was "Walmart is a large retail store that sells a variety of things like clothes, electronics, and food. It is very big with many stores all over the United States. Walmart is a large retail store that sells a variety of things like clothes, electronics, and food. It is very big with many stores all over the United States."
Could someone verify that the GPT4All-J bindings don't default to implementing some memory?
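One hedged way to check, continuing the reproduction snippet above (the `memory` attribute check and the fresh-instance retry below are suggestions for narrowing this down, not part of the original report):

```python
# Hedged sketch, continuing the reproduction snippet above (LangChain 0.0.1xx-era API).
# First, confirm the chain object itself carries no memory:
print(llm_chain.memory)  # prints None for a base LLMChain

# Then retry the follow-up question on a freshly constructed model instance:
fresh_llm = GPT4All(model='ggml-gpt4all-j-v1.3-groovy.bin', backend='gptj')
fresh_chain = LLMChain(prompt=prompt, llm=fresh_llm)
print(fresh_chain.run(question="Summarize the previous response so a child can understand it"))
# If the "summary" still echoes the Walmart answer, the state would have to
# live in the gpt4all bindings or the loaded model, not in the LangChain chain.
```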
I can tell you that I've observed the same behavior when using GPT-3.5, so I don't think this is model-specific.
@emigre459 could you provide a code snippet? I can be of more help with one. Base chains have no memory, so it's hard to guess why you'd be experiencing this behavior.
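For context, memory in LangChain is an explicit opt-in on the chain. A minimal sketch of what that opt-in looks like with this era's API (`ConversationBufferMemory` and the `{history}` placeholder are standard LangChain pieces, shown here for contrast rather than taken from this thread):

```python
# Hedged sketch: memory in LangChain is an explicit opt-in on the chain.
from langchain import LLMChain, PromptTemplate
from langchain.llms import GPT4All
from langchain.memory import ConversationBufferMemory

template = """{history}
Question: {question}

Answer:"""
prompt = PromptTemplate(template=template, input_variables=["history", "question"])

llm = GPT4All(model='ggml-gpt4all-j-v1.3-groovy.bin', backend='gptj')

# Without this memory object, each .run() call is an independent completion.
memory = ConversationBufferMemory(memory_key="history", input_key="question")
chain_with_memory = LLMChain(prompt=prompt, llm=llm, memory=memory)
```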
Agreed that it was very surprising, since it was essentially just a completion call with no added memory. It was something like this:
```python
from langchain.chat_models.azure_openai import AzureChatOpenAI
from langchain.schema import SystemMessage, HumanMessage

llm = AzureChatOpenAI(model_name=deployment_name, deployment_name=deployment_name, temperature=0)

system_prompt_old = """
You are a helpful assistant tasked with extracting document section headers from a DOCUMENT.
When you find one or more section headers, list them out. If you find no headers, return NOTHING.
"""

system_prompt_new = """
You are a helpful assistant tasked with extracting document section headers from a DOCUMENT.
When you find one or more section headers, list them out. If you find no headers, return NULL.
"""

human_prompt = f"DOCUMENT: {document}"

print(llm([SystemMessage(content=<one_of_the_system_prompts>), HumanMessage(content=human_prompt)]).content)
```
Basically what happened was that I was doing all of this in a Jupyter notebook. I started my experiments using `system_prompt_old`, then decided I wanted to switch the null value from NOTHING to NULL by switching to `system_prompt_new`. But when I did so, my responses kept including "NOTHING" and never "NULL". It wasn't until I restarted my kernel that completions started using "NULL".
This is a behavior that multiple people in my work group have observed in the past. What's especially odd is that it isn't consistent; it seems to happen randomly (I couldn't reproduce it just now, for example).
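For what it's worth, one mundane way to get exactly this symptom in a notebook (offered as a hedged guess, not the confirmed cause) is a stale binding: if the message list is built in one cell and only the calling cell is re-run, the old system prompt keeps being sent.

```python
# Hypothetical illustration of the stale-binding pitfall, not the confirmed cause.
# Cell 1: messages built once, capturing the OLD system prompt.
messages = [SystemMessage(content=system_prompt_old),
            HumanMessage(content=human_prompt)]

# Cell 2, edited later: system_prompt_new is defined, but `messages` above is
# never rebuilt. Re-running only this cell keeps sending the old prompt, so
# completions keep saying "NOTHING" until the kernel is restarted and every
# cell is re-executed.
print(llm(messages).content)
```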
Thank you for sharing! We'll see if we can reproduce something.
@josephtangy7 are you still observing the described behavior as of the latest langchain version?
@emigre459 It would seem I no longer observe this behavior. Since the original post, I have upgraded to gpt4all 0.2.3 and langchain 0.0.184, on Python 3.8.10. After running the script below, the responses no longer appear to remember context (see the attached screenshot).
```python
from langchain import PromptTemplate, LLMChain
from langchain.llms import GPT4All

template = """Question: {question}

Answer:"""
prompt = PromptTemplate(template=template, input_variables=["question"])

llm = GPT4All(model='./ggml-gpt4all-j-v1.3-groovy.bin', backend='gptj')
llm_chain = LLMChain(prompt=prompt, llm=llm)

question = "What is Walmart?"
llm_chain.run(question)

question = "Summarize the previous response so a child can understand it"
llm_chain.run(question)
```
Hi, @josephtangy7! I'm Dosu, and I'm helping the LangChain team manage their backlog. I wanted to let you know that we are marking this issue as stale.
From what I understand, the issue you reported was related to a memory problem in the gpt4all-j model. It seems that the response to the second question was repeating the response from the first question. Users "vowelparrot" and "emigre459" discussed the issue, with "emigre459" providing a code snippet that reproduces the behavior. "emigre459" mentioned that restarting the kernel resolved the issue. You confirmed that you no longer observe the behavior with the latest versions of gpt4all and langchain.
Before we close this issue, we wanted to check with you if it is still relevant to the latest version of the LangChain repository. If it is, please let us know by commenting on this issue. Otherwise, feel free to close the issue yourself, or it will be automatically closed in 7 days.
Thank you for your contribution to the LangChain repository, and please don't hesitate to reach out if you have any further questions or concerns.
Best regards, Dosu