langgraph
Local RAG agent with LLaMA3 error: Ollama call failed with status code 400. Details: {"error":"unexpected server status: 1"}
Checked other resources
- [X] I added a very descriptive title to this issue.
- [X] I searched the LangChain documentation with the integrated search.
- [X] I used the GitHub search to find a similar question and didn't find it.
- [X] I am sure that this is a bug in LangChain rather than my code.
Example Code
Notebook example code from https://github.com/langchain-ai/langgraph/blob/main/examples/rag/langgraph_rag_agent_llama3_local.ipynb
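For context, the step that fails is the answer grader from that notebook. A rough sketch of that chain (approximated from the notebook, prompt abbreviated and inputs stubbed, not verbatim):

```python
# Sketch of the answer-grader chain, approximated from the linked notebook;
# ChatOllama is asked to return JSON, which JsonOutputParser then parses.
from langchain_community.chat_models import ChatOllama
from langchain_core.output_parsers import JsonOutputParser
from langchain_core.prompts import PromptTemplate

local_llm = "llama3"
llm = ChatOllama(model=local_llm, format="json", temperature=0)

prompt = PromptTemplate(
    template=(
        "You are a grader assessing whether an answer is useful to resolve a question. "
        "Give a binary score 'yes' or 'no' as JSON with a single key 'score'.\n"
        "Here is the answer: {generation}\n"
        "Here is the question: {question}"
    ),
    input_variables=["generation", "question"],
)

answer_grader = prompt | llm | JsonOutputParser()

# Placeholder inputs for illustration only
question = "What are the types of agent memory?"
generation = "Agent memory includes short-term and long-term memory."
answer_grader.invoke({"question": question, "generation": generation})
```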
Error Message and Stack Trace (if applicable)
{'score': 'yes'}
According to the context, agent memory refers to a long-term memory module (external database) that records a comprehensive list of agents' experience in natural language. This memory stream is used by generative agents to enable them to behave conditioned on past experience and interact with other agents.
Traceback (most recent call last):
File "/home/luca/pymaindir_icos/autocoders/lc_coder/lama3/local_llama3.py", line 139, in <module>
answer_grader.invoke({"question": question,"generation": generation})
File "/home/luca/anaconda3/envs/lc_coder/lib/python3.11/site-packages/langchain_core/runnables/base.py", line 2499, in invoke
input = step.invoke(
^^^^^^^^^^^^
File "/home/luca/anaconda3/envs/lc_coder/lib/python3.11/site-packages/langchain_core/language_models/chat_models.py", line 158, in invoke
self.generate_prompt(
File "/home/luca/anaconda3/envs/lc_coder/lib/python3.11/site-packages/langchain_core/language_models/chat_models.py", line 560, in generate_prompt
return self.generate(prompt_messages, stop=stop, callbacks=callbacks, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/luca/anaconda3/envs/lc_coder/lib/python3.11/site-packages/langchain_core/language_models/chat_models.py", line 421, in generate
raise e
File "/home/luca/anaconda3/envs/lc_coder/lib/python3.11/site-packages/langchain_core/language_models/chat_models.py", line 411, in generate
self._generate_with_cache(
File "/home/luca/anaconda3/envs/lc_coder/lib/python3.11/site-packages/langchain_core/language_models/chat_models.py", line 632, in _generate_with_cache
result = self._generate(
^^^^^^^^^^^^^^^
File "/home/luca/anaconda3/envs/lc_coder/lib/python3.11/site-packages/langchain_community/chat_models/ollama.py", line 259, in _generate
final_chunk = self._chat_stream_with_aggregation(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/luca/anaconda3/envs/lc_coder/lib/python3.11/site-packages/langchain_community/chat_models/ollama.py", line 190, in _chat_stream_with_aggregation
for stream_resp in self._create_chat_stream(messages, stop, **kwargs):
File "/home/luca/anaconda3/envs/lc_coder/lib/python3.11/site-packages/langchain_community/chat_models/ollama.py", line 162, in _create_chat_stream
yield from self._create_stream(
^^^^^^^^^^^^^^^^^^^^
File "/home/luca/anaconda3/envs/lc_coder/lib/python3.11/site-packages/langchain_community/llms/ollama.py", line 251, in _create_stream
raise ValueError(
ValueError: Ollama call failed with status code 400. Details: {"error":"unexpected server status: 1"}
Description
Running the example code I get the above error. This does not happen with Mistral, so I assume my Ollama setup is OK. I also get the first "yes" from Llama 3 if I'm not mistaken, so I suspect it's related to something not working here:
from pprint import pprint

inputs = {"question": "What are the types of agent memory?"}
for output in app.stream(inputs):
    for key, value in output.items():
        pprint(f"Finished running: {key}:")
pprint(value["generation"])
System Info
Ubuntu 22.04.4 LTS, Anaconda, and VS Code.
I have been experiencing the same issue. It seems to happen at random whenever Ollama is called. This thread suggests configuring a retry method: https://github.com/langchain-ai/langchain/issues/20773#issuecomment-2072117003 It seems to work, but it would be nice to get an official fix.
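For reference, a minimal sketch of that retry workaround using `Runnable.with_retry`, assuming the `answer_grader` chain from the notebook; the exception type and attempt count here are illustrative, not taken from the linked comment:

```python
# Wrap the grader chain so transient Ollama 400s (raised as ValueError by
# the Ollama client) are retried instead of failing the graph immediately.
answer_grader_with_retry = answer_grader.with_retry(
    retry_if_exception_type=(ValueError,),
    wait_exponential_jitter=True,
    stop_after_attempt=3,
)

answer_grader_with_retry.invoke({"question": question, "generation": generation})
```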
I'm seeing the same issue, the retry referenced here seems to help at times, but not 100%: https://github.com/langchain-ai/langchain/issues/20773#issuecomment-2072117003
I can second this exactly:
> I'm seeing the same issue, the retry referenced here seems to help at times, but not 100%: langchain-ai/langchain#20773 (comment)
Workaround works (sometimes) after updating Ollama to 0.1.9
Still an issue, not really functional on Ollama 0.1.32.
EDIT: resolved. Solved for me by ensuring all other Ollama instances on the system (other Ubuntu instances under WSL, or the install on the host Windows machine) were stopped, or uninstalled in the case of the Windows version. Ollama may have a bug related to stopping the server.
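If it helps anyone debugging the same thing, here is a quick way to confirm which Ollama server is actually answering; a sketch assuming a default install on localhost:11434 (it only reports the responding build's version, it does not detect additional instances):

```python
# Query the Ollama version endpoint to see which server instance is answering.
import requests

try:
    resp = requests.get("http://localhost:11434/api/version", timeout=5)
    print("Ollama responding, version:", resp.json().get("version"))
except requests.exceptions.ConnectionError:
    print("No Ollama server reachable on localhost:11434")
```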
This seems to be more of an Ollama issue in this case? Or is there something specific to this notebook that you want fixed?
It's a terrific notebook and I'd love to see it working with Ollama and Llama 3. I believe the issue affects every Llama 3 implementation, so fixing it would help greatly.