Error 400 from Ollama during generation on random cell runs
Checked other resources
- [X] I added a very descriptive title to this issue.
- [X] I searched the LangChain documentation with the integrated search.
- [X] I used the GitHub search to find a similar question and didn't find it.
- [X] I am sure that this is a bug in LangChain rather than my code.
- [X] The bug is not resolved by updating to the latest stable version of LangChain (or the specific integration package).
Example Code
from langchain.prompts import PromptTemplate
from langchain_core.output_parsers import StrOutputParser
from langchain_community.chat_models import ChatOllama  # needed for ChatOllama below

system_message = '''You are an assistant for question-answering tasks. Use the following pieces of retrieved context to answer the question. If you don't know the answer, just say that you don't know. Use three sentences maximum and keep the answer concise.'''

user_message = '''Question: {question} Context: {context} Answer:'''

# llama3_prompt_template, local_llm, and retriever are defined in earlier cells of the notebook
prompt = PromptTemplate(
    template=llama3_prompt_template.format(system_message=system_message, user_message=user_message),
    input_variables=['question', 'context'],
)

llm = ChatOllama(model=local_llm, temperature=0)

def format_docs(docs):
    return "\n\n".join([doc.page_content for doc in docs])

rag_chain = prompt | llm | StrOutputParser()

question = 'agent memory'
docs = retriever.invoke(question)
generation = rag_chain.invoke({'question': question, 'context': docs})
print(generation)
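For reference, llama3_prompt_template, local_llm, and retriever come from earlier cells of the notebook, roughly along these lines (a simplified sketch, not the exact notebook values):

```python
# Assumed definitions, loosely following the linked notebook; exact values may differ.
local_llm = "llama3"  # tag of the model pulled into the local Ollama instance

# Llama 3 instruct-style wrapper with {system_message} / {user_message} slots.
llama3_prompt_template = (
    "<|begin_of_text|><|start_header_id|>system<|end_header_id|>\n"
    "{system_message}<|eot_id|><|start_header_id|>user<|end_header_id|>\n"
    "{user_message}<|eot_id|><|start_header_id|>assistant<|end_header_id|>"
)

# retriever is the vector-store retriever built over the indexed blog posts,
# e.g. retriever = vectorstore.as_retriever()
```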
Error Message and Stack Trace (if applicable)
ValueError('Ollama call failed with status code 400. Details: {"error":"unexpected server status: 1"}')

Traceback (most recent call last):
  File "/home/darthcoder/miniconda3/envs/LangChain/lib/python3.10/site-packages/langchain_core/language_models/chat_models.py", line 411, in generate
    self._generate_with_cache(
  File "/home/darthcoder/miniconda3/envs/LangChain/lib/python3.10/site-packages/langchain_core/language_models/chat_models.py", line 632, in _generate_with_cache
    result = self._generate(
  File "/home/darthcoder/miniconda3/envs/LangChain/lib/python3.10/site-packages/langchain_community/chat_models/ollama.py", line 259, in _generate
    final_chunk = self._chat_stream_with_aggregation(
  File "/home/darthcoder/miniconda3/envs/LangChain/lib/python3.10/site-packages/langchain_community/chat_models/ollama.py", line 190, in _chat_stream_with_aggregation
    for stream_resp in self._create_chat_stream(messages, stop, **kwargs):
  File "/home/darthcoder/miniconda3/envs/LangChain/lib/python3.10/site-packages/langchain_community/chat_models/ollama.py", line 162, in _create_chat_stream
    yield from self._create_stream(
  File "/home/darthcoder/miniconda3/envs/LangChain/lib/python3.10/site-packages/langchain_community/llms/ollama.py", line 251, in _create_stream
    raise ValueError(
ValueError: Ollama call failed with status code 400. Details: {"error":"unexpected server status: 1"}
Description
I am implementing this demo from LangChain's YouTube video: https://github.com/langchain-ai/langgraph/blob/main/examples/rag/langgraph_rag_agent_llama3_local.ipynb. The same generation cell sometimes runs fine and sometimes fails with the Error 400. This also happens for other code cells, at random.
System Info
System Information
OS: Linux
OS Version: #1 SMP Thu Jan 11 04:09:03 UTC 2024
Python Version: 3.10.13 | packaged by conda-forge | (main, Dec 23 2023, 15:36:39) [GCC 12.3.0]
Package Information
langchain_core: 0.1.45
langchain: 0.1.16
langchain_community: 0.0.34
langsmith: 0.1.49
langchain_chroma: 0.1.0
langchain_cli: 0.0.21
langchain_experimental: 0.0.53
langchain_groq: 0.1.2
langchain_nomic: 0.0.2
langchain_pinecone: 0.0.3
langchain_text_splitters: 0.0.1
langchainhub: 0.1.15
langgraph: 0.0.38
langserve: 0.1.0
I have a hacky fix until this is resolved upstream: you can add a fallback like this
rag_chain_fallback = prompt | llm | StrOutputParser()
rag_chain = rag_chain_fallback.with_fallbacks([rag_chain_fallback])
or add a 'retry' like this
rag_chain = prompt | llm | StrOutputParser()
rag_chain = rag_chain.with_retry()
https://python.langchain.com/docs/guides/productionization/fallbacks/
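If the plain .with_retry() is not enough, it also takes a few knobs; a minimal sketch (the values here are illustrative):

```python
from langchain_core.output_parsers import StrOutputParser

# Retry only the ValueError that the Ollama client raises on a 400,
# with exponential backoff between attempts.
rag_chain = (prompt | llm | StrOutputParser()).with_retry(
    retry_if_exception_type=(ValueError,),
    wait_exponential_jitter=True,
    stop_after_attempt=5,  # illustrative; the default is 3
)
```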
Thanks, will try it and update here if I face any issues.
with_retry() works well as of now, thanks @Bassileios. Could others please confirm whether you are facing this issue too? It helps to know I am not the only one.
@SinghJivjot I have this exact issue as well, the same error in the same way.
Exact error:
File "/home/Test/.local/lib/python3.10/site-packages/langchain_core/language_models/chat_models.py", line 166, in invoke
self.generate_prompt(
File "/home/Test/.local/lib/python3.10/site-packages/langchain_core/language_models/chat_models.py", line 544, in generate_prompt
return self.generate(prompt_messages, stop=stop, callbacks=callbacks, **kwargs)
File "/home/Test/.local/lib/python3.10/site-packages/langchain_core/language_models/chat_models.py", line 408, in generate
raise e
File "/home/Test/.local/lib/python3.10/site-packages/langchain_core/language_models/chat_models.py", line 398, in generate
self._generate_with_cache(
File "/home/Test/.local/lib/python3.10/site-packages/langchain_core/language_models/chat_models.py", line 577, in _generate_with_cache
return self._generate(
File "/home/Test/.local/lib/python3.10/site-packages/langchain_community/chat_models/ollama.py", line 255, in _generate
final_chunk = self._chat_stream_with_aggregation(
File "/home/Test/.local/lib/python3.10/site-packages/langchain_community/chat_models/ollama.py", line 188, in _chat_stream_with_aggregation
for stream_resp in self._create_chat_stream(messages, stop, **kwargs):
File "/home/Test/.local/lib/python3.10/site-packages/langchain_community/chat_models/ollama.py", line 161, in _create_chat_stream
yield from self._create_stream(
File "/home/Test/.local/lib/python3.10/site-packages/langchain_community/llms/ollama.py", line 240, in _create_stream
raise ValueError(
ValueError: Ollama call failed with status code 400. Details: unexpected server status: 1
The Ollama logs show:
level=ERROR source=prompt.go:86 msg="failed to encode prompt" err="unexpected server status: 1"
If that helps anyone.
I can dig deeper if required, as it is very reproducible for me even with .with_retry().
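One way to check whether the 400 really comes from the Ollama server rather than from LangChain is to call the Ollama chat endpoint directly; a quick sketch (the model tag is a placeholder):

```python
import requests

# Ollama's /api/chat endpoint; a 400 here with "failed to encode prompt" in the
# server log would point at Ollama itself rather than langchain_community.
resp = requests.post(
    "http://localhost:11434/api/chat",
    json={
        "model": "llama3",  # placeholder: use the same model tag as in the failing run
        "messages": [{"role": "user", "content": "hello"}],
        "stream": False,
    },
)
print(resp.status_code, resp.text)
```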
Retry did work for all the chains in the llama local notebook, except for this line:
# Compile
app = workflow.compile()
# Test
from pprint import pprint
inputs = {"question": "What are the types of agent memory?"}
for output in app.stream(inputs): # throws the same error
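As far as I can tell, the failure here is raised inside a graph node (retrieval_grader.invoke in grade_documents, see the trace below), so a .with_retry() on an outer chain does not cover it. A sketch of two workarounds (chain names are taken from the notebook and the traceback):

```python
# Option 1: attach the retry to the chain that actually runs inside the node,
# before the graph is compiled/run.
retrieval_grader = retrieval_grader.with_retry(
    retry_if_exception_type=(ValueError,),
    stop_after_attempt=3,
)

# Option 2: retry the whole graph run manually.
for attempt in range(3):
    try:
        for output in app.stream(inputs):
            pprint(output)
        break
    except ValueError:
        if attempt == 2:
            raise
```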
EDIT: providing the full trace of the error per @SinghJivjot's request
Traceback (most recent call last):
File "/home/yan/code_ws/rag-experiment/langgraph_rag_agent_llama3.py", line 421, in <module>
for output in app.stream(inputs):
File "/home/yan/miniconda3/envs/oscopilot/lib/python3.10/site-packages/langgraph/pregel/__init__.py", line 710, in stream
_panic_or_proceed(done, inflight, step)
File "/home/yan/miniconda3/envs/oscopilot/lib/python3.10/site-packages/langgraph/pregel/__init__.py", line 1126, in _panic_or_proceed
raise exc
File "/home/yan/miniconda3/envs/oscopilot/lib/python3.10/concurrent/futures/thread.py", line 58, in run
result = self.fn(*self.args, **self.kwargs)
File "/home/yan/miniconda3/envs/oscopilot/lib/python3.10/site-packages/langchain_core/runnables/base.py", line 2499, in invoke
input = step.invoke(
File "/home/yan/miniconda3/envs/oscopilot/lib/python3.10/site-packages/langchain_core/runnables/base.py", line 3963, in invoke
return self._call_with_config(
File "/home/yan/miniconda3/envs/oscopilot/lib/python3.10/site-packages/langchain_core/runnables/base.py", line 1626, in _call_with_config
context.run(
File "/home/yan/miniconda3/envs/oscopilot/lib/python3.10/site-packages/langchain_core/runnables/config.py", line 347, in call_func_with_variable_args
return func(input, **kwargs) # type: ignore[call-arg]
File "/home/yan/miniconda3/envs/oscopilot/lib/python3.10/site-packages/langchain_core/runnables/base.py", line 3837, in _invoke
output = call_func_with_variable_args(
File "/home/yan/miniconda3/envs/oscopilot/lib/python3.10/site-packages/langchain_core/runnables/config.py", line 347, in call_func_with_variable_args
return func(input, **kwargs) # type: ignore[call-arg]
File "/home/yan/code_ws/rag-experiment/langgraph_rag_agent_llama3.py", line 247, in grade_documents
score = retrieval_grader.invoke({"question": question, "document": d.page_content})
File "/home/yan/miniconda3/envs/oscopilot/lib/python3.10/site-packages/langchain_core/runnables/base.py", line 2499, in invoke
input = step.invoke(
File "/home/yan/miniconda3/envs/oscopilot/lib/python3.10/site-packages/langchain_core/language_models/chat_models.py", line 158, in invoke
self.generate_prompt(
File "/home/yan/miniconda3/envs/oscopilot/lib/python3.10/site-packages/langchain_core/language_models/chat_models.py", line 560, in generate_prompt
return self.generate(prompt_messages, stop=stop, callbacks=callbacks, **kwargs)
File "/home/yan/miniconda3/envs/oscopilot/lib/python3.10/site-packages/langchain_core/language_models/chat_models.py", line 421, in generate
raise e
File "/home/yan/miniconda3/envs/oscopilot/lib/python3.10/site-packages/langchain_core/language_models/chat_models.py", line 411, in generate
self._generate_with_cache(
File "/home/yan/miniconda3/envs/oscopilot/lib/python3.10/site-packages/langchain_core/language_models/chat_models.py", line 632, in _generate_with_cache
result = self._generate(
File "/home/yan/miniconda3/envs/oscopilot/lib/python3.10/site-packages/langchain_community/chat_models/ollama.py", line 259, in _generate
final_chunk = self._chat_stream_with_aggregation(
File "/home/yan/miniconda3/envs/oscopilot/lib/python3.10/site-packages/langchain_community/chat_models/ollama.py", line 190, in _chat_stream_with_aggregation
for stream_resp in self._create_chat_stream(messages, stop, **kwargs):
File "/home/yan/miniconda3/envs/oscopilot/lib/python3.10/site-packages/langchain_community/chat_models/ollama.py", line 162, in _create_chat_stream
yield from self._create_stream(
File "/home/yan/miniconda3/envs/oscopilot/lib/python3.10/site-packages/langchain_community/llms/ollama.py", line 251, in _create_stream
raise ValueError(
ValueError: Ollama call failed with status code 400. Details: {"error":"unexpected server status: 1"}
Ollama server log shows the same "failed to encode prompt" error:
Apr 25 23:40:18 yan-Legion-T7-34IAZ7 ollama[1708583]: time=2024-04-25T23:40:18.056-04:00 level=ERROR source=prompt.go:86 msg="failed to encode prompt" err="unexpected server status: >
Apr 25 23:40:18 yan-Legion-T7-34IAZ7 ollama[1708583]: [GIN] 2024/04/25 - 23:40:18 | 400 | 66.817677ms | 127.0.0.1 | POST "/api/chat"
@y22ma Please send the full trace
The workaround works (sometimes) after updating Ollama to 0.1.9
Same issue here, with my trace:
root@8cb277eb03b5:/home/python# python test.py
Traceback (most recent call last):
File "/home/python/test.py", line 24, in <module>
result = smart_scraper_graph.run()
File "/usr/local/lib/python3.9/site-packages/scrapegraphai/graphs/smart_scraper_graph.py", line 116, in run
self.final_state, self.execution_info = self.graph.execute(inputs)
File "/usr/local/lib/python3.9/site-packages/scrapegraphai/graphs/base_graph.py", line 107, in execute
result = current_node.execute(state)
File "/usr/local/lib/python3.9/site-packages/scrapegraphai/nodes/generate_answer_node.py", line 141, in execute
answer = merge_chain.invoke(
File "/usr/local/lib/python3.9/site-packages/langchain_core/runnables/base.py", line 2499, in invoke
input = step.invoke(
File "/usr/local/lib/python3.9/site-packages/langchain_core/language_models/chat_models.py", line 158, in invoke
self.generate_prompt(
File "/usr/local/lib/python3.9/site-packages/langchain_core/language_models/chat_models.py", line 560, in generate_prompt
return self.generate(prompt_messages, stop=stop, callbacks=callbacks, **kwargs)
File "/usr/local/lib/python3.9/site-packages/langchain_core/language_models/chat_models.py", line 421, in generate
raise e
File "/usr/local/lib/python3.9/site-packages/langchain_core/language_models/chat_models.py", line 411, in generate
self._generate_with_cache(
File "/usr/local/lib/python3.9/site-packages/langchain_core/language_models/chat_models.py", line 632, in _generate_with_cache
result = self._generate(
File "/usr/local/lib/python3.9/site-packages/langchain_community/chat_models/ollama.py", line 259, in _generate
final_chunk = self._chat_stream_with_aggregation(
File "/usr/local/lib/python3.9/site-packages/langchain_community/chat_models/ollama.py", line 190, in _chat_stream_with_aggregation
for stream_resp in self._create_chat_stream(messages, stop, **kwargs):
File "/usr/local/lib/python3.9/site-packages/langchain_community/chat_models/ollama.py", line 162, in _create_chat_stream
yield from self._create_stream(
File "/usr/local/lib/python3.9/site-packages/langchain_community/llms/ollama.py", line 251, in _create_stream
raise ValueError(
ValueError: Ollama call failed with status code 400. Details: {"error":"unexpected server status: 1"}