
Error 400 from Ollama during generation at random cell runs

Open SinghJivjot opened this issue 10 months ago • 9 comments

Checked other resources

  • [X] I added a very descriptive title to this issue.
  • [X] I searched the LangChain documentation with the integrated search.
  • [X] I used the GitHub search to find a similar question and didn't find it.
  • [X] I am sure that this is a bug in LangChain rather than my code.
  • [X] The bug is not resolved by updating to the latest stable version of LangChain (or the specific integration package).

Example Code

from langchain.prompts import PromptTemplate
from langchain_core.output_parsers import StrOutputParser
from langchain_community.chat_models import ChatOllama  # needed for ChatOllama below

system_message = '''You are an assistant for question-answering tasks. Use the following pieces of retrieved context to answer the question. If you don't know the answer, just say that you don't know. Use three sentences maximum and keep the answer concise.'''

user_message = '''Question: {question} Context: {context} Answer:'''

prompt = PromptTemplate(
    template=llama3_prompt_template.format(system_message=system_message, user_message=user_message),
    input_variables=['question', 'context'],
)

llm = ChatOllama(model=local_llm, temperature=0)

def format_docs(docs):
    return "\n\n".join([doc.page_content for doc in docs])

rag_chain = prompt | llm | StrOutputParser()

question = 'agent memory'
docs = retriever.invoke(question)
generation = rag_chain.invoke({'question': question, 'context': docs})
print(generation)
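For reference, llama3_prompt_template, local_llm, and retriever are defined earlier in the notebook and are not reproduced in this snippet. As a hedged illustration only (the template text below is an assumption based on the usual Llama 3 instruct format, not copied from the notebook), the template wraps {system_message} and {user_message} roughly like this:

# Illustrative sketch, not the notebook's actual definition.
llama3_prompt_template = (
    "<|begin_of_text|><|start_header_id|>system<|end_header_id|>\n\n"
    "{system_message}<|eot_id|><|start_header_id|>user<|end_header_id|>\n\n"
    "{user_message}<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\n"
)

Because str.format() only fills {system_message} and {user_message}, the {question} and {context} placeholders inside user_message survive the call and are later filled by PromptTemplate.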

Error Message and Stack Trace (if applicable)

ValueError('Ollama call failed with status code 400. Details: {"error":"unexpected server status: 1"}')

Traceback (most recent call last):
  File "/home/darthcoder/miniconda3/envs/LangChain/lib/python3.10/site-packages/langchain_core/language_models/chat_models.py", line 411, in generate
    self._generate_with_cache(
  File "/home/darthcoder/miniconda3/envs/LangChain/lib/python3.10/site-packages/langchain_core/language_models/chat_models.py", line 632, in _generate_with_cache
    result = self._generate(
  File "/home/darthcoder/miniconda3/envs/LangChain/lib/python3.10/site-packages/langchain_community/chat_models/ollama.py", line 259, in _generate
    final_chunk = self._chat_stream_with_aggregation(
  File "/home/darthcoder/miniconda3/envs/LangChain/lib/python3.10/site-packages/langchain_community/chat_models/ollama.py", line 190, in _chat_stream_with_aggregation
    for stream_resp in self._create_chat_stream(messages, stop, **kwargs):
  File "/home/darthcoder/miniconda3/envs/LangChain/lib/python3.10/site-packages/langchain_community/chat_models/ollama.py", line 162, in _create_chat_stream
    yield from self._create_stream(
  File "/home/darthcoder/miniconda3/envs/LangChain/lib/python3.10/site-packages/langchain_community/llms/ollama.py", line 251, in _create_stream
    raise ValueError(
ValueError: Ollama call failed with status code 400. Details: {"error":"unexpected server status: 1"}

Description

I am implementing this demo - https://github.com/langchain-ai/langgraph/blob/main/examples/rag/langgraph_rag_agent_llama3_local.ipynb - from LangChain's YouTube video. The same cell sometimes runs during generation but sometimes gives the Error 400. This happens for other code cells too, at random.

System Info

System Information

OS: Linux
OS Version: #1 SMP Thu Jan 11 04:09:03 UTC 2024
Python Version: 3.10.13 | packaged by conda-forge | (main, Dec 23 2023, 15:36:39) [GCC 12.3.0]

Package Information

langchain_core: 0.1.45
langchain: 0.1.16
langchain_community: 0.0.34
langsmith: 0.1.49
langchain_chroma: 0.1.0
langchain_cli: 0.0.21
langchain_experimental: 0.0.53
langchain_groq: 0.1.2
langchain_nomic: 0.0.2
langchain_pinecone: 0.0.3
langchain_text_splitters: 0.0.1
langchainhub: 0.1.15
langgraph: 0.0.38
langserve: 0.1.0

SinghJivjot avatar Apr 23 '24 06:04 SinghJivjot

I have a hacky fix until it is fixed upstream; you can add a fallback like this:

rag_chain_fallback = prompt | llm | StrOutputParser()
rag_chain = rag_chain_fallback.with_fallbacks([rag_chain_fallback])

or add a 'retry' like this:

rag_chain = prompt | llm | StrOutputParser()
rag_chain = rag_chain.with_retry()

https://python.langchain.com/docs/guides/productionization/fallbacks/
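For a slightly more explicit retry, with_retry() on any LangChain Runnable accepts retry_if_exception_type, wait_exponential_jitter, and stop_after_attempt. A minimal sketch applied to the rag_chain above (the values are illustrative, and this only papers over the underlying 400):

rag_chain = (prompt | llm | StrOutputParser()).with_retry(
    retry_if_exception_type=(ValueError,),  # the Ollama 400 surfaces as a ValueError
    wait_exponential_jitter=True,           # jittered exponential backoff between attempts
    stop_after_attempt=3,                   # give up after three attempts
)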

Bassileios avatar Apr 23 '24 11:04 Bassileios

Thanks man, will try and update if any issues faced

SinghJivjot avatar Apr 23 '24 18:04 SinghJivjot

with_retry() works well as of now, thanks @Bassileios. Others, please verify whether you are facing this issue too; it helps to know I am not the only one.

SinghJivjot avatar Apr 24 '24 10:04 SinghJivjot

@SinghJivjot I have this exact issue as well. Same error, in the same way.

Sloox avatar Apr 24 '24 15:04 Sloox

Exact error:

File "/home/Test/.local/lib/python3.10/site-packages/langchain_core/language_models/chat_models.py", line 166, in invoke
    self.generate_prompt(
  File "/home/Test/.local/lib/python3.10/site-packages/langchain_core/language_models/chat_models.py", line 544, in generate_prompt
    return self.generate(prompt_messages, stop=stop, callbacks=callbacks, **kwargs)
  File "/home/Test/.local/lib/python3.10/site-packages/langchain_core/language_models/chat_models.py", line 408, in generate
    raise e
  File "/home/Test/.local/lib/python3.10/site-packages/langchain_core/language_models/chat_models.py", line 398, in generate
    self._generate_with_cache(
  File "/home/Test/.local/lib/python3.10/site-packages/langchain_core/language_models/chat_models.py", line 577, in _generate_with_cache
    return self._generate(
  File "/home/Test/.local/lib/python3.10/site-packages/langchain_community/chat_models/ollama.py", line 255, in _generate
    final_chunk = self._chat_stream_with_aggregation(
  File "/home/Test/.local/lib/python3.10/site-packages/langchain_community/chat_models/ollama.py", line 188, in _chat_stream_with_aggregation
    for stream_resp in self._create_chat_stream(messages, stop, **kwargs):
  File "/home/Test/.local/lib/python3.10/site-packages/langchain_community/chat_models/ollama.py", line 161, in _create_chat_stream
    yield from self._create_stream(
  File "/home/Test/.local/lib/python3.10/site-packages/langchain_community/llms/ollama.py", line 240, in _create_stream
    raise ValueError(
ValueError: Ollama call failed with status code 400. Details: unexpected server status: 1

The Ollama logs show level=ERROR source=prompt.go:86 msg="failed to encode prompt" err="unexpected server status: 1", if that helps anyone. I can dig deeper if required, as it's very reproducible for me even with .with_retry().
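To check whether the 400 comes from the Ollama server itself rather than from LangChain, you can call the /api/chat endpoint directly. A minimal sketch, assuming Ollama is listening on the default localhost:11434 and the model is named "llama3" (substitute whatever local_llm points to):

import requests

resp = requests.post(
    "http://localhost:11434/api/chat",
    json={
        "model": "llama3",  # assumption: use the same model name passed to ChatOllama
        "messages": [{"role": "user", "content": "hello"}],
        "stream": False,
    },
)
print(resp.status_code, resp.text)  # a 400 here reproduces the problem without LangChain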

Sloox avatar Apr 24 '24 15:04 Sloox

Retry did work for all the chains in the llama local notebook, except for this line:

# Compile
app = workflow.compile()

# Test
from pprint import pprint
inputs = {"question": "What are the types of agent memory?"}
for output in app.stream(inputs): # throws the same error

EDIT: providing the full trace of the error per @SinghJivjot's request

Traceback (most recent call last):
  File "/home/yan/code_ws/rag-experiment/langgraph_rag_agent_llama3.py", line 421, in <module>
    for output in app.stream(inputs):
  File "/home/yan/miniconda3/envs/oscopilot/lib/python3.10/site-packages/langgraph/pregel/__init__.py", line 710, in stream
    _panic_or_proceed(done, inflight, step)
  File "/home/yan/miniconda3/envs/oscopilot/lib/python3.10/site-packages/langgraph/pregel/__init__.py", line 1126, in _panic_or_proceed
    raise exc
  File "/home/yan/miniconda3/envs/oscopilot/lib/python3.10/concurrent/futures/thread.py", line 58, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/home/yan/miniconda3/envs/oscopilot/lib/python3.10/site-packages/langchain_core/runnables/base.py", line 2499, in invoke
    input = step.invoke(
  File "/home/yan/miniconda3/envs/oscopilot/lib/python3.10/site-packages/langchain_core/runnables/base.py", line 3963, in invoke
    return self._call_with_config(
  File "/home/yan/miniconda3/envs/oscopilot/lib/python3.10/site-packages/langchain_core/runnables/base.py", line 1626, in _call_with_config
    context.run(
  File "/home/yan/miniconda3/envs/oscopilot/lib/python3.10/site-packages/langchain_core/runnables/config.py", line 347, in call_func_with_variable_args
    return func(input, **kwargs)  # type: ignore[call-arg]
  File "/home/yan/miniconda3/envs/oscopilot/lib/python3.10/site-packages/langchain_core/runnables/base.py", line 3837, in _invoke
    output = call_func_with_variable_args(
  File "/home/yan/miniconda3/envs/oscopilot/lib/python3.10/site-packages/langchain_core/runnables/config.py", line 347, in call_func_with_variable_args
    return func(input, **kwargs)  # type: ignore[call-arg]
  File "/home/yan/code_ws/rag-experiment/langgraph_rag_agent_llama3.py", line 247, in grade_documents
    score = retrieval_grader.invoke({"question": question, "document": d.page_content})
  File "/home/yan/miniconda3/envs/oscopilot/lib/python3.10/site-packages/langchain_core/runnables/base.py", line 2499, in invoke
    input = step.invoke(
  File "/home/yan/miniconda3/envs/oscopilot/lib/python3.10/site-packages/langchain_core/language_models/chat_models.py", line 158, in invoke
    self.generate_prompt(
  File "/home/yan/miniconda3/envs/oscopilot/lib/python3.10/site-packages/langchain_core/language_models/chat_models.py", line 560, in generate_prompt
    return self.generate(prompt_messages, stop=stop, callbacks=callbacks, **kwargs)
  File "/home/yan/miniconda3/envs/oscopilot/lib/python3.10/site-packages/langchain_core/language_models/chat_models.py", line 421, in generate
    raise e
  File "/home/yan/miniconda3/envs/oscopilot/lib/python3.10/site-packages/langchain_core/language_models/chat_models.py", line 411, in generate
    self._generate_with_cache(
  File "/home/yan/miniconda3/envs/oscopilot/lib/python3.10/site-packages/langchain_core/language_models/chat_models.py", line 632, in _generate_with_cache
    result = self._generate(
  File "/home/yan/miniconda3/envs/oscopilot/lib/python3.10/site-packages/langchain_community/chat_models/ollama.py", line 259, in _generate
    final_chunk = self._chat_stream_with_aggregation(
  File "/home/yan/miniconda3/envs/oscopilot/lib/python3.10/site-packages/langchain_community/chat_models/ollama.py", line 190, in _chat_stream_with_aggregation
    for stream_resp in self._create_chat_stream(messages, stop, **kwargs):
  File "/home/yan/miniconda3/envs/oscopilot/lib/python3.10/site-packages/langchain_community/chat_models/ollama.py", line 162, in _create_chat_stream
    yield from self._create_stream(
  File "/home/yan/miniconda3/envs/oscopilot/lib/python3.10/site-packages/langchain_community/llms/ollama.py", line 251, in _create_stream
    raise ValueError(
ValueError: Ollama call failed with status code 400. Details: {"error":"unexpected server status: 1"}

Ollama server log shows the same "failed to encode prompt" error:

Apr 25 23:40:18 yan-Legion-T7-34IAZ7 ollama[1708583]: time=2024-04-25T23:40:18.056-04:00 level=ERROR source=prompt.go:86 msg="failed to encode prompt" err="unexpected server status: >
Apr 25 23:40:18 yan-Legion-T7-34IAZ7 ollama[1708583]: [GIN] 2024/04/25 - 23:40:18 | 400 |   66.817677ms |       127.0.0.1 | POST     "/api/chat"
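Since the failure happens inside the grade_documents node, a retry wrapped around the compiled app does not reach the chain that actually calls Ollama. A hedged workaround sketch (assuming retrieval_grader is the grader chain built earlier in the notebook) is to attach the retry to that inner chain before compiling the graph:

# Workaround sketch, not a fix: retry the inner runnable that calls Ollama,
# then compile the graph as usual.
retrieval_grader = retrieval_grader.with_retry(stop_after_attempt=3)
app = workflow.compile()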

y22ma avatar Apr 26 '24 12:04 y22ma

@y22ma Please send the full trace

SinghJivjot avatar Apr 26 '24 13:04 SinghJivjot

The workaround works (sometimes) after updating Ollama to 0.1.9.

luca-git avatar May 01 '24 19:05 luca-git

Same issue here, with my trace:

root@8cb277eb03b5:/home/python# python test.py
Traceback (most recent call last):
  File "/home/python/test.py", line 24, in <module>
    result = smart_scraper_graph.run()
  File "/usr/local/lib/python3.9/site-packages/scrapegraphai/graphs/smart_scraper_graph.py", line 116, in run
    self.final_state, self.execution_info = self.graph.execute(inputs)
  File "/usr/local/lib/python3.9/site-packages/scrapegraphai/graphs/base_graph.py", line 107, in execute
    result = current_node.execute(state)
  File "/usr/local/lib/python3.9/site-packages/scrapegraphai/nodes/generate_answer_node.py", line 141, in execute
    answer = merge_chain.invoke(
  File "/usr/local/lib/python3.9/site-packages/langchain_core/runnables/base.py", line 2499, in invoke
    input = step.invoke(
  File "/usr/local/lib/python3.9/site-packages/langchain_core/language_models/chat_models.py", line 158, in invoke
    self.generate_prompt(
  File "/usr/local/lib/python3.9/site-packages/langchain_core/language_models/chat_models.py", line 560, in generate_prompt
    return self.generate(prompt_messages, stop=stop, callbacks=callbacks, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/langchain_core/language_models/chat_models.py", line 421, in generate
    raise e
  File "/usr/local/lib/python3.9/site-packages/langchain_core/language_models/chat_models.py", line 411, in generate
    self._generate_with_cache(
  File "/usr/local/lib/python3.9/site-packages/langchain_core/language_models/chat_models.py", line 632, in _generate_with_cache
    result = self._generate(
  File "/usr/local/lib/python3.9/site-packages/langchain_community/chat_models/ollama.py", line 259, in _generate
    final_chunk = self._chat_stream_with_aggregation(
  File "/usr/local/lib/python3.9/site-packages/langchain_community/chat_models/ollama.py", line 190, in _chat_stream_with_aggregation
    for stream_resp in self._create_chat_stream(messages, stop, **kwargs):
  File "/usr/local/lib/python3.9/site-packages/langchain_community/chat_models/ollama.py", line 162, in _create_chat_stream
    yield from self._create_stream(
  File "/usr/local/lib/python3.9/site-packages/langchain_community/llms/ollama.py", line 251, in _create_stream
    raise ValueError(
ValueError: Ollama call failed with status code 400. Details: {"error":"unexpected server status: 1"}

yichaosun avatar May 17 '24 04:05 yichaosun