Ollama tool calls not working via OpenAI proxy only when using LangGraph
Checked other resources
- [X] I added a very descriptive title to this issue.
- [X] I searched the LangGraph/LangChain documentation with the integrated search.
- [X] I used the GitHub search to find a similar question and didn't find it.
- [X] I am sure that this is a bug in LangGraph/LangChain rather than my code.
- [X] I am sure this is better as an issue rather than a GitHub discussion, since this is a LangGraph bug and not a design question.
Example Code
```python
from typing import Literal

from langchain_core.messages import HumanMessage
from langchain_core.tools import tool
from langchain_openai import ChatOpenAI
from langgraph.checkpoint import MemorySaver
from langgraph.graph import END, MessagesState, StateGraph
from langgraph.prebuilt import ToolNode
import asyncio


@tool
def search(query: str):
    """Call to surf the web."""
    if "sf" in query.lower() or "san francisco" in query.lower():
        return "It's 60 degrees and foggy."
    return "It's 90 degrees and sunny."


tools = [search]
tool_node = ToolNode(tools)

model = ChatOpenAI(
    model="llama3.1", base_url="http://localhost:11434/v1", temperature=0
).bind_tools(tools)


def should_continue(state: MessagesState) -> Literal["tools", END]:
    messages = state["messages"]
    last_message = messages[-1]
    if last_message.tool_calls:
        return "tools"
    return END


async def call_model(state: MessagesState, config):
    messages = state["messages"]
    response = await model.ainvoke(messages, config)
    return {"messages": [response]}


workflow = StateGraph(MessagesState)
workflow.add_node("agent", call_model)
workflow.add_node("tools", tool_node)
workflow.set_entry_point("agent")
workflow.add_conditional_edges(
    "agent",
    should_continue,
)
workflow.add_edge("tools", "agent")

checkpointer = MemorySaver()
app = workflow.compile(checkpointer=checkpointer)


async def test():
    async for event in app.astream_events(
        {"messages": [HumanMessage(content="what is the weather in sf")]},
        version="v1",
        config={"configurable": {"thread_id": 42}},
    ):
        print(event)


asyncio.run(test())
```
Error Message and Stack Trace (if applicable)
No response
Description
I want to invoke a tool-calling-compatible Ollama model through the ChatOpenAI proxy. However, using the code above, the model does not properly make a tool call:
{'event': 'on_chain_end', 'data': {'output': {'messages': [HumanMessage(content='what is the weather in sf', id='f7017ae4-b2d0-49e3-b939-69738686368b'), AIMessage(content='{"name": "search", "parameters": {"query": "sf weather"}}', response_metadata={'finish_reason': 'stop', 'model_name': 'llama3.1', 'system_fingerprint': 'fp_ollama'}, id='run-6a214185-27ba-4505-9cc2-574f20d04909')]}}, 'run_id': 'b2aeba64-38b0-447a-b7de-eefff49e3555', 'name': 'LangGraph', 'tags': [], 'metadata': {'thread_id': 43}, 'parent_ids': []}
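The failure mode is visible on the returned message itself. A quick check, using a hypothetical `result` holding the final state from `app.ainvoke(...)`, would be:

```python
# Hypothetical check of the final state: when the bug occurs, the tool call
# arrives serialized as plain text in .content instead of parsed .tool_calls,
# so should_continue() never routes to the "tools" node.
last_message = result["messages"][-1]
print(last_message.tool_calls)  # [] when the bug occurs
print(last_message.content)     # '{"name": "search", "parameters": {"query": "sf weather"}}'
```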
However, the behaviour is different when using plain LangChain:
```python
import asyncio

import openai
from langchain_core.messages import HumanMessage
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain_core.tools import tool
from langchain_openai import ChatOpenAI


@tool
def search(query: str):
    """Call to surf the web."""
    if "sf" in query.lower() or "san francisco" in query.lower():
        return "It's 60 degrees and foggy."
    return "It's 90 degrees and sunny."


model = ChatOpenAI(model="llama3.1", base_url="http://localhost:11434/v1", temperature=0)
model_with_tools = model.bind_tools([search])


async def test():
    prompt = ChatPromptTemplate.from_messages(
        [
            MessagesPlaceholder(variable_name="messages"),
        ]
    )
    chain = prompt | model_with_tools
    response = await chain.ainvoke(
        {"messages": [HumanMessage(content="What is the weather like in sf")]}
    )
    return response


response = asyncio.run(test())
print(response)
```
This way the model correctly makes a tool call:
content='' additional_kwargs={'tool_calls': [{'id': 'call_xncx3ycn', 'function': {'arguments': '{"query":"sf weather"}', 'name': 'search'}, 'type': 'function'}]} response_metadata={'token_usage': {'completion_tokens': 17, 'prompt_tokens': 148, 'total_tokens': 165}, 'model_name': 'llama3.1', 'system_fingerprint': 'fp_ollama', 'finish_reason': 'stop', 'logprobs': None} id='run-63a9efd2-6619-448b-9a89-476f45cfb5c8-0' tool_calls=[{'name': 'search', 'args': {'query': 'sf weather'}, 'id': 'call_xncx3ycn', 'type': 'tool_call'}] usage_metadata={'input_tokens': 148, 'output_tokens': 17, 'total_tokens': 165}
System Info
langchain==0.2.7
langchain-anthropic==0.1.20
langchain-cohere==0.1.5
langchain-community==0.2.7
langchain-core==0.2.21
langchain-google-genai==1.0.5
langchain-ollama==0.1.0
langchain-openai==0.1.17
langchain-qdrant==0.1.1
langchain-text-splitters==0.2.0
langchain-weaviate==0.0.1.post1

platform: mac (Apple silicon)
python version: Python 3.12.2
Oh interesting. How many times did you run both versions? Is it reliably different in both contexts? And then you've confirmed the ollama versions are the same in both scenarios?
Hi @hinthornw, I ran them multiple times (and once more just now, just to be sure) and the behaviour is consistent. Tested both on ollama v0.3.0.
It seems to be caused by the `astream_events` method. Using LangGraph with `astream` works:
```python
...

async def test():
    async for event in app.astream(
        {"messages": [HumanMessage(content="what is the weather in sf")]},
        config={"configurable": {"thread_id": 42}},
    ):
        print(event)


asyncio.run(test())
```
{'agent': {'messages': [AIMessage(content='', additional_kwargs={'tool_calls': [{'id': 'call_rs3ykbgl', 'function': {'arguments': '{"query":"sf weather"}', 'name': 'search'}, 'type': 'function'}]}, response_metadata={'token_usage': {'completion_tokens': 17, 'prompt_tokens': 147, 'total_tokens': 164}, 'model_name': 'llama3.1', 'system_fingerprint': 'fp_ollama', 'finish_reason': 'stop', 'logprobs': None}, id='run-3175464b-50ce-4fa7-afc7-73fe14cf92ee-0', tool_calls=[{'name': 'search', 'args': {'query': 'sf weather'}, 'id': 'call_rs3ykbgl', 'type': 'tool_call'}], usage_metadata={'input_tokens': 147, 'output_tokens': 17, 'total_tokens': 164})]}}
{'tools': {'messages': [ToolMessage(content="It's 60 degrees and foggy.", name='search', tool_call_id='call_rs3ykbgl')]}}
{'agent': {'messages': [AIMessage(content='Based on the tool call response, I can format an answer to your original question:\n\nThe current weather in San Francisco (SF) is 60 degrees with fog.', response_metadata={'token_usage': {'completion_tokens': 34, 'prompt_tokens': 86, 'total_tokens': 120}, 'model_name': 'llama3.1', 'system_fingerprint': 'fp_ollama', 'finish_reason': 'stop', 'logprobs': None}, id='run-0c048f94-63ae-45d1-9808-4083dd65ec0d-0', usage_metadata={'input_tokens': 86, 'output_tokens': 34, 'total_tokens': 120})]}}
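If you specifically need `astream_events`, here is a minimal, untested sketch of filtering the event stream for the chat model's final message (assuming the v2 events API is available in your langchain-core version; on `on_chat_model_end`, `event["data"]["output"]` is the full message):

```python
# Sketch (assumes the v2 events API): pull parsed tool calls off the
# chat model's final message instead of printing every raw event.
async def test_events():
    async for event in app.astream_events(
        {"messages": [HumanMessage(content="what is the weather in sf")]},
        version="v2",
        config={"configurable": {"thread_id": 42}},
    ):
        if event["event"] == "on_chat_model_end":
            message = event["data"]["output"]
            if getattr(message, "tool_calls", None):
                print("tool calls:", message.tool_calls)


asyncio.run(test_events())
```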
I encountered the same issue. How can I get tool_calls and events when using `astream_events`? Should I set a specific config, or is it not supported in langgraph?
> It seems to be caused by the `astream_events` method. Using LangGraph with `astream` works: [...]
It doesn't work for me... I still cannot call functions...
{'messages': ['what is the weather in sf',
AIMessage(content='', additional_kwargs={'tool_calls': [{'id': 'call_wurji20e', 'function': {'arguments': '{"query":"weather in sf"}', 'name': 'search'}, 'type': 'function'}], 'refusal': None}, response_metadata={'token_usage': {'completion_tokens': 18, 'prompt_tokens': 209, 'total_tokens': 227, 'completion_tokens_details': None, 'prompt_tokens_details': None}, 'model_name': 'llama3.1', 'system_fingerprint': 'fp_ollama', 'finish_reason': 'tool_calls', 'logprobs': None}, id='run-8834569e-3531-4c0e-a47b-bca59c600db1-0', tool_calls=[{'name': 'search', 'args': {'query': 'weather in sf'}, 'id': 'call_wurji20e', 'type': 'tool_call'}], usage_metadata={'input_tokens': 209, 'output_tokens': 18, 'total_tokens': 227, 'input_token_details': {}, 'output_token_details': {}})]}
Thanks, it works.
Has this been solved in LangGraph? I have the same issue. I tried using `llm = ChatOllama(model="llama3.1")` for simplicity, but the agent seems not to make the tool call: `{'tools': {'messages': []}}`.
My create_agent function:
```python
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder


def create_agent(llm, tools, system_message: str):
    """Create an agent."""
    base_message = (
        "You are a helpful AI assistant, collaborating with other assistants."
        " If you are unable to fully answer, that's OK, another assistant with different tools"
        " will help where you left off. Execute what you can to make progress."
    )
    if tools:
        prompt = ChatPromptTemplate.from_messages([
            (
                "system",
                base_message +
                " You have access to the following tools: {tool_names}.\n{system_message}",
            ),
            MessagesPlaceholder(variable_name="messages"),
        ])
        prompt = prompt.partial(system_message=system_message)
        prompt = prompt.partial(tool_names=", ".join([tool.name for tool in tools]))
        return prompt | llm.bind_tools(tools)
    else:
        prompt = ChatPromptTemplate.from_messages([
            (
                "system",
                base_message + "\n{system_message}",
            ),
            MessagesPlaceholder(variable_name="messages"),
        ])
        prompt = prompt.partial(system_message=system_message)
        return prompt | llm
```
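For context, a minimal usage sketch of the helper above (the model, tool, and message are assumptions carried over from earlier in the thread):

```python
# Hypothetical usage of create_agent, reusing the search tool and
# llama3.1 model from earlier in this thread.
from langchain_core.messages import HumanMessage
from langchain_ollama import ChatOllama

llm = ChatOllama(model="llama3.1", temperature=0)
agent = create_agent(llm, [search], system_message="Answer weather questions.")
response = agent.invoke({"messages": [HumanMessage(content="what is the weather in sf")]})
print(response.tool_calls)  # empty when the model answered in plain text instead
```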
I think this may be related. I am seeing issues with `astream` and `ChatOllama` not being able to run tools. Using the LLM before binding tools works, but after calling `.bind_tools()` it just sends two responses and then silence.
```python
# this can be used with astream, it doesn't have tools bound
self.llm = ChatOllama(
    model="llama3.1",
    base_url="http://192.168.0.14:11434/",
    temperature=0.0,
    seed=42,
    stream=True,
)

# this can't be used with astream
self.llm_with_tools = self.llm.bind_tools([tavily_search])

# In the chatbot function
async for token in self.llm.astream(messages):  # doesn't work with an LLM that has bind_tools()
    logger.info(token)
```
Output with the LLM that has no tools bound:
2025-02-02 21:53:30.840 | INFO | glados.glados:chatbot:46 - messages: [HumanMessage(content='hi', additional_kwargs={}, response_metadata={}, id='d9065e8f-8db4-4ec8-a1a8-0423a332649c')]
2025-02-02 21:53:32.585 | INFO | glados.glados:chatbot:50 - content='How' additional_kwargs={} response_metadata={} id='run-0277b3e0-1ec9-43fe-88b6-0386951b6112'
2025-02-02 21:53:32.599 | INFO | glados.glados:chatbot:50 - content="'s" additional_kwargs={} response_metadata={} id='run-0277b3e0-1ec9-43fe-88b6-0386951b6112'
2025-02-02 21:53:32.608 | INFO | glados.glados:chatbot:50 - content=' it' additional_kwargs={} response_metadata={} id='run-0277b3e0-1ec9-43fe-88b6-0386951b6112'
2025-02-02 21:53:32.617 | INFO | glados.glados:chatbot:50 - content=' going' additional_kwargs={} response_metadata={} id='run-0277b3e0-1ec9-43fe-88b6-0386951b6112'
2025-02-02 21:53:32.626 | INFO | glados.glados:chatbot:50 - content='?' additional_kwargs={} response_metadata={} id='run-0277b3e0-1ec9-43fe-88b6-0386951b6112'
2025-02-02 21:53:32.626 | INFO | glados.glados:send_to_tts:66 - 🔊 Sending to TTS: How 's it going ?
The output when tools have been bound.
User: hi
2025-02-02 21:54:44.004 | INFO | glados.glados:chatbot:46 - messages: [HumanMessage(content='hi', additional_kwargs={}, response_metadata={}, id='41ea7fbd-1bee-4d0f-a0c6-fa82ea8f4227')]
2025-02-02 21:54:44.417 | INFO | glados.glados:chatbot:50 - content='' additional_kwargs={} response_metadata={} id='run-5cf7bb0b-1744-484d-82e3-6680a7755c32' tool_calls=[{'name': 'tavily_search_results_json', 'args': {'query': 'hi'}, 'id': '5e76b24e-e324-4635-b6c2-ceb08a150327', 'type': 'tool_call'}] tool_call_chunks=[{'name': 'tavily_search_results_json', 'args': '{"query": "hi"}', 'id': '5e76b24e-e324-4635-b6c2-ceb08a150327', 'index': None, 'type': 'tool_call_chunk'}]
User: 2025-02-02 21:54:44.425 | INFO | glados.glados:chatbot:50 - content='' additional_kwargs={} response_metadata={'model': 'llama3.1', 'created_at': '2025-02-02T20:54:44.4251252Z', 'done': True, 'done_reason': 'stop', 'total_duration': 413467300, 'load_duration': 25397300, 'prompt_eval_count': 189, 'prompt_eval_duration': 86000000, 'eval_count': 21, 'eval_duration': 300000000, 'message': Message(role='assistant', content='', images=None, tool_calls=None)} id='run-5cf7bb0b-1744-484d-82e3-6680a7755c32' usage_metadata={'input_tokens': 189, 'output_tokens': 21, 'total_tokens': 210}
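For what it's worth, the second log line above does contain the tool call, just in `tool_call_chunks` rather than in `content`, so a loop that only reads `.content` sees nothing. A sketch of handling both (variable names taken from the earlier snippet):

```python
# Sketch: with tools bound, streamed AIMessageChunks may carry partial
# tool-call JSON in .tool_call_chunks while .content stays empty.
async for chunk in self.llm_with_tools.astream(messages):
    if chunk.tool_call_chunks:   # partial tool-call arguments arrive here
        logger.info(f"tool chunk: {chunk.tool_call_chunks}")
    elif chunk.content:          # plain text tokens arrive here
        logger.info(f"text: {chunk.content}")
```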
The entire class:
```python
import asyncio
from typing import Annotated

from langchain_community.tools import TavilySearchResults
from langchain_ollama import ChatOllama
from langgraph.graph import StateGraph
from langgraph.graph.message import add_messages
from langgraph.prebuilt import ToolNode, create_react_agent, tools_condition
from loguru import logger
from typing_extensions import TypedDict

tavily_search = TavilySearchResults(max_results=2)


class State(TypedDict):
    messages: Annotated[list, add_messages]


class GlaDOS:
    def __init__(self):
        self.graph_builder = StateGraph(State)

        # this can be used with astream, if it DOESN'T have tools bound
        self.llm = ChatOllama(
            model="llama3.1",
            base_url="http://192.168.0.14:11434/",
            temperature=0.0,
            seed=42,
            stream=True,
        ).bind_tools([tavily_search])  # remove this to get the LLM to work, but without tools :/

        tool_node = ToolNode([tavily_search])
        self.graph_builder.add_node("chatbot", self.chatbot)
        self.graph_builder.add_node("tools", tool_node)
        self.graph_builder.add_conditional_edges("chatbot", tools_condition)
        self.graph_builder.add_edge("tools", "chatbot")
        self.graph_builder.set_entry_point("chatbot")
        self.graph = self.graph_builder.compile()
        print(self.graph.get_graph().draw_mermaid())

    async def chatbot(self, state: State):
        """Asynchronous chatbot function that streams responses and sends partial updates to TTS."""
        messages = state["messages"]
        logger.info(f"messages: {messages}")
        accumulated_text = ""
        async for token in self.llm.astream(messages):  # doesn't work with an LLM that has bind_tools()
            logger.info(token)
            if hasattr(token, "content"):
                token_text = token.content.strip()
                logger.debug(f"Streaming Token: {token_text}")
                accumulated_text += token_text + " "
                yield {"messages": [{"role": "assistant", "content": token_text}]}
                if len(accumulated_text) > 40 or "." in token_text or "!" in token_text or "?" in token_text:
                    self.send_to_tts(accumulated_text.strip())
                    accumulated_text = ""  # Reset buffer after sending
        if accumulated_text:
            self.send_to_tts(accumulated_text.strip())

    def send_to_tts(self, text: str):
        """Send final streamed response to TTS for speaking."""
        if text.strip():
            logger.info(f"🔊 Sending to TTS: {text}")


glados = GlaDOS()


async def stream_graph_updates(user_input: str):
    """Asynchronously streams responses from LangGraph."""
    async for event, metadata in glados.graph.astream(
        {"messages": [{"role": "user", "content": user_input}]},
        stream_mode="messages",
    ):
        logger.debug(f"Received event: {event}")
        # Handle AIMessageChunk directly instead of assuming it's a dict
        if hasattr(event, "content"):
            logger.debug(event.content)
        else:
            logger.warning(f"Unexpected event type: {type(event)}")


async def main():
    """Main loop to interact with the chatbot."""
    while True:
        try:
            user_input = input("User: ")
            if user_input.lower() in ["quit", "exit", "q"]:
                print("Goodbye!")
                break
            await stream_graph_updates(user_input)
        except Exception as e:
            logger.exception(f"Error: {e}")


# Run with asyncio event loop
if __name__ == "__main__":
    asyncio.run(main())
```
> I think this may be related. I am seeing issues with `astream` and `ChatOllama` not being able to run tools. [...]
I read a paper acknowledging that open-source models still lack the ability to call tools in most cases, so I guess it's not a problem with this framework or anything related.
> I read a paper acknowledging that open-source models still lack the ability to call tools in most cases [...]
I don't think it's a model issue. I'm using a function-calling model, and it works fine when I don't stream responses. I have also used the same model directly via other clients, such as OpenAI with streaming, and it works in those cases.
It's just not working via the langgraph / langchain clients.
Since OP is noting the issue with `astream_events`, I just wanted to say it's the same for me with `astream`.
I have tried this with other inference providers that offer OpenAI-compatible endpoints. I have this issue with Groq and also xAI. I think it's a problem with the LangGraph implementation.
Is this problem solved? I encountered a similar issue. When I use Claude, I can't get the tool-calling information!
https://github.com/langchain-ai/langgraph/discussions/4651
> It seems to be caused by the `astream_events` method. Using LangGraph with `astream` works: [...]
It doesn't work for me...
I ran the repro with head-of-tree langgraph and was not able to reproduce the issue.
The tool call seems to work properly. Please find the snippet below:
```json
{
  "type": "AIMessage",
  "content": "",
  "additional_kwargs": {
    "tool_calls": [
      {
        "index": 0,
        "id": "call_<id>",
        "function": {
          "arguments": "{\"query\":\"sf weather\"}",
          "name": "search"
        },
        "type": "function"
      }
    ]
  },
  "response_metadata": {
    "finish_reason": "tool_calls",
    "model_name": "llama3.1",
    "system_fingerprint": "fp_ollama"
  },
  "id": "run--<run_id>",
  "tool_calls": [
    {
      "name": "search",
      "args": {
        "query": "sf weather"
      },
      "id": "call_<id>",
      "type": "tool_call"
    }
  ]
},
{
  "type": "ToolMessage",
  "content": "It's 60 degrees and foggy.",
  "name": "search",
  "id": "<id>",
  "tool_call_id": "call_<id>"
}
```
I did have to make a couple of changes from the repro specified at the beginning of the issue:
- Use `InMemorySaver` instead of `MemorySaver`
- Give a dummy API key to `ChatOpenAI`

but those changes seem orthogonal to making tool calling work (sketched below).
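For reference, the adjusted setup looks roughly like this, reusing `tools` and `workflow` from the original repro (the dummy key value is arbitrary; Ollama ignores it):

```python
# Sketch of the two changes: InMemorySaver from langgraph.checkpoint.memory
# and a placeholder api_key so the OpenAI client doesn't complain.
from langchain_openai import ChatOpenAI
from langgraph.checkpoint.memory import InMemorySaver

model = ChatOpenAI(
    model="llama3.1",
    base_url="http://localhost:11434/v1",
    api_key="ollama",  # dummy value; any non-empty string works
    temperature=0,
).bind_tools(tools)

checkpointer = InMemorySaver()
app = workflow.compile(checkpointer=checkpointer)
```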
@StreetLamb: Are you still facing the issue? CC: @sydney-runkle