langchain
Asking for user input as tool for agents
Hi, love langchain as it's really boosted getting my llm project up to speed. I have one question though:
tldr: Is there a way to enable an agent to ask the user for input as an intermediate step? For example, by including in the list of tools one that is "useful for asking for missing information", with the important difference that the user, not an LLM, acts as the oracle. The agent could then use the user's input to decide on next steps.
For context: I am trying to use langchain to design an application that supports users in solving a complex problem. To do so, I need to ask the user a number of simple questions relatively early in the process, since the input determines the UX flow, the prompts, etc. I could of course hardcode these questions, but from my understanding, ideally I'd write an agent that I task with identifying the relevant pieces of information and then have the agent make the right choices based on the input.
Trivial to implement (even without the help of GPT;)
Thanks for the reply @pannous, but I don't see how to do it. Any hints?
Sorry, I agree with you. After investigating the agent folder: it is pretty unstructured spaghetti code … no offense to the creators; it probably just grew organically and wasn't refactored for some time. The documentation is not very helpful on how to simply implement and register new agent classes, probably because there is no clean mechanism for that yet. But there are open tickets for registering agents via annotation.
I've managed to do this after a lot of trial and error, and here is my solution:
First, I override ConversationalChatAgent:
import re
from typing import List, Tuple

from langchain.agents import ConversationalChatAgent
from langchain.agents.conversational_chat.prompt import TEMPLATE_TOOL_RESPONSE
from langchain.schema import AgentAction, AIMessage, BaseMessage, HumanMessage

CONTEXT_PATTERN = re.compile(r"^CONTEXT:")


class ConversationalChatAgentContext(ConversationalChatAgent):
    """An agent designed to hold a conversation in addition to using tools.

    This agent can ask the user for context. To ask for context, a tool has to
    return the prefix 'CONTEXT:' followed by the context question.
    """

    @property
    def _agent_type(self) -> str:
        raise NotImplementedError

    def _construct_scratchpad(
        self, intermediate_steps: List[Tuple[AgentAction, str]]
    ) -> List[BaseMessage]:
        """Construct the scratchpad that lets the agent continue its thought process."""
        thoughts: List[BaseMessage] = []
        for action, observation in intermediate_steps:
            thoughts.append(AIMessage(content=action.log))
            if re.match(CONTEXT_PATTERN, observation):
                # Strip the 'CONTEXT:' prefix from the observation so the
                # question goes to the model directly instead of being wrapped
                # in the TEMPLATE_TOOL_RESPONSE format.
                human_message = HumanMessage(
                    content=re.sub(CONTEXT_PATTERN, "", observation)
                )
            else:
                human_message = HumanMessage(
                    content=TEMPLATE_TOOL_RESPONSE.format(observation=observation)
                )
            thoughts.append(human_message)
        return thoughts
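Since the only moving part here is the routing on the 'CONTEXT:' prefix, it can be exercised in isolation. Below is a minimal sketch of that logic as a plain function; TEMPLATE_TOOL_RESPONSE is a stand-in string I made up for illustration, not langchain's actual template:

```python
import re

CONTEXT_PATTERN = re.compile(r"^CONTEXT:")
# Stand-in for langchain's TEMPLATE_TOOL_RESPONSE; the real template is longer.
TEMPLATE_TOOL_RESPONSE = "TOOL RESPONSE: {observation}"


def render_observation(observation: str) -> str:
    """Route a tool observation the way _construct_scratchpad does."""
    if CONTEXT_PATTERN.match(observation):
        # A context request: strip the prefix and pass the question through.
        return CONTEXT_PATTERN.sub("", observation)
    # A normal observation: wrap it in the tool-response template.
    return TEMPLATE_TOOL_RESPONSE.format(observation=observation)
```

Observations starting with "CONTEXT:" reach the model as a bare question; everything else stays wrapped in the template.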
Second, I build a multi-input tool (https://langchain.readthedocs.io/en/latest/modules/agents/examples/multi_input_tool.html). In the tool description, I ask GPT to enter a specific token when it is missing context, i.e. "If you don't know the product name, you must enter 0."
In the tool's _run(), I return the CONTEXT prompt if it receives the specific token:
def _run(self, tool_input: str) -> str:
    """Use the tool."""
    CONTEXT_PROMPT = (
        "CONTEXT:You must ask the human about {context}. Reply with schema #2."
    )
    if isinstance(tool_input, str):
        product_tag, firmware_version, query = tool_input.split(",")
        if product_tag == "0":
            return CONTEXT_PROMPT.format(context="the product name")
        if firmware_version == "0":
            return CONTEXT_PROMPT.format(context="the firmware version")
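The sentinel only works if the tool description spells it out. A hypothetical description following the comma-separated input convention (the wording is my own, not from the original setup), together with the matching sentinel check as a standalone function:

```python
from typing import Optional

# Hypothetical tool description; this text is what steers the model toward
# emitting the '0' sentinel for fields it does not know.
TOOL_DESCRIPTION = (
    "Searches the product knowledge base. "
    "Input must be 'product_tag,firmware_version,query'. "
    "If you don't know the product name, you must enter 0. "
    "If you don't know the firmware version, you must enter 0."
)

CONTEXT_PROMPT = "CONTEXT:You must ask the human about {context}. Reply with schema #2."


def missing_context(tool_input: str) -> Optional[str]:
    """Return a CONTEXT prompt if any field carries the '0' sentinel, else None."""
    product_tag, firmware_version, _query = tool_input.split(",")
    if product_tag == "0":
        return CONTEXT_PROMPT.format(context="the product name")
    if firmware_version == "0":
        return CONTEXT_PROMPT.format(context="the firmware version")
    return None
```

Fields are checked in order, so a missing product name is asked about before a missing firmware version.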
Finally, run the AgentExecutor with ConversationalChatAgentContext exactly as you would with ConversationalChatAgent:
chat_llm = ChatOpenAI(
    client=None, model_kwargs={"temperature": 0}, model_name="gpt-3.5-turbo"
)
memory = ConversationBufferMemory(memory_key="chat_history", return_messages=True)
qa_agent = ConversationalChatAgentContext.from_llm_and_tools(
    chat_llm, tools, memory=memory, system_message=SYSTEM_PROMPT
)
agent = AgentExecutor.from_agent_and_tools(
    agent=qa_agent, tools=tools, verbose=verbose, memory=memory
)
agent.run(input)
Basically, this changes the tool response prompt whenever a tool returns with the "CONTEXT:" prefix.
EDIT 01/24/2024: the multi-input tool doesn't exist anymore, but the idea still works today: force the tool to return a specific prompt to ask for more context.
Hi @cprevot93 , thank you so much for the above info.
When I try instantiating my agent, I get the following error message:
ValueError: ConversationalChatAgentContext does not support multi-input tool <My Tool Name Here>
Could you please give me some pointers on how you overcame this issue?
This functionality has since been added (see above).
The link to the multi_input_tool does not work anymore. Can you please update it?