
Asking for user input as tool for agents

Open pboes opened this issue 1 year ago • 4 comments

Hi, I love LangChain, as it's really boosted getting my LLM project up to speed. I have one question though:

tl;dr: Is there a way of enabling an agent to ask a user for input as an intermediate step? Something like including in the list of tools one that is "useful for asking for missing information", with the important difference that the user, not an LLM, should act as the oracle. The agent could then use the user's input to decide next steps.

For context: I am trying to use LangChain to design an application that supports users with solving a complex problem. To do so, I need to ask the user a number of simple questions relatively early in the process, since the UX flow, the prompts, etc. will change depending on the input. Now, I can of course hardcode these questions, but from my understanding, ideally I'd write an agent that I task with identifying the relevant pieces of information and then have the agent make the right choices based on the input.
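
One minimal way to sketch the idea (purely illustrative; `make_human_tool` and the returned dict shape are my own names, not a LangChain API) is a tool whose function simply prompts the user via Python's built-in `input()`, so the human, not an LLM, acts as the oracle:

```python
def make_human_tool(ask=input):
    """Build a tool spec in the (name, description, func) shape a LangChain
    Tool constructor takes; `ask` defaults to the built-in input() so a real
    person answers, but it can be stubbed for non-interactive runs."""
    def run(question: str) -> str:
        # The agent's question goes to the user; their reply becomes the
        # tool observation the agent reasons over in its next step.
        return ask(f"[agent asks] {question} ")
    return {
        "name": "human",
        "description": "Useful for asking the user for missing information.",
        "func": run,
    }

# Stub the prompt function so the example runs without a terminal.
tool = make_human_tool(ask=lambda q: "blue")
print(tool["func"]("What colour is the widget?"))  # prints "blue"
```

The agent then sees the user's answer exactly as it would see any other tool observation.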

pboes avatar Mar 04 '23 17:03 pboes

Trivial to implement (even without the help of GPT;)

pannous avatar Mar 07 '23 10:03 pannous

Thanks for the reply @pannous, but I don't see how to do it. Any hints?

pboes avatar Mar 07 '23 10:03 pboes

Sorry, I agree with you. After investigating the agent folder: it is pretty unstructured spaghetti code … no offense to the creators; it probably just grew organically and wasn't refactored for some time. The documentation is not very helpful on how to simply implement and register new agent classes, probably because there is no such clean mechanism yet. But there are open tickets for registering agents via annotation.

pannous avatar Mar 07 '23 10:03 pannous

I've managed to do this after a lot of trial and error, and here is my solution. First, I override the ConversationalChatAgent:

import re
from typing import List, Tuple

from langchain.agents.conversational_chat.base import ConversationalChatAgent
from langchain.agents.conversational_chat.prompt import TEMPLATE_TOOL_RESPONSE
from langchain.schema import AgentAction, AIMessage, BaseMessage, HumanMessage

CONTEXT_PATTERN = re.compile(r"^CONTEXT:")

class ConversationalChatAgentContext(ConversationalChatAgent):
    """
    An agent designed to hold a conversation in addition to using tools.
    This agent can ask for context from the user. To ask for context, tools have to return a prefix 'CONTEXT:' followed by the context question.
    """
    @property
    def _agent_type(self) -> str:
        raise NotImplementedError

    def _construct_scratchpad(
        self, intermediate_steps: List[Tuple[AgentAction, str]]
    ) -> List[BaseMessage]:
        """Construct the scratchpad that lets the agent continue its thought process."""
        thoughts: List[BaseMessage] = []
        for action, observation in intermediate_steps:
            thoughts.append(AIMessage(content=action.log))
            if re.match(CONTEXT_PATTERN, observation):
                # remove the context_prefix from the observation
                # This is required to avoid the 'TEMPLATE_TOOL_RESPONSE' format response.
                human_message = HumanMessage(
                    content=re.sub(CONTEXT_PATTERN, "", observation)
                )
            else:
                human_message = HumanMessage(
                    content=TEMPLATE_TOOL_RESPONSE.format(observation=observation)
                )
            thoughts.append(human_message)
        return thoughts
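
The branch in `_construct_scratchpad` can be exercised in isolation. Here is a standalone sketch of the same routing, with a plain stand-in string in place of LangChain's `TEMPLATE_TOOL_RESPONSE` prompt constant:

```python
import re

CONTEXT_PATTERN = re.compile(r"^CONTEXT:")

def route_observation(observation: str) -> str:
    # Mirrors the branch in _construct_scratchpad: a "CONTEXT:" prefix means
    # the observation is a question for the human, so the prefix is stripped
    # and the question is surfaced verbatim; anything else gets the normal
    # tool-response wrapper.
    if CONTEXT_PATTERN.match(observation):
        return CONTEXT_PATTERN.sub("", observation)
    return "TOOL RESPONSE: " + observation  # stand-in for TEMPLATE_TOOL_RESPONSE

print(route_observation("CONTEXT:What is the product name?"))
# prints "What is the product name?"
```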

Second, I build a multi-input tool (https://langchain.readthedocs.io/en/latest/modules/agents/examples/multi_input_tool.html). In the tool description, I ask the model to enter a specific token if GPT doesn't have the context, e.g. "If you don't know the product name, you must enter 0.". In _run() of the tool, I return the CONTEXT prompt if it receives that specific token.

    def _run(self, tool_input: str) -> str:
        """Use the tool."""
        CONTEXT_PROMPT = "CONTEXT:You must ask the human about {context}. Reply with schema #2."
        # tool_input arrives as "product_tag,firmware_version,query"
        product_tag, firmware_version, query = tool_input.split(",")

        if product_tag == "0":
            return CONTEXT_PROMPT.format(context="the product name")
        if firmware_version == "0":
            return CONTEXT_PROMPT.format(context="the firmware version")
        # ... otherwise run the actual tool logic here
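
As a self-contained illustration of the sentinel convention (same prompt and "0" token as above, with a hypothetical fall-through return standing in for the real lookup):

```python
CONTEXT_PROMPT = "CONTEXT:You must ask the human about {context}. Reply with schema #2."

def run_tool(tool_input: str) -> str:
    # The comma-separated input mirrors the multi-input tool format;
    # "0" is the agreed token for "I don't know this field yet".
    product_tag, firmware_version, query = tool_input.split(",")
    if product_tag == "0":
        return CONTEXT_PROMPT.format(context="the product name")
    if firmware_version == "0":
        return CONTEXT_PROMPT.format(context="the firmware version")
    # Hypothetical happy path: run the real lookup here.
    return f"searching docs for {product_tag} v{firmware_version}: {query}"

print(run_tool("0,2.1,how do I reset it?"))
# prints "CONTEXT:You must ask the human about the product name. Reply with schema #2."
```

Any "CONTEXT:"-prefixed return then takes the context branch in the scratchpad code above, so the agent surfaces the question to the user instead of answering it itself.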

Finally, run the AgentExecutor with ConversationalChatAgentContext exactly as you would with ConversationalChatAgent:

    chat_llm = ChatOpenAI(temperature=0, model_name="gpt-3.5-turbo")
    memory = ConversationBufferMemory(memory_key="chat_history", return_messages=True)
    qa_agent = ConversationalChatAgentContext.from_llm_and_tools(
        chat_llm, tools, memory=memory, system_message=SYSTEM_PROMPT)
    agent = AgentExecutor.from_agent_and_tools(agent=qa_agent, tools=tools, verbose=verbose, memory=memory)
    agent.run(input)

Basically, this changes the tool response prompt whenever the tool's output starts with the "CONTEXT:" prefix.

EDIT 01/24/2024: the multi-input tool doesn't exist anymore, but the idea still works today: force the tool to return a specific prompt that asks for more context.

cprevot93 avatar Mar 21 '23 16:03 cprevot93

Hi @cprevot93 , thank you so much for the above info.

When I try instantiating my agent, I get the following error message: ValueError: ConversationalChatAgentContext does not support multi-input tool <My Tool Name Here>

Could you please give me some pointers on how you overcame this issue?

ksshetty avatar May 11 '23 15:05 ksshetty

This functionality has in the meantime been added (see above)

pboes avatar May 11 '23 15:05 pboes

The link to the multi_input_tool example does not work anymore. Can you please update it?

wgzmaxik6 avatar Jul 21 '23 18:07 wgzmaxik6