
[BUG] "(TypeError) expected string or buffer" when running a batch run a flow with langchain python node

Open loomlike opened this issue 10 months ago • 3 comments

Describe the bug
When using a langchain node in a flow, a batch run fails with the following error at the langchain node:

pf.run(
    flow=flow_path,
    data=data_path,  # data.jsonl path
)

# The flow fails with `(TypeError) expected string or buffer` at the langchain node.
# With 20 lines in data.jsonl, all 20/20 lines fail, each with the same
# (TypeError) expected string or buffer error.

If I run the flow with a single JSON input, it runs fine:

pf.test(flow=flow_path, inputs=one_line_of_data_jsonl)

# flow runs without error
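
To pull the per-line errors out of a failed batch run, here is a minimal sketch (assuming the promptflow SDK's PFClient; flow_path and data_path are the same placeholders as above):

from promptflow import PFClient

pf = PFClient()
run = pf.run(flow=flow_path, data=data_path)
# One row per input line, with the inputs and outputs of each line.
details = pf.get_details(run)
print(details.head())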

How To Reproduce the bug
Steps to reproduce the behavior, and how frequently the bug occurs:

  1. create a langchain node in a flow
  2. make data.jsonl to test
  3. batch run the flow (see the data sketch below)
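
A minimal data.jsonl sketch for step 2 (hypothetical content; it assumes the flow takes a question input, matching the node code shared later in this thread):

{"question": "How many users are in the database?"}
{"question": "Which products were ordered most recently?"}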

Expected behavior
A flow with a langchain node runs without errors.

Screenshots

 line_number: 1
        error:
          message: "Execution failure in 'langchain_node': (TypeError) expected string
            or buffer"
          messageFormat: Execution failure in '{node_name}'.
          messageParameters:
            node_name: langchain_node
          referenceCode: Tool/__pf_main__
          code: UserError
          innerError:
            code: ToolExecutionError
            innerError:
          additionalInfo:
          - type: ToolExecutionErrorDetails
            info:
              type: TypeError
              message: expected string or buffer
              traceback: "Traceback (most recent call last):\n  File \"/home/junkimin/miniconda3/envs/openai/lib/python3.10/site-packages/promptflow/tracing/_trace.py\"\
                , line 380, in wrapped\n    output = func(*args, **kwargs)\n  File
                \"/home/junkimin/rag-patterns/flows/relational_data/langchain_node.py\"\
                , line 71, in langchain_tool\n    res = agent_executor.invoke({\n\
                \  File \"/home/junkimin/miniconda3/envs/openai/lib/python3.10/site-packages/langchain/chains/base.py\"\
                , line 163, in invoke\n    raise e\n  File \"/home/junkimin/miniconda3/envs/openai/lib/python3.10/site-packages/langchain/chains/base.py\"\
                , line 153, in invoke\n    self._call(inputs, run_manager=run_manager)\n\
                \  File \"/home/junkimin/miniconda3/envs/openai/lib/python3.10/site-packages/langchain/agents/agent.py\"\
                , line 1432, in _call\n    next_step_output = self._take_next_step(\n\
                \  File \"/home/junkimin/miniconda3/envs/openai/lib/python3.10/site-packages/langchain/agents/agent.py\"\
                , line 1138, in _take_next_step\n    [\n  File \"/home/junkimin/miniconda3/envs/openai/lib/python3.10/site-packages/langchain/agents/agent.py\"\
                , line 1138, in <listcomp>\n    [\n  File \"/home/junkimin/miniconda3/envs/openai/lib/python3.10/site-packages/langchain/agents/agent.py\"\
                , line 1166, in _iter_next_step\n    output = self.agent.plan(\n \
                \ File \"/home/junkimin/miniconda3/envs/openai/lib/python3.10/site-packages/langchain/agents/agent.py\"\
                , line 397, in plan\n    for chunk in self.runnable.stream(inputs,
                config={\"callbacks\": callbacks}):\n  File \"/home/junkimin/miniconda3/envs/openai/lib/python3.10/site-packages/langchain_core/runnables/base.py\"\
                , line 2685, in stream\n    yield from self.transform(iter([input]),
                config, **kwargs)\n  File \"/home/junkimin/miniconda3/envs/openai/lib/python3.10/site-packages/langchain_core/runnables/base.py\"\
                , line 2672, in transform\n    yield from self._transform_stream_with_config(\n\
                \  File \"/home/junkimin/miniconda3/envs/openai/lib/python3.10/site-packages/langchain_core/runnables/base.py\"\
                , line 1743, in _transform_stream_with_config\n    chunk: Output =
                context.run(next, iterator)  # type: ignore\n  File \"/home/junkimin/miniconda3/envs/openai/lib/python3.10/site-packages/langchain_core/runnables/base.py\"\
                , line 2636, in _transform\n    for output in final_pipeline:\n  File
                \"/home/junkimin/miniconda3/envs/openai/lib/python3.10/site-packages/langchain_core/runnables/base.py\"\
                , line 1209, in transform\n    for chunk in input:\n  File \"/home/junkimin/miniconda3/envs/openai/lib/python3.10/site-packages/langchain_core/runnables/base.py\"\
                , line 4532, in transform\n    yield from self.bound.transform(\n\
                \  File \"/home/junkimin/miniconda3/envs/openai/lib/python3.10/site-packages/langchain_core/runnables/base.py\"\
                , line 1226, in transform\n    yield from self.stream(final, config,
                **kwargs)\n  File \"/home/junkimin/miniconda3/envs/openai/lib/python3.10/site-packages/langchain_core/language_models/chat_models.py\"\
                , line 239, in stream\n    raise e\n  File \"/home/junkimin/miniconda3/envs/openai/lib/python3.10/site-packages/langchain_core/language_models/chat_models.py\"\
                , line 222, in stream\n    for chunk in self._stream(\n  File \"/home/junkimin/miniconda3/envs/openai/lib/python3.10/site-packages/langchain_openai/chat_models/base.py\"\
                , line 395, in _stream\n    for chunk in self.client.create(messages=message_dicts,
                **params):\n  File \"/home/junkimin/miniconda3/envs/openai/lib/python3.10/site-packages/promptflow/tracing/contracts/generator_proxy.py\"\
                , line 27, in generate_from_proxy\n    yield from proxy\n  File \"\
                /home/junkimin/miniconda3/envs/openai/lib/python3.10/site-packages/promptflow/tracing/contracts/generator_proxy.py\"\
                , line 17, in __next__\n    item = next(self._generator)\n  File \"\
                /home/junkimin/miniconda3/envs/openai/lib/python3.10/site-packages/promptflow/tracing/_trace.py\"\
                , line 168, in traced_generator\n    token_collector.collect_openai_tokens_for_streaming(span,
                inputs, generator_output, parser.is_chat)\n  File \"/home/junkimin/miniconda3/envs/openai/lib/python3.10/site-packages/promptflow/tracing/_trace.py\"\
                , line 50, in collect_openai_tokens_for_streaming\n    tokens = calculator.get_openai_metrics_for_chat_api(inputs,
                output)\n  File \"/home/junkimin/miniconda3/envs/openai/lib/python3.10/site-packages/promptflow/tracing/_openai_utils.py\"\
                , line 102, in get_openai_metrics_for_chat_api\n    metrics[\"prompt_tokens\"\
                ] = self._get_prompt_tokens_from_messages(\n  File \"/home/junkimin/miniconda3/envs/openai/lib/python3.10/site-packages/promptflow/tracing/_openai_utils.py\"\
                , line 138, in _get_prompt_tokens_from_messages\n    prompt_tokens
                += len(enc.encode(value))\n  File \"/home/junkimin/miniconda3/envs/openai/lib/python3.10/site-packages/tiktoken/core.py\"\
                , line 116, in encode\n    if match := _special_token_regex(disallowed_special).search(text):\n
                TypeError: expected string or buffer\n"
              filename: 
                /home/junkimin/miniconda3/envs/openai/lib/python3.10/site-packages/tiktoken/core.py
              lineno: 116
              name: encode
          debugInfo:
            type: ToolExecutionError
            message: "Execution failure in 'langchain_node': (TypeError) expected
              string or buffer"
            stackTrace: "\nThe above exception was the direct cause of the following
              exception:\n\nTraceback (most recent call last):\n  File \"/home/junkimin/miniconda3/envs/openai/lib/python3.10/site-packages/promptflow/executor/flow_executor.py\"\
              , line 926, in _exec\n    output, aggregation_inputs = self._exec_inner_with_trace(\n\
              \  File \"/home/junkimin/miniconda3/envs/openai/lib/python3.10/site-packages/promptflow/executor/flow_executor.py\"\
              , line 836, in _exec_inner_with_trace\n    output, aggregation_inputs
              = self._exec_inner(\n  File \"/home/junkimin/miniconda3/envs/openai/lib/python3.10/site-packages/promptflow/executor/flow_executor.py\"\
              , line 865, in _exec_inner\n    output, nodes_outputs = self._traverse_nodes(inputs,
              context)\n  File \"/home/junkimin/miniconda3/envs/openai/lib/python3.10/site-packages/promptflow/executor/flow_executor.py\"\
              , line 1106, in _traverse_nodes\n    nodes_outputs, bypassed_nodes =
              self._submit_to_scheduler(context, inputs, batch_nodes)\n  File \"/home/junkimin/miniconda3/envs/openai/lib/python3.10/site-packages/promptflow/executor/flow_executor.py\"\
              , line 1143, in _submit_to_scheduler\n    return scheduler.execute(self._line_timeout_sec)\n\
              \  File \"/home/junkimin/miniconda3/envs/openai/lib/python3.10/site-packages/promptflow/executor/_flow_nodes_scheduler.py\"\
              , line 89, in execute\n    raise e\n  File \"/home/junkimin/miniconda3/envs/openai/lib/python3.10/site-packages/promptflow/executor/_flow_nodes_scheduler.py\"\
              , line 72, in execute\n    self._dag_manager.complete_nodes(self._collect_outputs(completed_futures))\n\
              \  File \"/home/junkimin/miniconda3/envs/openai/lib/python3.10/site-packages/promptflow/executor/_flow_nodes_scheduler.py\"\
              , line 113, in _collect_outputs\n    each_node_result = each_future.result()\n\
              \  File \"/home/junkimin/miniconda3/envs/openai/lib/python3.10/concurrent/futures/_base.py\"\
              , line 451, in result\n    return self.__get_result()\n  File \"/home/junkimin/miniconda3/envs/openai/lib/python3.10/concurrent/futures/_base.py\"\
              , line 403, in __get_result\n    raise self._exception\n  File \"/home/junkimin/miniconda3/envs/openai/lib/python3.10/concurrent/futures/thread.py\"\
              , line 58, in run\n    result = self.fn(*self.args, **self.kwargs)\n\
              \  File \"/home/junkimin/miniconda3/envs/openai/lib/python3.10/site-packages/promptflow/executor/_flow_nodes_scheduler.py\"\
              , line 134, in _exec_single_node_in_thread\n    result = context.invoke_tool(node,
              f, kwargs=kwargs)\n  File \"/home/junkimin/miniconda3/envs/openai/lib/python3.10/site-packages/promptflow/_core/flow_execution_context.py\"\
              , line 87, in invoke_tool\n    result = self._invoke_tool_inner(node,
              f, kwargs)\n  File \"/home/junkimin/miniconda3/envs/openai/lib/python3.10/site-packages/promptflow/_core/flow_execution_context.py\"\
              , line 202, in _invoke_tool_inner\n    raise ToolExecutionError(node_name=node_name,
              module=module) from e\n"
            innerException:
              type: TypeError
              message: expected string or buffer
              stackTrace: "Traceback (most recent call last):\n  File \"/home/junkimin/miniconda3/envs/openai/lib/python3.10/site-packages/promptflow/_core/flow_execution_context.py\"\
                , line 178, in _invoke_tool_inner\n    return f(**kwargs)\n  File
                \"/home/junkimin/miniconda3/envs/openai/lib/python3.10/site-packages/promptflow/tracing/_trace.py\"\
                , line 380, in wrapped\n    output = func(*args, **kwargs)\n  File
                \"/home/junkimin/rag-patterns/flows/relational_data/langchain_node.py\"\
                , line 71, in langchain_tool\n    res = agent_executor.invoke({\n\
                \  File \"/home/junkimin/miniconda3/envs/openai/lib/python3.10/site-packages/langchain/chains/base.py\"\
                , line 163, in invoke\n    raise e\n  File \"/home/junkimin/miniconda3/envs/openai/lib/python3.10/site-packages/langchain/chains/base.py\"\
                , line 153, in invoke\n    self._call(inputs, run_manager=run_manager)\n\
                \  File \"/home/junkimin/miniconda3/envs/openai/lib/python3.10/site-packages/langchain/agents/agent.py\"\
                , line 1432, in _call\n    next_step_output = self._take_next_step(\n\
                \  File \"/home/junkimin/miniconda3/envs/openai/lib/python3.10/site-packages/langchain/agents/agent.py\"\
                , line 1138, in _take_next_step\n    [\n  File \"/home/junkimin/miniconda3/envs/openai/lib/python3.10/site-packages/langchain/agents/agent.py\"\
                , line 1138, in <listcomp>\n    [\n  File \"/home/junkimin/miniconda3/envs/openai/lib/python3.10/site-packages/langchain/agents/agent.py\"\
                , line 1166, in _iter_next_step\n    output = self.agent.plan(\n \
                \ File \"/home/junkimin/miniconda3/envs/openai/lib/python3.10/site-packages/langchain/agents/agent.py\"\
                , line 397, in plan\n    for chunk in self.runnable.stream(inputs,
                config={\"callbacks\": callbacks}):\n  File \"/home/junkimin/miniconda3/envs/openai/lib/python3.10/site-packages/langchain_core/runnables/base.py\"\
                , line 2685, in stream\n    yield from self.transform(iter([input]),
                config, **kwargs)\n  File \"/home/junkimin/miniconda3/envs/openai/lib/python3.10/site-packages/langchain_core/runnables/base.py\"\
                , line 2672, in transform\n    yield from self._transform_stream_with_config(\n\
                \  File \"/home/junkimin/miniconda3/envs/openai/lib/python3.10/site-packages/langchain_core/runnables/base.py\"\
                , line 1743, in _transform_stream_with_config\n    chunk: Output =
                context.run(next, iterator)  # type: ignore\n  File \"/home/junkimin/miniconda3/envs/openai/lib/python3.10/site-packages/langchain_core/runnables/base.py\"\
                , line 2636, in _transform\n    for output in final_pipeline:\n  File
                \"/home/junkimin/miniconda3/envs/openai/lib/python3.10/site-packages/langchain_core/runnables/base.py\"\
                , line 1209, in transform\n    for chunk in input:\n  File \"/home/junkimin/miniconda3/envs/openai/lib/python3.10/site-packages/langchain_core/runnables/base.py\"\
                , line 4532, in transform\n    yield from self.bound.transform(\n\
                \  File \"/home/junkimin/miniconda3/envs/openai/lib/python3.10/site-packages/langchain_core/runnables/base.py\"\
                , line 1226, in transform\n    yield from self.stream(final, config,
                **kwargs)\n  File \"/home/junkimin/miniconda3/envs/openai/lib/python3.10/site-packages/langchain_core/language_models/chat_models.py\"\
                , line 239, in stream\n    raise e\n  File \"/home/junkimin/miniconda3/envs/openai/lib/python3.10/site-packages/langchain_core/language_models/chat_models.py\"\
                , line 222, in stream\n    for chunk in self._stream(\n  File \"/home/junkimin/miniconda3/envs/openai/lib/python3.10/site-packages/langchain_openai/chat_models/base.py\"\
                , line 395, in _stream\n    for chunk in self.client.create(messages=message_dicts,
                **params):\n  File \"/home/junkimin/miniconda3/envs/openai/lib/python3.10/site-packages/promptflow/tracing/contracts/generator_proxy.py\"\
                , line 27, in generate_from_proxy\n    yield from proxy\n  File \"\
                /home/junkimin/miniconda3/envs/openai/lib/python3.10/site-packages/promptflow/tracing/contracts/generator_proxy.py\"\
                , line 17, in __next__\n    item = next(self._generator)\n  File \"\
                /home/junkimin/miniconda3/envs/openai/lib/python3.10/site-packages/promptflow/tracing/_trace.py\"\
                , line 168, in traced_generator\n    token_collector.collect_openai_tokens_for_streaming(span,
                inputs, generator_output, parser.is_chat)\n  File \"/home/junkimin/miniconda3/envs/openai/lib/python3.10/site-packages/promptflow/tracing/_trace.py\"\
                , line 50, in collect_openai_tokens_for_streaming\n    tokens = calculator.get_openai_metrics_for_chat_api(inputs,
                output)\n  File \"/home/junkimin/miniconda3/envs/openai/lib/python3.10/site-packages/promptflow/tracing/_openai_utils.py\"\
                , line 102, in get_openai_metrics_for_chat_api\n    metrics[\"prompt_tokens\"\
                ] = self._get_prompt_tokens_from_messages(\n  File \"/home/junkimin/miniconda3/envs/openai/lib/python3.10/site-packages/promptflow/tracing/_openai_utils.py\"\
                , line 138, in _get_prompt_tokens_from_messages\n    prompt_tokens
                += len(enc.encode(value))\n  File \"/home/junkimin/miniconda3/envs/openai/lib/python3.10/site-packages/tiktoken/core.py\"\
                , line 116, in encode\n    if match := _special_token_regex(disallowed_special).search(text):\n"
              innerException:
  debugInfo:
    type: BulkRunException
    message: "Failed to run 2/2 lines. First error message is: Execution failure in
      'langchain_node': (TypeError) expected string or buffer"
    stackTrace: "Traceback (most recent call last):\n"
    innerException:

Running Information (please complete the following information):

  • Promptflow Package Version using pf -v: 1.7.0
  • Langchain version: 0.1.13
  • Operating System: Ubuntu
  • Python Version using python --version: 3.10

loomlike · Apr 06 '24

FYI, I dug into the error and found where the issue is coming from: OpenAIMetricsCalculator._get_prompt_tokens_from_messages in promptflow/tracing/_openai_utils.py, specifically at prompt_tokens += len(enc.encode(value)). If I comment this line out, the batch run works.
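
For what it's worth, tiktoken's encode() requires a str, and with function-calling agents a message's content can be None (or a list), which would explain the TypeError. Below is a standalone sketch of the kind of guard that avoids the crash; this is not the actual promptflow method, just an illustration:

import tiktoken

def count_prompt_tokens(messages, model="gpt-4"):
    """Sum token counts over message fields, skipping non-string values."""
    enc = tiktoken.encoding_for_model(model)
    prompt_tokens = 0
    for message in messages:
        for value in message.values():
            # Function-call messages can carry content=None; passing that to
            # enc.encode raises "TypeError: expected string or buffer".
            if isinstance(value, str):
                prompt_tokens += len(enc.encode(value))
    return prompt_tokens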

I'm using langchain's Agent API; here is my Python node code for you to debug.

from typing import Dict

from langchain.agents import AgentExecutor, create_openai_functions_agent
from langchain_benchmarks import registry
from langchain_core.prompts import ChatPromptTemplate, SystemMessagePromptTemplate, HumanMessagePromptTemplate, MessagesPlaceholder, PromptTemplate
from langchain_openai import AzureChatOpenAI

from promptflow.core import tool
from promptflow.connections import AzureOpenAIConnection

@tool
def langchain_tool(
    question: str,
    prompt_template: str,
    chat_model: str,
    conn: AzureOpenAIConnection,
) -> Dict:
    """
    A tool that uses the language model to answer a question based on a prompt template
    
    Args:
        question: User question
        prompt_template: Template for the prompt
        chat_model: Deployment name to use
        conn: Azure OpenAI connection object
    """
    prompt = ChatPromptTemplate.from_messages([
        SystemMessagePromptTemplate(prompt=PromptTemplate(input_variables=[], template=prompt_template)),
        HumanMessagePromptTemplate(prompt=PromptTemplate(input_variables=['question'], template='{question}')),
        MessagesPlaceholder(variable_name='agent_scratchpad')
    ])

    llm = AzureChatOpenAI(
        model=chat_model,
        temperature=0,
        azure_endpoint=conn.api_base,
        api_version=conn.api_version,
        api_key=conn.api_key,
    )

    task = registry["Tool Usage - Relational Data"]
    env = task.create_environment()
    tools = env.tools

    agent = create_openai_functions_agent(llm=llm, tools=tools, prompt=prompt)
    agent_executor = AgentExecutor(
        agent=agent,
        tools=tools,
        verbose=True,
        handle_parsing_errors=True,
        return_intermediate_steps=True,
    )

    res = agent_executor.invoke({
        "question": question,
    })
    # We only want to know the step (tool) names
    res['intermediate_steps'] = [agent_action[0].tool for agent_action in res['intermediate_steps']]

    return res
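
As a possible interim workaround: the traceback shows the crash on the streaming path (collect_openai_tokens_for_streaming), since the agent streams from the model while the tracer counts tokens on each chunk. If your langchain version supports it, wrapping the agent in a RunnableAgent with stream_runnable=False makes plan() call invoke() instead of stream(), which should bypass that path entirely. A hedged sketch (verify that stream_runnable exists in your installed langchain before relying on this):

from langchain.agents.agent import RunnableAgent

agent = RunnableAgent(
    runnable=create_openai_functions_agent(llm=llm, tools=tools, prompt=prompt),
    stream_runnable=False,  # plan() falls back to invoke() instead of stream()
)
agent_executor = AgentExecutor(agent=agent, tools=tools, verbose=True)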

loomlike · Apr 09 '24

@loomlike Thank you for the information. This is a bug in that scenario; we will fix it.

thy09 · Apr 10 '24

Hi, we're sending this friendly reminder because we haven't heard back from you in 30 days. We need more information about this issue to help address it. Please be sure to give us your input. If we don't hear back from you within 7 days of this comment, the issue will be automatically closed. Thank you!

github-actions[bot] · May 10 '24