promptflow
[BUG] Running Prompt flow locally produces errors
Describe the bug
Running a flow using pf test seems to work but exceptions are reported locally.
How To Reproduce the bug
Run pf test on a flow locally. The flow executes successfully but exceptions are generated when collecting token metrics for openai.
Expected behavior
A clean run without exceptions.
Screenshots
WARNING:opentelemetry.attributes:Invalid type NoneType for attribute '__computed__.cumulative_token_count.completion' value. Expected one of ['bool', 'str', 'bytes', 'int', 'float'] or a sequence of those types
WARNING:opentelemetry.attributes:Invalid type NoneType for attribute 'llm.usage.completion_tokens_details' value. Expected one of ['bool', 'str', 'bytes', 'int', 'float'] or a sequence of those types
WARNING:opentelemetry.attributes:Invalid type NoneType for attribute '__computed__.cumulative_token_count.completion' value. Expected one of ['bool', 'str', 'bytes', 'int', 'float'] or a sequence of those types
Running Information (please complete the following information):
{
  "promptflow": "1.15.0",
  "promptflow-azure": "1.15.0",
  "promptflow-core": "1.15.0",
  "promptflow-devkit": "1.15.0",
  "promptflow-tracing": "1.15.0"
}
Executable 'c:\git\azure-ai-prompt-flow.venv\Scripts\python.exe'
Python (Windows) 3.10.5 (tags/v3.10.5:f377153, Jun 6 2022, 16:14:13) [MSC v.1929 64 bit (AMD64)]
Additional context
It seems the exception originates here:
def collect_openai_tokens_for_parent_span(self, span):
    tokens = self.try_get_openai_tokens(span.get_span_context().span_id)
    if tokens:
        if not hasattr(span, "parent") or span.parent is None:
            return
        parent_span_id = span.parent.span_id
        with self._lock:
            if parent_span_id in self._span_id_to_tokens:
                merged_tokens = {
                    key: self._span_id_to_tokens[parent_span_id].get(key, 0) + tokens.get(key, 0)
                    for key in set(self._span_id_to_tokens[parent_span_id]) | set(tokens)
                }
                self._span_id_to_tokens[parent_span_id] = merged_tokens
            else:
                self._span_id_to_tokens[parent_span_id] = tokens
The failure is on the line key: self._span_id_to_tokens[parent_span_id].get(key, 0) + tokens.get(key, 0). Not all steps in my flow are LLM steps, and I wonder whether steps that don't produce any tokens could be causing the issue.
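For illustration, here is a minimal standalone sketch (not promptflow code, just the same dict-comprehension merge) showing how a None value in the usage payload breaks the summation:

# Standalone repro of the merge in collect_openai_tokens_for_parent_span:
# summing per-key counters fails as soon as one value is None (or a nested dict).
parent_tokens = {"completion_tokens": 10, "prompt_tokens": 100, "total_tokens": 110}
child_tokens = {
    "completion_tokens": 13,
    "prompt_tokens": 673,
    "total_tokens": 686,
    "completion_tokens_details": None,  # optional field returned by newer openai releases
}
merged = {
    key: parent_tokens.get(key, 0) + child_tokens.get(key, 0)
    for key in set(parent_tokens) | set(child_tokens)
}
# TypeError: unsupported operand type(s) for +: 'int' and 'NoneType'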
We see something similar; however, in our case it is an error that fails the execution:
File "/opt/hostedtoolcache/Python/3.11.9/x64/lib/python3.11/site-packages/promptflow/tracing/_trace.py", line 144, in <dictcomp>
key: self._span_id_to_tokens[parent_span_id].get(key, 0) + tokens.get(key, 0)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~
TypeError: unsupported operand type(s) for +: 'NoneType' and 'NoneType'
Seeing similar errors on my end; this is halting flow execution:
2024-09-13 14:04:35 -0400 46456 execution.flow WARNING Failed to calculate metrics due to exception: unsupported operand type(s) for +: 'int' and 'dict'.
Same issue, resulting in flow being terminated:
WARNING:opentelemetry.attributes:Invalid type dict for attribute '__computed__.cumulative_token_count.completion' value. Expected one of ['bool', 'str', 'bytes', 'int', 'float'] or a sequence of those types
2024-09-13 16:53:20 -0400 28812 execution.flow WARNING Failed to calculate metrics due to exception: unsupported operand type(s) for +: 'int' and 'dict'.
2024-09-13 16:53:20 -0400 28812 execution.flow ERROR Flow execution has failed. Cancelling all running nodes: extract_data.
Some keys in the tokens have values that are not int, causing the issue.
azure_open_ai: tokens = {'completion_tokens': 13, 'prompt_tokens': 673, 'total_tokens': 686, 'completion_tokens_details': None}
open_ai: {'completion_tokens': 13, 'prompt_tokens': 669, 'total_tokens': 682, 'completion_tokens_details': {'reasoning_tokens': 0}}
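As a rough sketch of a possible normalization (not promptflow code; sanitize_usage is a made-up helper name), the None-valued and nested entries could be dropped before the counters are merged:

def sanitize_usage(tokens: dict) -> dict:
    # Keep only integer-valued counters; drop None values and nested dicts
    # such as completion_tokens_details / prompt_tokens_details.
    return {k: v for k, v in tokens.items() if isinstance(v, int)}

azure_usage = {"completion_tokens": 13, "prompt_tokens": 673, "total_tokens": 686,
               "completion_tokens_details": None}
openai_usage = {"completion_tokens": 13, "prompt_tokens": 669, "total_tokens": 682,
                "completion_tokens_details": {"reasoning_tokens": 0}}

print(sanitize_usage(azure_usage))   # {'completion_tokens': 13, 'prompt_tokens': 673, 'total_tokens': 686}
print(sanitize_usage(openai_usage))  # {'completion_tokens': 13, 'prompt_tokens': 669, 'total_tokens': 682}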
As a temporary solution, tracing can be disabled by setting PF_DISABLE_TRACING=true.
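For a local run, a minimal sketch of applying that workaround (assuming the variable is set before the flow executes) looks like this:

import os

# Disable promptflow tracing for this process; environment variables are
# strings, so the expected value is "true", not a Python bool.
os.environ["PF_DISABLE_TRACING"] = "true"

The same variable can also be exported in the shell before running the flow with pf test.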
Rolling back the version of openai also works: openai<=1.44.1 resolves the error. It looks like promptflow-tracing is incompatible with 1.45.0: https://pypi.org/project/openai/#history
@asos-oliverfrost thank you for finding that! Rolling back worked for me. It looks like this is the specific line that breaks the behavior in prompt flow, because the new field completion_tokens_details is optional: https://github.com/openai/openai-python/compare/v1.44.1...v1.45.0#diff-d85f41ac9f419751206af46c34ef5c8c74258660be492aa703dcbebcfc96a41bR25
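For context, a hedged illustration of that change (not a quote of the openai source): the usage object is a Pydantic model, so dumping it now includes the new optional field even when the server did not populate it:

from openai.types import CompletionUsage

usage = CompletionUsage(completion_tokens=13, prompt_tokens=673, total_tokens=686)
# With openai>=1.45.0 the dump is expected to contain
# 'completion_tokens_details': None, which promptflow-tracing then
# tries to add to an int when aggregating per-span token counts.
print(usage.dict())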
Rolling back OpenAI did not work for me; Pydantic is dynamic enough that I still get completion_tokens_details={'reasoning_tokens': 0}. In summary:
- An OpenAI endpoint returns a dictionary nested inside another dictionary: {"completion_tokens_details": {"foo_bar": 0}}.
- That response is then processed by promptflow.tracing, which does not support nested values.
Also having this issue, with similar errors:
2024-09-19 18:05:38 -0700 89862 execution ERROR Node extract_result in line 0 failed. Exception: Execution failure in 'extract_result': (TypeError) unsupported operand type(s) for +: 'dict' and 'dict'.
Traceback (most recent call last):
File "/[...]/pypoetry/virtualenvs/promptflow-test-gto235yr-py3.11/lib/python3.11/site-packages/promptflow/_core/flow_execution_context.py", line 182, in _invoke_tool_inner
return f(**kwargs)
^^^^^^^^^^^
File "/[...]/pypoetry/virtualenvs/promptflow-test-gto235yr-py3.11/lib/python3.11/site-packages/promptflow/tracing/_trace.py", line 561, in wrapped
token_collector.collect_openai_tokens_for_parent_span(span)
File "/[...]/pypoetry/virtualenvs/promptflow-test-gto235yr-py3.11/lib/python3.11/site-packages/promptflow/tracing/_trace.py", line 143, in collect_openai_tokens_for_parent_span
merged_tokens = {
^
File "/[...]/pypoetry/virtualenvs/promptflow-test-gto235yr-py3.11/lib/python3.11/site-packages/promptflow/tracing/_trace.py", line 144, in <dictcomp>
key: self._span_id_to_tokens[parent_span_id].get(key, 0) + tokens.get(key, 0)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~
TypeError: unsupported operand type(s) for +: 'dict' and 'dict'
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/[...]/pypoetry/virtualenvs/promptflow-test-gto235yr-py3.11/lib/python3.11/site-packages/promptflow/_core/flow_execution_context.py", line 90, in invoke_tool
result = self._invoke_tool_inner(node, f, kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/[...]/pypoetry/virtualenvs/promptflow-test-gto235yr-py3.11/lib/python3.11/site-packages/promptflow/_core/flow_execution_context.py", line 206, in _invoke_tool_inner
raise ToolExecutionError(node_name=node_name, module=module) from e
promptflow._core._errors.ToolExecutionError: Execution failure in 'extract_result': (TypeError) unsupported operand type(s) for +: 'dict' and 'dict'
2024-09-19 18:05:38 -0700 89862 execution.flow WARNING Failed to calculate metrics due to exception: unsupported operand type(s) for +: 'int' and 'dict'.
2024-09-19 18:05:38 -0700 89862 execution.flow ERROR Flow execution has failed. Cancelling all running nodes: extract_result.
pf.flow.test failed with UserErrorException: TypeError: Execution failure in 'extract_result': (TypeError) unsupported operand type(s) for +: 'dict' and 'dict'
This happens when running a version of the chat-math-variant example, edited so that extract_text.py calls the OpenAI ChatCompletions endpoint. One of my coworkers is also seeing this error, seemingly from an LLM tool call.
Rolling back the version of openai also works,
openai<=1.44.1 resolves the error. Looks like promptflow-tracing could be incompatible with 1.45.0 https://pypi.org/project/openai/#history
This worked for me
Oh, @mallapraveen @jomalsan, you saved me. Thank you millions.
Traceback (most recent call last):
File "/azureml-envs/prompt-flow/runtime/lib/python3.9/site-packages/promptflow/_core/flow_execution_context.py", line 182, in _invoke_tool_inner
return f(**kwargs)
File "/azureml-envs/prompt-flow/runtime/lib/python3.9/site-packages/promptflow/tracing/_trace.py", line 561, in wrapped
token_collector.collect_openai_tokens_for_parent_span(span)
File "/azureml-envs/prompt-flow/runtime/lib/python3.9/site-packages/promptflow/tracing/_trace.py", line 143, in collect_openai_tokens_for_parent_span
merged_tokens = {
File "/azureml-envs/prompt-flow/runtime/lib/python3.9/site-packages/promptflow/tracing/_trace.py", line 144, in
Getting this error.
I'm able to run locally, but I get TypeError: unsupported operand type(s) for +: 'int' and 'NoneType' after deploying using Azure AI Studio
Rolling back the version of openai also works,
openai<=1.44.1 resolves the error. Looks like promptflow-tracing could be incompatible with 1.45.0 https://pypi.org/project/openai/#history
This did the trick for me!! Thank you so much <3
I have found a PR that may be relevant to this issue:
https://github.com/microsoft/promptflow/pull/3793
I'm also noticing this issue recently (it was working a month ago) when deploying a prompt flow from Azure AI Studio, created from the Chat playground.
We encountered the problem as well. Adding the following lines to the promptflow YAML worked:
environment_variables:
  PF_DISABLE_TRACING: true
As @JanWerder suggested, this did the trick for me:
environment_variables:
  PF_DISABLE_TRACING: true
This is fixed with promptflow-tracing 1.16.1
@luigiw It does not seem that this fix works with the base promptflow Docker runtime image, as the image is still based on Python 3.9.
@luigiw After updating to 1.16.1 the error goes away, which is good, but a lot of warnings still persist:
WARNING:opentelemetry.attributes:Invalid type NoneType for attribute '__computed__.cumulative_token_count.completion' value. Expected one of ['bool', 'str', 'bytes', 'int', 'float'] or a sequence of those types
WARNING:opentelemetry.attributes:Invalid type NoneType for attribute '__computed__.cumulative_token_count.prompt' value. Expected one of ['bool', 'str', 'bytes', 'int', 'float'] or a sequence of those types
WARNING:opentelemetry.attributes:Invalid type NoneType for attribute 'llm.usage.completion_tokens_details' value. Expected one of ['bool', 'str', 'bytes', 'int', 'float'] or a sequence of those types
WARNING:opentelemetry.attributes:Invalid type NoneType for attribute 'llm.usage.prompt_tokens_details' value. Expected one of ['bool', 'str', 'bytes', 'int', 'float'] or a sequence of those types
WARNING:opentelemetry.attributes:Invalid type NoneType for attribute '__computed__.cumulative_token_count.completion' value. Expected one of ['bool', 'str', 'bytes', 'int', 'float'] or a sequence of those types
WARNING:opentelemetry.attributes:Invalid type NoneType for attribute '__computed__.cumulative_token_count.prompt' value. Expected one of ['bool', 'str', 'bytes', 'int', 'float'] or a sequence of those types
And what I notice in AppInsights is that the logs for LLM tools show token consumption (completion, prompt, and total), but the "same" metrics for the whole flow show only the total token count and "0" for the other two:
Leaf tool node:
1 level up:
root:
I have found a PR that may be relevant to this issue: #3806
Still getting this error:
service failed to complete the prompt", TypeError("unsupported operand type(s) for +: 'dict' and 'dict'"))
2024-12-11 15:13:53 -0500 36540 execution.flow WARNING Failed to calculate metrics due to exception: unsupported operand type(s) for +: 'int' and 'dict'.
Versions:
promptflow 1.16.2
promptflow-core 1.16.2
promptflow-devkit 1.16.2
promptflow-tracing 1.16.2
Python 3.11.7
I found another similar open issue and added my report there, since this issue has been closed.
@redhatpeter same issue here
Is this issue now fixed?
It seems you still need to do the following in the YAML file:
environment_variables:
  PF_DISABLE_TRACING: true
I use flex flows and was able to fix this bug by overriding the collect_openai_tokens method on promptflow.tracing._trace.token_collector. This lets me filter to only the "completion_tokens", "prompt_tokens", and "total_tokens" keys, which always have integer values. With this change in place you can use the latest versions of the promptflow and openai packages.
Add the following code to the top of your flex flow.
import inspect
import types

from promptflow.tracing._trace import token_collector


def collect_openai_tokens(self, span, output):
    """Override method for the promptflow.tracing._trace.TokenCollector class.

    Original implementation is here: https://github.com/microsoft/promptflow/blob/40c84b46f48cc7d02b6188e244e3dd8b0dde4743/src/promptflow-tracing/promptflow/tracing/_trace.py#L111
    """
    span_id = span.get_span_context().span_id
    if (
        not inspect.isgenerator(output)
        and hasattr(output, "usage")
        and output.usage is not None
    ):
        tokens = output.usage.dict()
        if tokens:
            # Filter to only the integer-valued token counters
            tokens = {
                k: v if v is not None and type(v) is int else 0
                for k, v in tokens.items()
                if k in ["completion_tokens", "prompt_tokens", "total_tokens"]
            }
            with self._lock:
                self._span_id_to_tokens[span_id] = tokens


token_collector.collect_openai_tokens = types.MethodType(
    collect_openai_tokens, token_collector
)
Also still having this issue. I don't want to give up tracing as it's extremely useful for my development. I'm using prompt flow via the yaml-based declarative approach. I have tried openai==1.44.0, promptflow-tracing==1.16.1
EDIT: Upgrading all promptflow packages to the Jan 6th release, promptflow 1.17.0, fixed the issue for me. The fix is in the release notes. https://pypi.org/project/promptflow/1.17.0/
Nice, it worked for me as well after upgrading the promptflow package.
I can confirm it works for me as well with promptflow 1.17.0. :) Thanks, all.
In promptflow/tracing/_openai_utils.py:
def _get_openai_metrics_for_signal_api(self, api_call: dict):
    inputs = api_call.get("inputs")
    output = api_call.get("output")
    if isinstance(output, dict):
        usage = output.get("usage")
        if isinstance(usage, dict):
            # Drop the nested *_details entries (added by newer openai versions)
            # so only the integer counters remain; pop() with a default avoids the
            # case where a missing first key would skip deleting the second one.
            usage.pop('prompt_tokens_details', None)
            usage.pop('completion_tokens_details', None)
            return usage
        self._log_warning(
            "Cannot find openai metrics in output, will calculate metrics from response data directly."
        )
This will remove the dictionary keys with composite metrics and remove the error.
This fixes the immediate problem, but it still leaves the detailed metrics hidden when they should be visible.
See https://cookbook.openai.com/examples/prompt_caching101 for why.