openllmetry
openllmetry copied to clipboard
🚀 Feature: log content filter results in proper attributes (Azure OpenAI)
Which component is this feature for?
OpenAI Instrumentation
🔖 Feature description
Azure OpenAI has an option to filter inappropriate content, which returns a slightly different response. While #854 handles this partially, we should explicitly log all the reasons in separate attributes.
🎤 Why is this feature needed ?
✌️ How do you aim to achieve this?
Create one or more attributes to log the reasons content was redacted.
🔄️ Additional Information
No response
👀 Have you spent some time to check if this feature request has been raised before?
- [X] I checked and didn't find similar issue
Are you willing to submit PR?
None
Interested
@dtee1 yes please :) let me know if you have any questions - you're welcome to join our slack as well.
This change definitely has bugs. I updated to the latest version of traceloop-sdk today, 0.16.6, but even though gpt4's responses are normally output to the console without any filtering issues, it still shows FILTERED on traceloop.
According to the Azure official documentation https://learn.microsoft.com/en-us/azure/ai-services/openai/concepts/content-filter?tabs=warning%2Cpython-new
data: {"id":"","object":"","created":0,"model":"","prompt_annotations":[{"prompt_index":0,"content_filter_results":{"hate":{"filtered":false,"severity":"safe"},"self_harm":{"filtered":false,"severity":"safe"},"sexual":{"filtered":false,"severity":"safe"},"violence":{"filtered":false,"severity":"safe"}}}],"choices":[],"usage":null}
data: {"id":"chatcmpl-7rCNsVeZy0PGnX3H6jK8STps5nZUY","object":"chat.completion.chunk","created":1692913344,"model":"gpt-35-turbo","choices":[{"index":0,"finish_reason":null,"delta":{"role":"assistant"}}],"usage":null}
data: {"id":"chatcmpl-7rCNsVeZy0PGnX3H6jK8STps5nZUY","object":"chat.completion.chunk","created":1692913344,"model":"gpt-35-turbo","choices":[{"index":0,"finish_reason":null,"delta":{"content":"Color"}}],"usage":null}
data: {"id":"chatcmpl-7rCNsVeZy0PGnX3H6jK8STps5nZUY","object":"chat.completion.chunk","created":1692913344,"model":"gpt-35-turbo","choices":[{"index":0,"finish_reason":null,"delta":{"content":" is"}}],"usage":null}
data: {"id":"chatcmpl-7rCNsVeZy0PGnX3H6jK8STps5nZUY","object":"chat.completion.chunk","created":1692913344,"model":"gpt-35-turbo","choices":[{"index":0,"finish_reason":null,"delta":{"content":" a"}}],"usage":null}
...
data: {"id":"","object":"","created":0,"model":"","choices":[{"index":0,"finish_reason":null,"content_filter_results":{"hate":{"filtered":false,"severity":"safe"},"self_harm":{"filtered":false,"severity":"safe"},"sexual":{"filtered":false,"severity":"safe"},"violence":{"filtered":false,"severity":"safe"}},"content_filter_offsets":{"check_offset":44,"start_offset":44,"end_offset":198}}],"usage":null}
...
data: {"id":"chatcmpl-7rCNsVeZy0PGnX3H6jK8STps5nZUY","object":"chat.completion.chunk","created":1692913344,"model":"gpt-35-turbo","choices":[{"index":0,"finish_reason":"stop","delta":{}}],"usage":null}
data: {"id":"","object":"","created":0,"model":"","choices":[{"index":0,"finish_reason":null,"content_filter_results":{"hate":{"filtered":false,"severity":"safe"},"self_harm":{"filtered":false,"severity":"safe"},"sexual":{"filtered":false,"severity":"safe"},"violence":{"filtered":false,"severity":"safe"}},"content_filter_offsets":{"check_offset":506,"start_offset":44,"end_offset":571}}],"usage":null}
data: [DONE]
I believe you need to check whether the finish_reason is 'stop' or 'content_filter', rather than checking if the content_filter_results is null.
@sukidesuka thanks for flagging! I was lazy and didn't test it properly for Azure. I've now added proper testing specifically for Azure and fixed the issue so this won't happen. See https://github.com/traceloop/openllmetry/pull/886
I can help!
Sounds good @lauradowell! @dtee1, are you still on this?
I'm interested! Is it still open
Yes @jonhsma!
On it! Will ping you on Slack if I have any questions!