haystack
Handle incorrect responses from OpenAI in _extract_token method for stream:True
Hello,
In the _extract_token method of the ChatGPTInvocationLayer class, there's an assumption that the event_data["choices"] array will always have at least one item with a delta key. However, based on observed behavior, OpenAI might provide responses without this expected structure on some preview endpoints. This can lead to unexpected errors.
Steps To Reproduce:
- Use an OpenAI endpoint that might return a response without the expected structure. For example, my chunks look like this:
data: {"id":"","object":"","created":0,"model":"","prompt_filter_results":[{"prompt_index":0,"content_filter_results":{"hate":{"filtered":false,"severity":"safe"},"self_harm":{"filtered":false,"severity":"safe"},"sexual":{"filtered":false,"severity":"safe"},"violence":{"filtered":false,"severity":"safe"}}}],"choices":[],"usage":null}
data: {"id":"chatcmpl-XXXX","object":"chat.completion.chunk","created":1692979866,"model":"gpt-35-turbo","choices":[{"index":0,"finish_reason":null,"delta":{"role":"assistant"},"content_filter_results":{}}],"usage":null}
data: {"id":"chatcmpl-YYY","object":"chat.completion.chunk","created":1692979866,"model":"gpt-35-turbo","choices":[{"index":0,"finish_reason":"stop","delta":{},"content_filter_results":{}}],"usage":null}
data: [DONE]
- Process the response using the ChatGPTInvocationLayer class.
- Observe errors due to the missing expected structure.
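The failure can be sketched without calling any endpoint. The snippet below is a minimal reproduction under assumptions: extract_token_unchecked and reproduce are hypothetical names, the function is a free-standing copy of the logic quoted under "Code reference", and the payload is the first chunk from this report with the content-filter fields trimmed.

```python
import json
from typing import Any, Dict, Optional

# First chunk from the report, trimmed to the fields relevant here.
# Note the empty "choices" list.
FIRST_CHUNK = '{"id":"","object":"","created":0,"model":"","choices":[],"usage":null}'


def extract_token_unchecked(event_data: Dict[str, Any]) -> Optional[str]:
    # Same logic as the quoted method, written as a free function (assumption).
    delta = event_data["choices"][0]["delta"]
    if "content" in delta:
        return delta["content"]
    return None


def reproduce() -> str:
    """Run the unchecked extraction on the first chunk and report what happens."""
    event_data = json.loads(FIRST_CHUNK)
    try:
        extract_token_unchecked(event_data)
    except IndexError as exc:
        # An empty list has no element 0, so indexing it raises IndexError.
        return f"IndexError: {exc}"
    return "no error"


if __name__ == "__main__":
    print(reproduce())
```

With an empty choices list the exception is an IndexError rather than a KeyError; a KeyError would only appear if the "choices" or "delta" key were missing entirely.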
Expected behavior:
The _extract_token method should gracefully handle scenarios where event_data["choices"] doesn't follow the expected structure. If there's no data in a chunk or if the structure is different, the method should return an appropriate response or error message without crashing.
Code reference:
def _extract_token(self, event_data: Dict[str, Any]):
    delta = event_data["choices"][0]["delta"]
    if "content" in delta:
        return delta["content"]
    return None
Link to the line of _extract_token
Solution suggestion:
Consider adding checks to ensure event_data["choices"] has the expected structure before accessing its items. On encountering a structure that deviates from the norm, the method could either log a warning and return None or handle the deviation in another user-friendly manner.
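One possible shape for such a check is sketched below. This is not the actual Haystack fix: extract_token_safe is a hypothetical free function standing in for the method, and the logger name and warning messages are assumptions.

```python
import logging
from typing import Any, Dict, Optional

logger = logging.getLogger(__name__)


def extract_token_safe(event_data: Dict[str, Any]) -> Optional[str]:
    """Return the streamed token, or None when the chunk carries no content."""
    choices = event_data.get("choices")
    # Covers a missing key, a non-list value, and an empty list.
    if not isinstance(choices, list) or not choices:
        logger.warning("Unexpected chunk: 'choices' is missing or empty: %s", event_data)
        return None

    first = choices[0]
    delta = first.get("delta") if isinstance(first, dict) else None
    if not isinstance(delta, dict):
        logger.warning("Unexpected chunk: first choice has no 'delta' object: %s", event_data)
        return None

    # A delta without "content" (role-only or final chunk) is normal, so no warning.
    return delta.get("content")


if __name__ == "__main__":
    print(extract_token_safe({"choices": []}))
    print(extract_token_safe({"choices": [{"delta": {"content": "Hi"}}]}))
```

Applied to the three chunks listed under "Steps To Reproduce", this sketch returns None for each of them and logs a warning only for the first, since the other two have a well-formed delta that simply carries no content.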
Looking forward to hearing your thoughts and insights on this!
Thank you for your response!
Best regards, Andrej
Hi @azachar and thanks for the detailed report!
I can't prioritize this work right now, but I think the solution you suggest is a good one: trapping a KeyError and returning None, with a message saying that the format was not what we expected.
I'm putting this up for contributions wanted!
@masci I'd like to work on this issue. Just to be clear, is this a simple case of modifying the function to check that event_data has a certain structure and, if it doesn't, returning None and logging a warning?
Hey @masci, I would like to work on this issue. Could you please assign it to me?