
Handle incorrect responses from OpenAI in _extract_token method for stream:True

Open azachar opened this issue 1 year ago • 3 comments

Hello,

In the _extract_token method of the ChatGPTInvocationLayer class, there's an assumption that the event_data["choices"] array will always have at least one item with a delta key. However, based on observed behavior, OpenAI might provide responses without this expected structure on some preview endpoints. This can lead to unexpected errors.

Steps To Reproduce:

  1. Use an OpenAI endpoint that might return a response without the expected structure. For example, my chunks look like this:
data: {"id":"","object":"","created":0,"model":"","prompt_filter_results":[{"prompt_index":0,"content_filter_results":{"hate":{"filtered":false,"severity":"safe"},"self_harm":{"filtered":false,"severity":"safe"},"sexual":{"filtered":false,"severity":"safe"},"violence":{"filtered":false,"severity":"safe"}}}],"choices":[],"usage":null}

data: {"id":"chatcmpl-XXXX","object":"chat.completion.chunk","created":1692979866,"model":"gpt-35-turbo","choices":[{"index":0,"finish_reason":null,"delta":{"role":"assistant"},"content_filter_results":{}}],"usage":null}

data: {"id":"chatcmpl-YYY","object":"chat.completion.chunk","created":1692979866,"model":"gpt-35-turbo","choices":[{"index":0,"finish_reason":"stop","delta":{},"content_filter_results":{}}],"usage":null}

data: [DONE]
  2. Process the response using the ChatGPTInvocationLayer class.
  3. Observe the errors caused by the missing expected structure.
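
The failure can be reproduced offline by feeding the chunks above to a standalone copy of the method's logic. This is only a minimal sketch: `extract_token_original` and `try_extract` are hypothetical helper names, not part of Haystack, and the chunk payloads are abbreviated versions of the ones shown in step 1.

```python
import json
from typing import Any, Dict, Optional

# First chunk from the stream above, abbreviated: "choices" is an empty list.
FILTER_CHUNK = '{"id":"","object":"","created":0,"model":"","choices":[],"usage":null}'

# Second chunk, abbreviated: the delta carries a role but no "content" key.
ROLE_CHUNK = (
    '{"id":"chatcmpl-XXXX","object":"chat.completion.chunk",'
    '"choices":[{"index":0,"finish_reason":null,"delta":{"role":"assistant"}}],'
    '"usage":null}'
)


def extract_token_original(event_data: Dict[str, Any]) -> Optional[str]:
    # Hypothetical free-function copy of the current _extract_token body.
    delta = event_data["choices"][0]["delta"]
    if "content" in delta:
        return delta["content"]
    return None


def try_extract(raw: str) -> str:
    # Returns the repr of the extracted token, or the name of the exception raised.
    try:
        return repr(extract_token_original(json.loads(raw)))
    except (IndexError, KeyError) as exc:
        return type(exc).__name__


if __name__ == "__main__":
    print(try_extract(FILTER_CHUNK))  # empty "choices" list
    print(try_extract(ROLE_CHUNK))    # delta without "content"
```

Note that indexing an empty `choices` list raises an `IndexError`, not a `KeyError`, so a fix that only traps `KeyError` would still crash on the first chunk.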

Expected behavior:
The _extract_token method should gracefully handle scenarios where event_data["choices"] doesn't follow the expected structure. If there's no data in a chunk or if the structure is different, the method should return an appropriate response or error message without crashing.

Code reference:

def _extract_token(self, event_data: Dict[str, Any]):
    delta = event_data["choices"][0]["delta"]
    if "content" in delta:
        return delta["content"]
    return None

Link to the line of _extract_token

Solution suggestion:
Consider adding checks to ensure event_data["choices"] has the expected structure before accessing its items. On encountering a structure that deviates from the norm, the method could either provide a warning log and return a None value or manage the deviation in another user-friendly manner.
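
To make the suggestion concrete, here is a rough sketch of what the checks could look like. It is written as a free function with the hypothetical name `extract_token_safe` so that it runs on its own; it is not the actual Haystack implementation, and the log messages are placeholders.

```python
import logging
from typing import Any, Dict, Optional

logger = logging.getLogger(__name__)


def extract_token_safe(event_data: Dict[str, Any]) -> Optional[str]:
    """Return the streamed token, or None if the chunk has no usable delta."""
    choices = event_data.get("choices")
    if not isinstance(choices, list) or not choices:
        # Covers a missing "choices" key and the empty list seen in the first chunk.
        logger.warning("Chunk has no choices, skipping: %s", event_data)
        return None

    first = choices[0]
    delta = first.get("delta") if isinstance(first, dict) else None
    if not isinstance(delta, dict):
        logger.warning("First choice has no delta, skipping: %s", event_data)
        return None

    # Same result as the original code: None when "content" is absent.
    return delta.get("content")
```

If the content-filter chunk turns out to arrive at the start of every stream on such endpoints, the first warning may be too noisy, and logging it at debug level could be the better choice.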

Looking forward to hearing your thoughts and insights on this!

Thank you for your response!

Best regards, Andrej

azachar, Aug 25 '23 16:08

Hi @azachar and thanks for the detailed report!

I can't prioritize this work right now, but I think the solution you suggest is a good one: trap the KeyError and return None, with a message saying that the format was not what we expected.

I'm putting this up for contributions wanted!

masci, Oct 03 '23 08:10

@masci I'd like to work on this issue. Just to be clear: is this a simple case of modifying the function to check that event_data has a certain structure and, if it doesn't, returning None and logging a warning?

ashutuptiwari, Oct 17 '23 14:10

Hey @masci, I would like to work on this issue. Could you please assign it to me?

adarsh-jha-dev, Oct 22 '23 17:10