mlx-vlm
Add sequence number handling to streaming events in OpenAI endpoint
- Introduced a SequenceNumber class to manage and increment sequence numbers for streaming events.
- Updated the event responses in the openai_endpoint to include the sequence number, for compliance with the OpenAI Responses API streaming spec.
- Enhanced event data consistency by incorporating sequence numbers in response events such as response.created, response.in_progress, and others.
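The idea behind the counter can be sketched as follows. This is a minimal illustration of a monotonically increasing per-stream counter, not the actual `SequenceNumber` implementation from the PR, whose API may differ:

```python
class SequenceNumber:
    """Monotonically increasing counter attached to streaming events.

    Illustrative sketch only; the real class in mlx-vlm may differ.
    """

    def __init__(self, start: int = 0) -> None:
        self._value = start

    def next(self) -> int:
        """Return the current value, then increment for the next event."""
        value = self._value
        self._value += 1
        return value


# One counter per streaming response; every emitted event gets the next value.
seq = SequenceNumber()
created_event = {"type": "response.created", "sequence_number": seq.next()}
```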
Thanks @dwohlfahrt!
Could you share a bit more about this sequence number? And why is it important?
Hey @Blaizzy, sequence_number is part of the OpenAI Responses API spec for streaming responses. See here: https://platform.openai.com/docs/api-reference/responses-streaming/response/created#responses-streaming/response/created-sequence_number. So when trying to consume this endpoint with streaming responses using the OpenAI Python SDK, it throws a schema validation error, since the streaming responses don't contain the sequence_number field.
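To make the failure mode concrete, here is a rough sketch of what the server side has to emit per event so the SDK's schema validation passes. The `make_event` helper and the payload keys other than `type` and `sequence_number` are hypothetical, for illustration only:

```python
import itertools
import json

# One monotonic counter shared by all events of a single streaming response.
_counter = itertools.count()


def make_event(event_type: str, **payload) -> str:
    """Serialize one streaming event, attaching the next sequence_number.

    Without the sequence_number field, the OpenAI Python SDK rejects the
    event during schema validation.
    """
    event = {"type": event_type, "sequence_number": next(_counter), **payload}
    return json.dumps(event)


first = make_event("response.created")
second = make_event("response.in_progress")
```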