mlx-vlm
Add sequence number handling to streaming events in OpenAI endpoint
- Introduced a SequenceNumber class to manage and increment sequence numbers for streaming events.
- Updated the event responses in the openai_endpoint to include the sequence number, for compliance with the OpenAI Responses API streaming spec.
- Enhanced event data consistency by incorporating sequence numbers in response events such as response.created, response.in_progress, and others.
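The idea behind the counter can be sketched as follows. This is a minimal illustration of a monotonically increasing per-stream counter, not the actual `SequenceNumber` implementation from the PR, whose API may differ:

```python
class SequenceNumber:
    """Monotonically increasing counter attached to streaming events.

    Illustrative sketch only; the real class in mlx-vlm may differ.
    """

    def __init__(self, start: int = 0) -> None:
        self._value = start

    def next(self) -> int:
        """Return the current value, then increment for the next event."""
        value = self._value
        self._value += 1
        return value


# One counter per streaming response; every emitted event gets the next value.
seq = SequenceNumber()
created_event = {"type": "response.created", "sequence_number": seq.next()}
```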
Thanks @dwohlfahrt!
Could you share a bit more about this sequence number? And why is it important?
Hey @Blaizzy, sequence_number is part of the OpenAI Responses API spec for streaming responses. See here: https://platform.openai.com/docs/api-reference/responses-streaming/response/created#responses-streaming/response/created-sequence_number. So when trying to consume this endpoint with streaming responses using the OpenAI Python SDK, it throws a schema validation error, since the streaming responses don't contain the sequence_number field.
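To make the failure mode concrete, here is a rough sketch of what the server side has to emit per event so the SDK's schema validation passes. The `make_event` helper and the payload keys other than `type` and `sequence_number` are hypothetical, for illustration only:

```python
import itertools
import json

# One monotonic counter shared by all events of a single streaming response.
_counter = itertools.count()


def make_event(event_type: str, **payload) -> str:
    """Serialize one streaming event, attaching the next sequence_number.

    Without the sequence_number field, the OpenAI Python SDK rejects the
    event during schema validation.
    """
    event = {"type": event_type, "sequence_number": next(_counter), **payload}
    return json.dumps(event)


first = make_event("response.created")
second = make_event("response.in_progress")
```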