langserve
Streaming doesn't seem to work properly when using an AzureChatOpenAI model.
Hello,
I'm unable to get an SSE streaming endpoint working correctly with an Azure-hosted OpenAI model (via the AzureChatOpenAI class).
I'm using a simple LCEL chain:
chain = promptTemplate | model | parser
When using an AzureChatOpenAI model, the SSE events are not streamed as the tokens are generated. Instead, all of the events arrive at once, after generation completes.
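To be concrete about what I'm seeing, here's roughly how I consume the endpoint with langserve's RemoteRunnable client (the URL and input are placeholders, not my real values):

from langserve import RemoteRunnable

# Placeholder URL; .stream() consumes the /stream SSE endpoint under the hood.
remote_chain = RemoteRunnable("http://localhost:8000/chain")
for chunk in remote_chain.stream({"question": "What is the capital of France?"}):
    print(chunk)  # with AzureChatOpenAI, these all arrive together at the end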
However, if I replace the AzureChatOpenAI model with a ChatOpenAI model (using the same prompt, function bindings, etc.), the stream DOES work as intended, returning SSE events in real-time as the tokens are generated.
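For reference, the only change on the working path is the model class (the model name here is a placeholder):

from langchain.chat_models import ChatOpenAI

# Identical prompt, function bindings, and parser; only the model is swapped.
model = ChatOpenAI(model_name="gpt-3.5-turbo", streaming=True)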
So I believe I've isolated the problem to the AzureChatOpenAI model itself.
Is this an issue with langserve or langchain? Does AzureChatOpenAI not support streaming in an LCEL chain?
Any insight or workarounds would be appreciated.
Thanks!
@mlamothe-zz thanks for reporting! Which parser are you using?
Thanks, Eugene.
I'm using langchain.output_parsers.openai_functions.JsonOutputFunctionsParser
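Here's a minimal sketch of how the chain is put together. The deployment name, function schema, prompt, and route path below are placeholders rather than my real values, and Azure credentials are assumed to come from environment variables:

from fastapi import FastAPI
from langchain.chat_models import AzureChatOpenAI
from langchain.output_parsers.openai_functions import JsonOutputFunctionsParser
from langchain.prompts import ChatPromptTemplate
from langserve import add_routes

promptTemplate = ChatPromptTemplate.from_template("Answer this question: {question}")

# Placeholder function schema; my real one is more involved.
functions = [{
    "name": "answer",
    "description": "Return the answer as structured JSON",
    "parameters": {
        "type": "object",
        "properties": {"answer": {"type": "string"}},
        "required": ["answer"],
    },
}]

model = AzureChatOpenAI(
    deployment_name="my-deployment",  # placeholder Azure deployment name
    streaming=True,
).bind(functions=functions, function_call={"name": "answer"})

parser = JsonOutputFunctionsParser()
chain = promptTemplate | model | parser

app = FastAPI()
add_routes(app, chain, path="/chain")  # /chain/stream is the SSE endpoint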
I suspect that this is not a langserve issue, but some bug in the streaming version of the parser.
Could you confirm whether this chain (without the server) streams for you?
chain = promptTemplate | model | parser

for chunk in chain.stream(...):
    print(chunk)
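It might also help to stream without the parser attached, to separate the model's streaming from the parser's (the input dict is a placeholder for whatever variables your prompt expects):

# Stream prompt | model with no parser; if chunks print incrementally here,
# the model side is fine and the parser is the likely culprit.
for chunk in (promptTemplate | model).stream({"question": "Why is the sky blue?"}):
    print(chunk, flush=True)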
So, streaming the chain directly (without the server) exhibits the same behaviour I described above: the AzureChatOpenAI model does not stream properly, but the ChatOpenAI model does.
I think you're right -- the issue isn't with langserve.
Shall I file this with the langchain team?
OK great, this is in the JSON parser then.
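In the meantime, a possible workaround is to drop the parser from the streamed chain and accumulate the function-call arguments yourself. A rough sketch, assuming functions are bound on the model as in your setup (the input is a placeholder):

import json

chain_without_parser = promptTemplate | model  # parser removed from the chain

buffer = ""
for chunk in chain_without_parser.stream({"question": "What is the capital of France?"}):
    # With functions bound, the partial JSON arguments arrive on each chunk
    # in additional_kwargs rather than in chunk.content.
    call = chunk.additional_kwargs.get("function_call") or {}
    piece = call.get("arguments", "")
    buffer += piece
    print(piece, end="", flush=True)

result = json.loads(buffer)  # parse the complete arguments once the stream ends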