
How can I stream to a frontend using SSE?

Open tyler-suard-parker opened this issue 1 year ago • 1 comments

Describe the issue

I would like to host Autogen on Flask, and then stream the responses to a frontend using SSE. I am not sure how to do that. SSE can return streaming messages if it uses a generator, but I am not sure how to get the output out from the deep innards of Autogen (client.py, conversable_agent.py) and bring it up to the top so Flask can return that stream. Any help on this would be much appreciated. @ragyabraham

Steps to reproduce

No response

Screenshots and logs

No response

Additional Information

No response

tyler-suard-parker avatar Jan 16 '24 22:01 tyler-suard-parker

Thanks for the issue! Could you expand a little bit on what SSE is?

ekzhu avatar Jan 16 '24 22:01 ekzhu

Yes, I'm sorry about that, SSE stands for server-sent events, which is, from what I understand, how ChatGPT streams information to its UI.

tyler-suard-parker avatar Jan 17 '24 01:01 tyler-suard-parker

Got it. Check if my understanding is correct: on the server side (Python) you would have to create a data stream as part of the response object with mime type "text/event-stream"? So, something similar to:

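from flask import Flask, Response

app = Flask(__name__)

@app.route("/stream")
def stream():
    def generate():
        # Each SSE message is a "data: ..." line terminated by a blank line.
        while True:
            yield "data: {}\n\n".format("Hello, World!")
    return Response(generate(), mimetype="text/event-stream")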

generate() is a generator function: calling it returns a generator that produces a stream of strings, which the front end can receive. The generator object itself is managed (iterated) by the Flask framework.
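For reference, here is a minimal sketch of how one event could be formatted on the wire. The helper name `format_sse` is hypothetical, not part of Flask or Autogen. It assumes the standard SSE framing: an optional `event:` line, one `data:` line per line of payload, and a blank line to end the event.

```python
def format_sse(data, event=None):
    """Format one server-sent event as a string (hypothetical helper)."""
    lines = []
    if event is not None:
        lines.append(f"event: {event}")
    # A multi-line payload must be sent as one "data:" line per line of text.
    for line in str(data).split("\n"):
        lines.append(f"data: {line}")
    # A blank line marks the end of the event.
    return "\n".join(lines) + "\n\n"
```

A generator like generate() could then yield `format_sse(message)` instead of building the string by hand, which matters once a message contains newlines.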

So the only thing we need to produce from the autogen side is a Python generator that runs in its own thread listening to a concurrent pipe and yields data from that pipe.

I think this can be achieved by registering a middleware (#1240) on the agent you want to stream responses from. Every time you catch its response in the middleware, you asynchronously write the response to the concurrent pipe the generator is listening to. This way, your Flask code can call the agent and directly return the generator in your Response object. Of course, you need to close the concurrent pipe in the middleware once you are done with it.
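A rough sketch of the pipe-plus-generator idea, using only the standard library. This is an illustration under assumptions, not Autogen's API: `fake_agent_run` is a hypothetical stand-in for whatever hook or middleware catches the agent's replies, a `queue.Queue` plays the role of the concurrent pipe, and a sentinel object stands in for closing it. In this variant the agent runs in a background thread and the generator consumes from the queue.

```python
import queue
import threading

_DONE = object()  # sentinel that tells the generator the producer has finished


def fake_agent_run(out_queue):
    """Hypothetical stand-in for the agent: pushes each reply onto the queue."""
    try:
        for reply in ["Hello", "from", "the agent"]:
            out_queue.put(reply)
    finally:
        # Always signal completion so the generator does not block forever.
        out_queue.put(_DONE)


def stream_replies(producer):
    """Run `producer` in a background thread and yield its output as SSE."""
    out_queue = queue.Queue()
    worker = threading.Thread(target=producer, args=(out_queue,), daemon=True)
    worker.start()
    while True:
        item = out_queue.get()  # blocks until the producer writes something
        if item is _DONE:
            break
        # Assumes single-line replies; multi-line text needs one "data:" line each.
        yield f"data: {item}\n\n"
    worker.join()
```

In a Flask view, the generator could then be returned directly, e.g. `Response(stream_replies(fake_agent_run), mimetype="text/event-stream")`, with the real agent call replacing the stand-in producer.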

ekzhu avatar Jan 17 '24 20:01 ekzhu