autogen Support for WebSockets, streaming responses to a frontend [Feature Request]:

Is your feature request related to a problem? Please describe.

I am using a Microsoft Teams App for a frontend, and hosting Autogen on my backend. When I send a message to Autogen from my frontend, Autogen does some processing and then sends back the final answer. This can take up to 2 minutes, which is too long for my customers. The vast majority of that time is taken up by GPT-4 generating an answer. I would like it if the final answer generated by Autogen could be streamed back to the frontend, using websockets or another protocol that is compatible with Microsoft Teams Apps. This would give my users something to look at immediately, rather than waiting a long time for the complete answer to pop up.

Describe the solution you'd like

An easy, plug-and-play solution for Microsoft Teams Apps that allows Autogen to stream information to the Teams App.

Additional context

No response

Jan 10 '24 17:01 tyler-suard-parker

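Hi @tyler-suard-parker, we are facing a similar issue. There is a workaround, though; you can try this:

```python
# Assumed imports; `app`, `socket_io`, `interviewer`, and `groupchat_manager`
# are created elsewhere in the application.
from autogen import GroupChatManager, UserProxyAgent
from flask import request


# The `def` line was missing from the original post; the name is taken from
# the assignment below and the parameters from how the body uses them.
def new_print_received_message(self, message, sender):
    # Forward only the two agents of interest to the frontend.
    if sender.name == "INTERVIEWER" or sender.name == "candidate":
        print(f"hiii {sender.name}: {message.get('content')}")
        socket_io.emit(
            "message", {"sender": sender.name, "content": message.get("content")}
        )
    else:
        print("Ignored")


GroupChatManager._print_received_message = new_print_received_message


@app.route("/run", methods=["GET"])
def run():
    def new_get_human_input(self, prompt):
        # Take the human reply from the request's query string instead of stdin.
        reply = request.args.get("stock")
        print("debug reply line 81", reply)
        # reply = input(f'PATCHED{prompt}')
        return reply

    UserProxyAgent.get_human_input = new_get_human_input

    mychat = interviewer.initiate_chat(
        groupchat_manager,
        message="Hello, I am an AI MANAGER, my name is John Dalley, I will be conducting your interview today.",
    )
    return "done"  # a Flask view must return a response
```

So basically you need to write a wrapper and assign it to `GroupChatManager._print_received_message`.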

Jan 10 '24 17:01 Risingabhi

Just a suggestion: by adding "TERMINATE" at the beginning of the last message, rather than at the end of the last message, we could know which message is the final one, so that message could be streamed to the frontend.
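A minimal sketch of that idea, assuming the agents are prompted to put the marker first. `is_final_message` and `strip_marker` are hypothetical helpers for illustration, not part of Autogen:

```python
TERMINATE_MARKER = "TERMINATE"  # assumed convention: the marker leads the final message


def is_final_message(content: str) -> bool:
    """Return True when the message starts with the termination marker."""
    return content.lstrip().startswith(TERMINATE_MARKER)


def strip_marker(content: str) -> str:
    """Drop a leading marker so only the answer text reaches the frontend."""
    text = content.lstrip()
    if text.startswith(TERMINATE_MARKER):
        text = text[len(TERMINATE_MARKER):]
    return text.lstrip()
```

With the marker at the front, the first streamed chunk is already enough to decide whether the rest of the message should be forwarded.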

Jan 10 '24 17:01 tyler-suard-parker

@Risingabhi that is an interesting approach, I agree, it might be a good idea to modify the print statement to send messages on the websockets instead. Thank you!

Jan 10 '24 17:01 tyler-suard-parker

Can anyone share code showing how to implement streaming for an external application? @Risingabhi, can you briefly explain your implementation?

Jan 14 '24 09:01 super-syan

@Risingabhi @tyler-suard-parker @super-syan we are refactoring and hoping to make it very easy to add streaming. See #1240 and please give your feedback and comments.

Jan 15 '24 02:01 ekzhu

@ekzhu I am not familiar with hooks, I was actually hoping for something more like SSE.

Jan 16 '24 22:01 tyler-suard-parker

What is SSE? As I mentioned in the other discussion, #1290, the goal of the refactoring effort is to modularize functionality as middleware, so it becomes very easy to extend an agent with things like emitting a message to a frontend receiver.

Jan 16 '24 22:01 ekzhu

> @ekzhu I am not familiar with hooks, I was actually hoping for something more like SSE.

@tyler-suard-parker the hooks were replaced by the Middleware pattern, which seems like the best design pattern to use here. For SSE, you basically need a generator, which can be easily implemented using the Middleware pattern. We will implement it as well while implementing other features.
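To illustrate the "generator" part only, here is a rough sketch of wrapping text chunks as Server-Sent Events frames. The function name `sse_events`, the JSON payload shape, and the closing `done` event are assumptions made for this example; none of this is Autogen's API or the planned middleware:

```python
import json
from typing import Iterable, Iterator


def sse_events(chunks: Iterable[str]) -> Iterator[str]:
    """Wrap text chunks as Server-Sent Events frames ("data: ...\\n\\n")."""
    for chunk in chunks:
        # JSON-encode the payload so newlines inside a chunk cannot break the frame.
        payload = json.dumps({"content": chunk})
        yield f"data: {payload}\n\n"
    # Hypothetical end-of-stream event so the client knows when to stop listening.
    yield "event: done\ndata: {}\n\n"
```

In a Flask app, a generator like this could be returned as `Response(sse_events(chunks), mimetype="text/event-stream")`.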

Jan 16 '24 23:01 davorrunje

@ekzhu Sorry about that, SSE is server-sent events.

Jan 17 '24 01:01 tyler-suard-parker

@Risingabhi Could you give some more details about your implementation? Where are you declaring the server, client, etc?

Jan 18 '24 01:01 tyler-suard-parker

I just tried @Risingabhi's implementation. It does not work with streaming: it sends the complete message to a socket, not the individual chunks.

Jan 18 '24 20:01 tyler-suard-parker

I found a solution for this. I am using a singleton class in a file in the autogen/oai directory. I import that singleton class into my top script, and I store my websocket as a variable in that class. I also modify autogen/oai/client.py to import that same script and that same websocket, which I can then use to stream data from the _completions_create function in that file.
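A rough sketch of the singleton idea, for readers who want to try it. The class name `StreamSink` and its `send`/`emit` members are hypothetical; the actual change to autogen/oai/client.py is not shown in this thread:

```python
from typing import Callable, Optional


class StreamSink:
    """Process-wide holder for a 'send' callable (hypothetical helper)."""

    _instance: Optional["StreamSink"] = None
    send: Optional[Callable[[str], None]]

    def __new__(cls) -> "StreamSink":
        # Always hand back the same object, so every importer shares one sink.
        if cls._instance is None:
            cls._instance = super().__new__(cls)
            cls._instance.send = None
        return cls._instance

    def emit(self, chunk: str) -> bool:
        """Forward a chunk if a sender is registered; report whether it was sent."""
        if self.send is None:
            return False
        self.send(chunk)
        return True
```

The top-level script would set `StreamSink().send` to something that writes to the websocket, and the streaming loop would call `StreamSink().emit(chunk)` for each chunk. Leaving `send` unset turns forwarding off.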

Jan 22 '24 17:01 tyler-suard-parker

I think we need to implement this in the framework for everyone else.

Jan 22 '24 19:01 davorrunje

@davorrunje I agree and I would be happy to do it, but my use case is very specific. I am running on Azure Web Apps and returning a stream to an Azure Teams App. Also I am not quite sure how to disable the streaming to a UI if it is not needed.

Jan 22 '24 19:01 tyler-suard-parker

Let's keep it open and I think I might take it soon. Any help would be appreciated though :)

Jan 22 '24 19:01 davorrunje

Hi, I have implemented streaming for my app; check out my repository. You can enable and disable it using the use_socket Boolean. Refer to simple _chat.py. You have to pass the socket server instance as a callback. It still lacks some checks, but it is usable.

Jan 22 '24 19:01 GokulrajKS

I am currently trying to get it working with a React frontend. I already tried it with Panel, and it worked fine. Will let you know soon.


Jan 24 '24 19:01 Risingabhi

@Risingabhi Let us know if there is any news.

Jan 25 '24 19:01 ahernandezq

All, I finally bit the bullet and figured out how to do this. I instantiated a websocket at the top level and passed it down as a parameter through every function between the top level and the _completions_create function in oai/client.py. It works!
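The shape of that approach, sketched with a plain callback instead of a real websocket. `run_chat` and `collect_stream` are hypothetical stand-ins for the layers between the top level and the completion call, not Autogen functions:

```python
from typing import Callable, Iterable, Optional

ChunkCallback = Optional[Callable[[str], None]]


def collect_stream(chunks: Iterable[str], on_chunk: ChunkCallback = None) -> str:
    """Innermost layer: forward each chunk as it arrives, then return the full text."""
    parts = []
    for chunk in chunks:
        if on_chunk is not None:  # passing None disables forwarding to a UI
            on_chunk(chunk)
        parts.append(chunk)
    return "".join(parts)


def run_chat(chunks: Iterable[str], on_chunk: ChunkCallback = None) -> str:
    """Stand-in for an intermediate layer that only passes the callback down."""
    return collect_stream(chunks, on_chunk=on_chunk)
```

At the top level, `on_chunk` would be something that writes to the websocket. Defaulting it to `None` is one way to switch streaming off when no UI is attached.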

Jan 25 '24 19:01 tyler-suard-parker

Awesome, thanks for the update!


Jan 25 '24 21:01 Risingabhi
