Support for WebSockets, streaming responses to a frontend [Feature Request]:

Open tyler-suard-parker opened this issue 1 year ago • 10 comments

Is your feature request related to a problem? Please describe.

I am using a Microsoft Teams App for a frontend, and hosting Autogen on my backend. When I send a message to Autogen from my frontend, Autogen does some processing and then sends back the final answer. This can take up to 2 minutes, which is too long for my customers. The vast majority of that time is taken up by GPT-4 generating an answer. I would like it if the final answer generated by Autogen could be streamed back to the frontend, using websockets or another protocol that is compatible with Microsoft Teams Apps. This would give my users something to look at immediately, rather than waiting a long time for the complete answer to pop up.

Describe the solution you'd like

An easy, plug-and-play solution for Microsoft Teams Apps that allows Autogen to stream information to the Teams App.

Additional context

No response

tyler-suard-parker avatar Jan 10 '24 17:01 tyler-suard-parker

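Hi @tyler-suard-parker, we are facing a similar issue. There is a workaround, though; you can try this:

```python
# Assumed setup, not shown in this comment: `app` is a Flask app, `socket_io`
# is a Socket.IO server object, and `interviewer` and `groupchat_manager`
# are AutoGen agents created elsewhere.
from autogen import GroupChatManager, UserProxyAgent
from flask import request


# The `def` line was lost from the original paste; this signature is assumed.
def new_print_received_message(self, message, sender):
    if sender.name == "INTERVIEWER" or sender.name == "candidate":
        print(f"hiii {sender.name}: {message.get('content')}")
        # Forward the completed message to the frontend as well as printing it.
        socket_io.emit(
            "message", {"sender": sender.name, "content": message.get("content")}
        )
    else:
        print("Ignored")


GroupChatManager._print_received_message = new_print_received_message


@app.route("/run", methods=["GET"])
def run():
    def new_get_human_input(self, prompt):
        # Take the human reply from the query string instead of input().
        reply = request.args.get("stock")
        print("debug reply line 81", reply)
        # reply = input(f'PATCHED{prompt}')
        return reply

    UserProxyAgent.get_human_input = new_get_human_input

    mychat = interviewer.initiate_chat(
        groupchat_manager,
        message="Hello, I am an AI MANAGER, my name is John Dalley, I will be conducting your interview today.",
    )
    return "done"  # added: a Flask view has to return a response
```

So basically you need to write a wrapper and assign it to `GroupChatManager._print_received_message`.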

Risingabhi avatar Jan 10 '24 17:01 Risingabhi

Just a suggestion: by adding "TERMINATE" at the beginning of the last message, rather than at the end of the last message, we could know which message is the final one, so that message could be streamed to the frontend.
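
A minimal sketch of why a leading marker would help, assuming the reply arrives as an iterable of text chunks. The names `stream_if_final` and `send` are hypothetical and not part of Autogen. With the marker at the start, the decision to forward can be made after the first few characters instead of after the whole message has been generated:

```python
from typing import Callable, Iterable

MARKER = "TERMINATE"


def stream_if_final(chunks: Iterable[str], send: Callable[[str], None]) -> bool:
    """Forward chunks to `send` only if the message starts with MARKER.

    Returns True when the message was identified as the final one.
    """
    buffered = ""
    deciding = True
    is_final = False
    for chunk in chunks:
        if deciding:
            buffered += chunk
            if len(buffered) < len(MARKER) and MARKER.startswith(buffered):
                continue  # could still be the marker; wait for more text
            deciding = False
            is_final = buffered.startswith(MARKER)
            if is_final:
                # Drop the marker itself and forward whatever followed it.
                rest = buffered[len(MARKER):].lstrip()
                if rest:
                    send(rest)
        elif is_final:
            send(chunk)
    return is_final
```

With the marker at the end of the message, the same check could only succeed once the last chunk had arrived, which defeats the purpose of streaming.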

tyler-suard-parker avatar Jan 10 '24 17:01 tyler-suard-parker

@Risingabhi that is an interesting approach. I agree, it might be a good idea to modify the print statement to send messages over the websocket instead. Thank you!

tyler-suard-parker avatar Jan 10 '24 17:01 tyler-suard-parker

Can anyone share code showing how to implement streaming for an external application? @Risingabhi, can you briefly explain your implementation?

super-syan avatar Jan 14 '24 09:01 super-syan

@Risingabhi @tyler-suard-parker @super-syan we are refactoring and hoping to make it very easy to add streaming. See #1240 and please give your feedback and comments.

ekzhu avatar Jan 15 '24 02:01 ekzhu

@ekzhu I am not familiar with hooks, I was actually hoping for something more like SSE.

tyler-suard-parker avatar Jan 16 '24 22:01 tyler-suard-parker

What is SSE? As I mentioned in the other discussion, #1290, the goal of the refactoring effort is to modularize functionality as middleware, so that it becomes very easy to extend an agent with things like emitting a message to a frontend receiver.

ekzhu avatar Jan 16 '24 22:01 ekzhu

> @ekzhu I am not familiar with hooks, I was actually hoping for something more like SSE.

@tyler-suard-parker the hooks got replaced by the Middleware pattern, and this seems like the best design pattern to use. For SSE, you basically need a generator, which can be easily implemented using the Middleware pattern. We will implement it as well while implementing other features.
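
As a rough illustration of the "generator" part only (this is not the Middleware API, and `sse_events` is a hypothetical name), a generator that wraps text chunks in the Server-Sent Events wire format might look like this. The `done` event at the end is an assumed convention for telling the client to stop listening:

```python
from typing import Iterable, Iterator


def sse_events(chunks: Iterable[str]) -> Iterator[str]:
    """Wrap text chunks in the Server-Sent Events wire format."""
    for chunk in chunks:
        # A payload containing newlines is sent as several `data:` lines;
        # a blank line terminates each event.
        lines = chunk.split("\n")
        yield "".join(f"data: {line}\n" for line in lines) + "\n"
    # Assumed end-of-stream convention, not something SSE itself requires.
    yield "event: done\ndata: end\n\n"
```

In Flask, for instance, a generator like this can be passed to `Response(..., mimetype="text/event-stream")` so that each event is flushed to the browser as it is produced.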

davorrunje avatar Jan 16 '24 23:01 davorrunje

@ekzhu Sorry about that, SSE is server-sent events.

tyler-suard-parker avatar Jan 17 '24 01:01 tyler-suard-parker

@Risingabhi Could you give some more details about your implementation? Where are you declaring the server, client, etc?

tyler-suard-parker avatar Jan 18 '24 01:01 tyler-suard-parker

I just tried @Risingabhi's implementation. It does not work with streaming: it sends the complete message to a socket, not the chunks.

tyler-suard-parker avatar Jan 18 '24 20:01 tyler-suard-parker

I found a solution for this. I am using a singleton class in a file in the autogen/oai directory. I import that singleton class into my top script, and I store my websocket as a variable in that class. I also modify autogen/oai/client.py to import that same script and that same websocket, which I can then use to stream data from the _completions_create function in that file.
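
A minimal sketch of the singleton idea, not the actual patch: `StreamSink`, `send`, and `emit` are hypothetical names, and a plain list stands in for the websocket so the example runs on its own.

```python
from typing import Callable, Optional


class StreamSink:
    """Process-wide holder for the current outgoing connection."""

    _instance: Optional["StreamSink"] = None
    send: Optional[Callable[[str], None]]

    def __new__(cls) -> "StreamSink":
        # Always hand back the same object, creating it on first use.
        if cls._instance is None:
            cls._instance = super().__new__(cls)
            cls._instance.send = None  # no connection registered yet
        return cls._instance

    def emit(self, chunk: str) -> None:
        # Do nothing when no frontend is attached.
        if self.send is not None:
            self.send(chunk)


# Top-level script: register the connection's send function once.
received = []
StreamSink().send = received.append  # stand-in for a real websocket's send

# Deep inside the client code: look up the same object and emit each chunk.
for piece in ["Hel", "lo"]:
    StreamSink().emit(piece)
```

One caveat with this design is that the holder is shared by the whole process, so two concurrent conversations would write to the same connection unless the sink is keyed per session.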

tyler-suard-parker avatar Jan 22 '24 17:01 tyler-suard-parker

I think we need to implement this in the framework for everyone else.

davorrunje avatar Jan 22 '24 19:01 davorrunje

@davorrunje I agree and I would be happy to do it, but my use case is very specific. I am running on Azure Web Apps and returning a stream to an Azure Teams App. Also I am not quite sure how to disable the streaming to a UI if it is not needed.

tyler-suard-parker avatar Jan 22 '24 19:01 tyler-suard-parker

Let's keep it open and I think I might take it soon. Any help would be appreciated though :)

davorrunje avatar Jan 22 '24 19:01 davorrunje

Hi, I have implemented streaming for my app; check out my repository. You can enable or disable it using the use_socket Boolean. Refer to the simple _chat.py. You have to pass the socket server instance as a callback. It still lacks some checks, but it is usable.

GokulrajKS avatar Jan 22 '24 19:01 GokulrajKS

I am currently trying to get it working with a React frontend. I already tried it with Panel, and it worked fine. Will let you know soon.

Risingabhi avatar Jan 24 '24 19:01 Risingabhi

@Risingabhi Let us know if there is any news.

ahernandezq avatar Jan 25 '24 19:01 ahernandezq

All, I finally bit the bullet and figured out how to do this. I instantiated a websocket at the top level and then passed it down as a parameter through every function between the top level and the _completions_create function in oai/client.py. It works!
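
A sketch of that parameter-threading approach, under stated assumptions: the three functions below are hypothetical stand-ins for the real call chain, and `send` stands in for the websocket's send method. Defaulting `send` to `None` also gives a simple way to switch streaming off when no UI is attached.

```python
from typing import Callable, Iterable, Optional

Send = Callable[[str], None]


def create_completion(chunks: Iterable[str], send: Optional[Send] = None) -> str:
    """Stand-in for the lowest-level call that receives model output in chunks."""
    parts = []
    for chunk in chunks:
        if send is not None:
            send(chunk)  # stream each piece as soon as it arrives
        parts.append(chunk)
    return "".join(parts)


def generate_reply(chunks: Iterable[str], send: Optional[Send] = None) -> str:
    """Intermediate layer: does nothing with `send` except pass it down."""
    return create_completion(chunks, send=send)


def handle_request(chunks: Iterable[str], send: Optional[Send] = None) -> str:
    """Top level: the place where the websocket would be created."""
    return generate_reply(chunks, send=send)
```

Calling `handle_request(chunks, send=ws_send)` streams every chunk while still returning the full answer; calling it without `send` behaves like the non-streaming path.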

tyler-suard-parker avatar Jan 25 '24 19:01 tyler-suard-parker

Awesome, thanks for the update!

Risingabhi avatar Jan 25 '24 21:01 Risingabhi