feat: WebSocket connection management and sandbox bound to session.
This PR includes:
- The FE supports reconnecting the WS after the page is closed or refreshed.
- Adds /auth to issue a JWT token so the server can identify the client; for now it mainly carries the session.
- The server no longer restarts the sandbox every time a session is initialized; it reuses the previous container based on the session id.
I had a bit of trouble using this. I refreshed the page mid-task, and it caused the server to crash with:
AgentFinishAction(action='finish')
ERROR: Exception in ASGI application
Traceback (most recent call last):
File "/home/rbren/.local/share/virtualenvs/opendevin-0fQgowZe/lib/python3.10/site-packages/uvicorn/protocols/websockets/websockets_impl.py", line 240, in run_asgi
result = await self.app(self.scope, self.asgi_receive, self.asgi_send)
File "/home/rbren/.local/share/virtualenvs/opendevin-0fQgowZe/lib/python3.10/site-packages/uvicorn/middleware/proxy_headers.py", line 69, in __call__
return await self.app(scope, receive, send)
File "/home/rbren/.local/share/virtualenvs/opendevin-0fQgowZe/lib/python3.10/site-packages/fastapi/applications.py", line 1054, in __call__
await super().__call__(scope, receive, send)
File "/home/rbren/.local/share/virtualenvs/opendevin-0fQgowZe/lib/python3.10/site-packages/starlette/applications.py", line 123, in __call__
await self.middleware_stack(scope, receive, send)
File "/home/rbren/.local/share/virtualenvs/opendevin-0fQgowZe/lib/python3.10/site-packages/starlette/middleware/errors.py", line 151, in __call__
await self.app(scope, receive, send)
File "/home/rbren/.local/share/virtualenvs/opendevin-0fQgowZe/lib/python3.10/site-packages/starlette/middleware/cors.py", line 75, in __call__
await self.app(scope, receive, send)
File "/home/rbren/.local/share/virtualenvs/opendevin-0fQgowZe/lib/python3.10/site-packages/starlette/middleware/exceptions.py", line 62, in __call__
await wrap_app_handling_exceptions(self.app, conn)(scope, receive, send)
File "/home/rbren/.local/share/virtualenvs/opendevin-0fQgowZe/lib/python3.10/site-packages/starlette/_exception_handler.py", line 64, in wrapped_app
raise exc
File "/home/rbren/.local/share/virtualenvs/opendevin-0fQgowZe/lib/python3.10/site-packages/starlette/_exception_handler.py", line 53, in wrapped_app
await app(scope, receive, sender)
File "/home/rbren/.local/share/virtualenvs/opendevin-0fQgowZe/lib/python3.10/site-packages/starlette/routing.py", line 758, in __call__
await self.middleware_stack(scope, receive, send)
File "/home/rbren/.local/share/virtualenvs/opendevin-0fQgowZe/lib/python3.10/site-packages/starlette/routing.py", line 778, in app
await route.handle(scope, receive, send)
File "/home/rbren/.local/share/virtualenvs/opendevin-0fQgowZe/lib/python3.10/site-packages/starlette/routing.py", line 375, in handle
await self.app(scope, receive, send)
File "/home/rbren/.local/share/virtualenvs/opendevin-0fQgowZe/lib/python3.10/site-packages/starlette/routing.py", line 98, in app
await wrap_app_handling_exceptions(app, session)(scope, receive, send)
File "/home/rbren/.local/share/virtualenvs/opendevin-0fQgowZe/lib/python3.10/site-packages/starlette/_exception_handler.py", line 64, in wrapped_app
raise exc
File "/home/rbren/.local/share/virtualenvs/opendevin-0fQgowZe/lib/python3.10/site-packages/starlette/_exception_handler.py", line 53, in wrapped_app
await app(scope, receive, sender)
File "/home/rbren/.local/share/virtualenvs/opendevin-0fQgowZe/lib/python3.10/site-packages/starlette/routing.py", line 96, in app
await func(session)
File "/home/rbren/.local/share/virtualenvs/opendevin-0fQgowZe/lib/python3.10/site-packages/fastapi/routing.py", line 348, in app
await dependant.call(**values)
File "/home/rbren/git/opendevin/opendevin/server/listen.py", line 35, in websocket_endpoint
await session.start_listening()
File "/home/rbren/git/opendevin/opendevin/server/session.py", line 107, in start_listening
data = await self.websocket.receive_json()
File "/home/rbren/.local/share/virtualenvs/opendevin-0fQgowZe/lib/python3.10/site-packages/starlette/websockets.py", line 135, in receive_json
raise RuntimeError(
RuntimeError: WebSocket is not connected. Need to call "accept" first.
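The traceback ends in `receive_json()` being called on a socket that is no longer connected. A minimal sketch of a receive loop that treats this as an ordinary disconnect instead of letting it crash the handler. `FakeWebSocket` and `listen` are hypothetical stand-ins so the snippet runs without Starlette; in the real handler one would presumably also catch `starlette.websockets.WebSocketDisconnect`.

```python
import asyncio


class FakeWebSocket:
    """Hypothetical stand-in for a Starlette WebSocket (illustration only)."""

    def __init__(self, messages):
        self._messages = list(messages)

    async def receive_json(self):
        if not self._messages:
            # Mirrors the error shown in the traceback above.
            raise RuntimeError('WebSocket is not connected. Need to call "accept" first.')
        return self._messages.pop(0)


async def listen(websocket, handle):
    """Read messages until the client goes away; return how many were handled."""
    handled = 0
    while True:
        try:
            data = await websocket.receive_json()
        except RuntimeError:
            # Client refreshed or closed the page: stop listening, keep the server up.
            break
        handle(data)
        handled += 1
    return handled


received = []
count = asyncio.run(
    listen(FakeWebSocket([{"action": "start"}, {"action": "chat"}]), received.append)
)
```

The point is only that the loop exits quietly; whether the agent task keeps running afterwards is a separate decision.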
Here's the behavior I'd hope for:
- If at any point, I refresh the page, all the state pops back into place. E.g. my message history is all there, the command-line state, etc. The current task is still running and outputting messages.
  - do we just have the server send the entire history as messages?
  - if we do, how do we stop them all from printing out slowly (e.g. do we override the typewriter functionality?)
- If the websocket disconnects, e.g. due to a bad internet connection, it picks back up seamlessly
  - this might be hard: the server would need to keep track of which items in the history had been sent successfully
- As a more near-term goal, losing the history but seeing the task still in-progress would be nice.
Okay, for these goals I need to decouple WS connection management from the Agent controller.
- Every time the agent sends a message to the FE, it gets the latest WS connection and saves the message to the message stack; if the send succeeds, the message is marked as delivered.
- The FE also saves messages to its own message stack. If the page is refreshed, the FE can send the latest message id to the BE to get all messages after it.
like this
wdyt?
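A rough sketch of the message-stack idea, assuming dense 1-based message ids. `Message`, `MessageStack`, and their methods are hypothetical names for illustration, not code from this PR.

```python
from dataclasses import dataclass, field


@dataclass
class Message:
    id: int
    payload: dict
    delivered: bool = False


@dataclass
class MessageStack:
    """Hypothetical per-session message log (names are illustrative)."""

    messages: list = field(default_factory=list)

    def append(self, payload: dict) -> Message:
        msg = Message(id=len(self.messages) + 1, payload=payload)
        self.messages.append(msg)
        return msg

    def mark_delivered(self, msg_id: int) -> None:
        # Called once the send over the current WS connection succeeded.
        self.messages[msg_id - 1].delivered = True

    def after(self, last_seen_id: int) -> list:
        """Messages the client has not seen yet (ids are 1-based and dense)."""
        return self.messages[last_seen_id:]


stack = MessageStack()
for text in ("plan", "run ls", "done"):
    stack.append({"message": text})
stack.mark_delivered(1)

# After a refresh the FE reports the last id it stored; the BE replays the rest.
replay = [m.payload["message"] for m in stack.after(1)]
```

Sending `last_seen_id = 0` would replay the whole history, which covers the case where the FE lost its local store.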
BTW, I think the Agent needs to be stopped if the client stays disconnected for a specified period of time, and resumed when the client reconnects?
This seems like a good feature, but maybe a follow-on. We don't have the ability to pause and resume the agent controller loop just yet
Overall plan looks great to me though!
https://github.com/OpenDevin/OpenDevin/assets/16201837/48b3bfc2-5b4f-4d55-986f-58de7b997758
The screen recording is above. Changes include:
- Abstracted a session manager and an agent manager out of session.py.
- Cache sessions and messages locally (./cache) when the server quits.
- The terminal in the FE gets its messages from the store.
- Refined the socket module to auto-reconnect.
- Added a warning to the user, letting the user decide whether to load the previous session. (This can be improved in the future for multi-projects/panels.)
There is an obvious problem with asyncio. I can't solve it because I'm not very familiar with asyncio. The error is below:
^CReceived signal 2, exiting...
ERROR: Traceback (most recent call last):
File "/opt/homebrew/Caskroom/miniconda/base/envs/OpenDevin/lib/python3.12/asyncio/runners.py", line 194, in run
return runner.run(main)
^^^^^^^^^^^^^^^^
File "/opt/homebrew/Caskroom/miniconda/base/envs/OpenDevin/lib/python3.12/asyncio/runners.py", line 118, in run
return self._loop.run_until_complete(task)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "uvloop/loop.pyx", line 1511, in uvloop.loop.Loop.run_until_complete
File "uvloop/loop.pyx", line 1504, in uvloop.loop.Loop.run_until_complete
File "uvloop/loop.pyx", line 1377, in uvloop.loop.Loop.run_forever
File "uvloop/loop.pyx", line 555, in uvloop.loop.Loop._run
File "uvloop/handles/poll.pyx", line 216, in uvloop.loop.__on_uvpoll_event
File "uvloop/cbhandles.pyx", line 83, in uvloop.loop.Handle._run
File "uvloop/cbhandles.pyx", line 66, in uvloop.loop.Handle._run
File "uvloop/loop.pyx", line 397, in uvloop.loop.Loop._read_from_self
File "uvloop/loop.pyx", line 402, in uvloop.loop.Loop._invoke_signals
File "uvloop/loop.pyx", line 377, in uvloop.loop.Loop._ceval_process_signals
File "/Users/ifuryst/projects/ai/OpenDevin/opendevin/server/session/manager.py", line 45, in handle_signal
exit(0)
File "<frozen _sitebuiltins>", line 26, in __call__
SystemExit: 0
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/opt/homebrew/Caskroom/miniconda/base/envs/OpenDevin/lib/python3.12/site-packages/starlette/routing.py", line 743, in lifespan
await receive()
File "/opt/homebrew/Caskroom/miniconda/base/envs/OpenDevin/lib/python3.12/site-packages/uvicorn/lifespan/on.py", line 137, in receive
return await self.receive_queue.get()
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/homebrew/Caskroom/miniconda/base/envs/OpenDevin/lib/python3.12/asyncio/queues.py", line 158, in get
await getter
asyncio.exceptions.CancelledError
Just start the server and press Ctrl+C to trigger it. It doesn't seem fatal, just annoying.
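The traceback suggests `handle_signal` calls `exit(0)` while the event loop is processing the signal, so `SystemExit` is raised inside the loop and the lifespan task gets cancelled. One possible way around it, sketched with plain asyncio: the handler only sets a flag, and the cleanup runs in normal task context. `run_server`, `main`, and `save_cache` are hypothetical names. uvicorn manages its own signal handling, so in the real server a FastAPI shutdown hook may be the more natural place for the cleanup; that is an assumption, not something verified against this branch.

```python
import asyncio
import signal


async def run_server(stop: asyncio.Event, save_cache) -> str:
    """Stand-in for the server's main task: wait for a stop request, then clean up."""
    await stop.wait()
    save_cache()  # e.g. persist sessions and messages before exiting
    return "stopped cleanly"


async def main(save_cache) -> str:
    loop = asyncio.get_running_loop()
    stop = asyncio.Event()
    try:
        # The handler only sets a flag; it never raises SystemExit inside the loop.
        loop.add_signal_handler(signal.SIGINT, stop.set)
    except (NotImplementedError, RuntimeError):
        pass  # unavailable on some platforms (e.g. Windows) or outside the main thread
    # Simulate Ctrl+C shortly after startup so the sketch terminates on its own.
    loop.call_later(0.01, stop.set)
    return await run_server(stop, save_cache)


saved = []
result = asyncio.run(main(lambda: saved.append("cache written")))
```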
This is looking awesome!
Sorry for all the merge conflicts 😬 will try and get this one in once it's rebased
Okay, let me resolve the conflicts.
On a fresh install, I'm seeing empty initialize events sent from the FE. This is causing the server to crash with:
File "/home/rbren/git/opendevin/opendevin/server/session/manager.py", line 37, in loop_recv
await self._sessions[sid].loop_recv(dispatch)
File "/home/rbren/git/opendevin/opendevin/server/session/session.py", line 33, in loop_recv
await dispatch(action, data)
File "/home/rbren/git/opendevin/opendevin/server/agent/manager.py", line 74, in dispatch
await self.create_controller(data)
File "/home/rbren/git/opendevin/opendevin/server/agent/manager.py", line 116, in create_controller
os.makedirs(directory)
File "<frozen os>", line 225, in makedirs
FileNotFoundError: [Errno 2] No such file or directory: ''
Are you able to repro? running localStorage.clear() might help
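For reference, `os.makedirs('')` raises exactly this `FileNotFoundError`, so an empty directory in the initialize event is enough to trigger it. A minimal sketch of a guard, assuming a fallback directory is acceptable; `ensure_workspace` and its `default` parameter are hypothetical and not part of this PR.

```python
import os
import tempfile


def ensure_workspace(directory: str, default: str) -> str:
    """Create and return the workspace directory.

    Falls back to `default` when the client sends an empty value.
    """
    target = directory or default
    # exist_ok=True also avoids an error when the directory is already there.
    os.makedirs(target, exist_ok=True)
    return target


fallback = os.path.join(tempfile.mkdtemp(), "workspace")
created = ensure_workspace("", fallback)
```

Rejecting the empty event on the server, or not sending it from the FE, would be equally valid; this only shows the defensive variant.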
I reproduced it and I'm fixing it.
It's fixed.
Sorry, looks like another rough merge 😬
Will solve it later 😶‍🌫️
Caught up with the main branch.
Testing now!
Tested this out. There are some edge cases, but reconnection works perfectly if I turn my wifi off mid-session.
This is an awesome improvement. Let's get it in!
This may have introduced a regression by falling back to the "gpt-3.5-turbo-1106" model (if one is not given in the UI). The regression would be that the backend no longer respects the parameters in config.toml that allow Ollama to work:
Error condensing thoughts: No healthy deployment available, passed model=gpt-3.5-turbo-1106
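A sketch of the precedence that would avoid this, assuming the intended order is UI value, then config.toml, then the hard-coded default. `resolve_model`, the `LLM_MODEL` key, and the `ollama/llama2` value are illustrative assumptions, not names taken from the codebase.

```python
DEFAULT_MODEL = "gpt-3.5-turbo-1106"  # the fallback named in the report above


def resolve_model(ui_model, config):
    """Pick the model: UI value first, then the config file, then the default."""
    # "LLM_MODEL" is a hypothetical config key used only for this sketch.
    return ui_model or config.get("LLM_MODEL") or DEFAULT_MODEL


config = {"LLM_MODEL": "ollama/llama2"}  # hypothetical parsed config.toml

from_ui = resolve_model("gpt-4", config)
from_config = resolve_model(None, config)
from_default = resolve_model("", {})
```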
Thanks for filing an issue!