
Frontend/react-client does not resume session after backend connection loss

Open jbeckerdm opened this issue 1 year ago • 7 comments

Describe the bug We use a chainlit setup with chat_persistence and on_chat_resume. If we start a chat with one message, then cause a websocket disconnect (for example, by restarting the backend), and then write another message once the client has reconnected, the two messages are persisted in two different sessions. This can be reproduced with the integrated chainlit frontend.

To Reproduce Steps to reproduce the behavior:

  1. Start a chainlit backend on version ^1.0.301
  2. Open the frontend on localhost:8000
  3. Start a new chat by writing a message, say "message 1"
  4. Stop and restart the backend process
  5. Switch back to the frontend and wait for the "Could not reach the server" message to disappear.
  6. Write a new message, say "message 2"
  7. Reload the page and notice how there are two chats with one message each.

Expected behavior There should be one persisted chat session with both messages in it.

Screenshots Screenshot 2024-03-19 at 15 35 24

Screenshot 2024-03-19 at 15 35 32

Desktop (please complete the following information):

  • OS: MacOS
  • Browser: Chrome
  • Version: 122

jbeckerdm avatar Mar 19 '24 14:03 jbeckerdm

I face this error too. This happens frequently for Cloud-hosted servers with auto-scaling.

I've traced the root cause to this:

  • When the server restarts, the frontend requests a new websocket session with the X-Chainlit-Thread-Id header (code linked below). But this header is empty because idToResume is never set for new threads: https://github.com/Chainlit/chainlit/blob/d9fa5ba50eae0f20ce9d1a140998f9a9fb2ec92f/libs/react-client/src/useChatSession.ts#L90 idToResume is only set when a thread is resumed: https://github.com/Chainlit/chainlit/blob/d9fa5ba50eae0f20ce9d1a140998f9a9fb2ec92f/frontend/src/pages/ResumeButton.tsx#L29
  • Because of this, the backend creates a new thread and ignores the existing messages completely (the old thread is never loaded).
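To make the root cause concrete, here is a minimal sketch of the behavior described above. `currentThreadHeader` mirrors how the header value is derived from idToResume alone; `resolveThread` and `createThread` are hypothetical stand-ins for the backend, not Chainlit's actual code.

```typescript
// Mirrors how the header value is derived today: only idToResume is consulted.
function currentThreadHeader(idToResume?: string): string {
  return idToResume || '';
}

// Assumption: a simplified stand-in for the backend, not Chainlit's real logic.
// An empty header is treated as "nothing to resume", so a new thread is created.
function resolveThread(header: string, createThread: () => string): string {
  return header ? header : createThread();
}

// Usage: a brand-new chat never sets idToResume, so both connections send ''.
let counter = 0;
const createThread = (): string => `thread-${++counter}`;
const beforeRestart = resolveThread(currentThreadHeader(undefined), createThread);
const afterRestart = resolveThread(currentThreadHeader(undefined), createThread);
console.log(beforeRestart, afterRestart); // two different thread ids
```

With an empty header on both connections, the stand-in backend has no way to tell that the second connection belongs to the first chat.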

Possible solutions:

  • To fix this, I think we need to set idToResume for new threads after the first user interaction. Building on this PR https://github.com/Chainlit/chainlit/pull/923, idToResume should be set here along with currentThreadId: https://github.com/Chainlit/chainlit/blob/d9fa5ba50eae0f20ce9d1a140998f9a9fb2ec92f/libs/react-client/src/useChatSession.ts#L156
  • Alternatively, the code in useChatSession.ts can be updated to: 'X-Chainlit-Thread-Id': idToResume || currentThreadId || '',
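The second option can be written as a small pure function, which makes the precedence easy to check. This is only a sketch: `resolveThreadHeader` is a hypothetical helper name, and the computed value would still have to end up in the options the socket uses when it reconnects.

```typescript
// Hypothetical helper: prefer an explicitly resumed thread, then the thread
// created in the current session, and fall back to an empty header.
function resolveThreadHeader(idToResume?: string, currentThreadId?: string): string {
  return idToResume || currentThreadId || '';
}

// A new chat that has already received its thread id from the server:
console.log(resolveThreadHeader(undefined, 'thread-1'));
```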

@tpatel, what do you think? I can open a new PR for this fix.

qtangs avatar May 07 '24 04:05 qtangs

@qtangs, my analysis of the problem found the same root cause. However, I don't think your suggested solutions will work. The problem is that the X-Chainlit-Thread-Id value is given to the socket.io session when the session is created and cannot be updated later. So even if you use setIdToResume(threadId!); the session will still have the same values for the headers. When the server restarts, socket.io will retry the connection using the existing headers.

A possible solution would be to listen for the reconnect or reconnect_attempt event of socket.io and create a new session if idToResume is not set.
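A variation on this idea, sketched below, refreshes the header just before each retry instead of creating a new session. It relies on the socket.io-client Manager (`socket.io`) emitting `reconnect_attempt` and exposing its options as `opts`. `ManagerLike`, `refreshThreadHeaderOnReconnect` and `FakeManager` are illustrative names, not part of Chainlit or socket.io, and the fake only exists so the sketch runs on its own.

```typescript
// Minimal structural type standing in for the socket.io-client Manager.
interface ManagerLike {
  opts: { extraHeaders?: Record<string, string> };
  on(event: 'reconnect_attempt', listener: (attempt: number) => void): void;
}

// Before every reconnect attempt, copy the latest known thread id into the
// headers that will be used for the next connection.
function refreshThreadHeaderOnReconnect(
  manager: ManagerLike,
  getThreadId: () => string | undefined
): void {
  manager.on('reconnect_attempt', () => {
    const threadId = getThreadId();
    if (!threadId) return; // nothing to resume yet
    const headers = manager.opts.extraHeaders ?? {};
    headers['X-Chainlit-Thread-Id'] = threadId;
    manager.opts.extraHeaders = headers;
  });
}

// In-memory fake used only to exercise the sketch without a real server.
class FakeManager implements ManagerLike {
  opts: { extraHeaders?: Record<string, string> } = {
    extraHeaders: { 'X-Chainlit-Thread-Id': '' }
  };
  private listeners: Array<(attempt: number) => void> = [];

  on(_event: 'reconnect_attempt', listener: (attempt: number) => void): void {
    this.listeners.push(listener);
  }

  simulateReconnectAttempt(attempt = 1): void {
    this.listeners.forEach((listener) => listener(attempt));
  }
}
```

In the real client the `manager` argument would be `session.socket.io`, and `getThreadId` would read the current thread id from wherever the app keeps it.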

qvalentin avatar May 14 '24 07:05 qvalentin

@qvalentin you're right. Using currentThreadId like that won't work as the initial value is undefined and the connection headers are not updated after the initial socket initialization.

I added the following after useChatSession.ts#L68, tested it, and verified that it works:

  const idToResume = useRecoilValue(threadIdToResumeState);
  const setCurrentThreadId = useSetRecoilState(currentThreadIdState);

  // Use currentThreadId as thread id in websocket header if it's set
  const currentThreadId = useRecoilValue(currentThreadIdState);

  useEffect(() => {
    if (session?.socket && currentThreadId) {
      session.socket.io.opts.extraHeaders!['X-Chainlit-Thread-Id'] = currentThreadId;
    }
  }, [currentThreadId]);
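The body of that effect can also be pulled into a plain function, which makes it testable without React and avoids the non-null assertion on extraHeaders. This is a sketch under assumptions: `SessionLike` and `syncThreadHeader` are hypothetical names, and the type only models the fields the snippet above touches.

```typescript
// Hypothetical shape covering only the fields used by the effect above.
interface SessionLike {
  socket?: { io: { opts: { extraHeaders?: Record<string, string> } } };
}

// Copies the current thread id into the socket's connection headers.
// Returns true when the header was written, false when there was nothing to do.
function syncThreadHeader(
  session: SessionLike | undefined,
  currentThreadId?: string
): boolean {
  const opts = session?.socket?.io.opts;
  if (!opts || !currentThreadId) return false;
  // Create the header map if the socket was set up without extraHeaders.
  const headers = opts.extraHeaders ?? {};
  headers['X-Chainlit-Thread-Id'] = currentThreadId;
  opts.extraHeaders = headers;
  return true;
}
```

Inside the hook, the effect would then reduce to calling `syncThreadHeader(session, currentThreadId)`.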

qtangs avatar May 14 '24 11:05 qtangs

Great, will you open a PR?

qvalentin avatar May 14 '24 11:05 qvalentin

Yeah, I'm planning to do that when time permits. Will need to add some test cases too

qtangs avatar May 14 '24 13:05 qtangs

Please submit a PR!

willydouhard avatar May 14 '24 13:05 willydouhard

> Please submit a PR!

PR is created, pls review.

qtangs avatar May 15 '24 11:05 qtangs