Bug: undo causes unexpected paragraph deletion in collaborative editor
When using CollaborationPlugin and two users are editing the same paragraph, an undo from one user can cause Lexical and the Yjs doc to get out of sync, leading to the whole paragraph getting cleared on page refresh.
Lexical version: 0.17.1
In this screen recording, observe the behaviour after User 1 (left) clicks the undo button: it unexpectedly deletes a character from User 2's word, then after a refresh of the page the whole paragraph is cleared:
https://github.com/user-attachments/assets/3f631b41-99a2-4cd3-8402-2a9334e06877
When we replicate the issue in our own application we have detection code that shows Lexical editor state and the Yjs document are getting out of sync immediately after the user presses undo. Lexical thinks the second paragraph has the text node with User 2's text, while the Yjs doc thinks the second paragraph node is empty.
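For anyone wanting to add a similar check, here is a minimal sketch of the comparison step only. It assumes both sides have already been flattened to one string per paragraph; how you read those strings out of the Lexical editor state and the Yjs document depends on your setup and is not shown. `findDivergentParagraphs` and `Divergence` are hypothetical names for illustration, not part of Lexical or Yjs, and this is not our exact detection code.

```typescript
// Hypothetical helper (not a Lexical or Yjs API): compares two per-paragraph
// text snapshots and reports every index where they disagree.
type Divergence = {
  index: number;
  lexical: string | undefined; // undefined = paragraph missing on the Lexical side
  yjs: string | undefined; // undefined = paragraph missing on the Yjs side
};

function findDivergentParagraphs(
  lexicalParagraphs: readonly string[],
  yjsParagraphs: readonly string[],
): Divergence[] {
  const length = Math.max(lexicalParagraphs.length, yjsParagraphs.length);
  const divergences: Divergence[] = [];
  for (let index = 0; index < length; index++) {
    const lexical = lexicalParagraphs[index];
    const yjs = yjsParagraphs[index];
    if (lexical !== yjs) {
      divergences.push({ index, lexical, yjs });
    }
  }
  return divergences;
}

// Illustrative input shaped like the state described above: Lexical still has
// User 2's text in the second paragraph, while the Yjs side has it empty.
const report = findDivergentParagraphs(["", "Word"], ["", ""]);
console.log(report);
```

Running a check like this after every update (and logging the last command dispatched) is what let us pin the divergence to the moment undo is pressed rather than to the refresh.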
Steps To Reproduce
- Open two windows to https://playground.lexical.dev/?isCollab=true
- On the left, put the cursor at the start of an empty second paragraph. (The issue still presents in the first paragraph, but this demonstrates it's not unique to the start of the document.)
- On the left, type: This is a,<backspace>,a test.
- On the right, type: Word
- On the left, click the "Undo" button until all of "This is a test." is removed. Sometimes this happens after one undo, sometimes after multiple undos.
- Refresh the page
The current behavior
The main unexpected behaviour is that after the undo operation, a refresh of the page causes the whole paragraph to be cleared, including the text from User 2.
In the specific case captured in the screen recording, it was also unexpected that User 1's undo operation deleted the W of Word from User 2.
The expected behavior
User 1's undo operation should undo only the text they typed, and a refresh of the page after the undo should not lead to the paragraph content getting cleared.
Impact of fix
This bug can cause loss/corruption of data for all users of the Lexical collaboration plugin.
Note that although the data corruption behaviour is usually the deleted paragraph as shown in the screen recording, in some cases we were able to reproduce the type of duplicated and garbled text seen in https://github.com/facebook/lexical/pull/6523 and https://github.com/facebook/lexical/pull/6374. We can't reliably reproduce this now, but it required both User 1 and User 2 performing edits on the affected paragraph right after pressing undo but before refreshing the page. We know at that point Lexical and Yjs are out of sync, so some combination of edits leads to further corruption.
Hi @ivailop7 @etrepum @StyleT @fantactuka, including you here as I saw you have recently worked on and/or reviewed PRs related to the collaboration plugin, but apologies if this is not relevant to you.
Wanted to check if you had any pointers on how we could look further into this issue? We're keen to put some effort into debugging and fixing but it stretches our understanding of the Yjs document data structures, syncLexicalUpdateToYjs, the Yjs UndoManager, etc. Any guidance to help accelerate us would be much appreciated!
The first thing I would check is to see if this happens if shouldBootstrap={false}. It requires a bit more orchestration to bootstrap the document server-side, but there are just inherent race conditions if two clients both think they can bootstrap the document when it is empty. I do have a pretty good understanding of CRDTs and distributed systems in general but the lexical/yjs implementation details are not something I'm deeply familiar with because I haven't used any of it in an application.
Thanks @etrepum, we have shouldBootstrap={false} in our own application and we can reproduce the same behaviour. I'll try to get a repro using the Lexical playground, too.
Hi @etrepum, just following up with a repro in split-screen mode of the playground where, based on my understanding of the code, shouldBootstrap is true only for the left-hand-side client. Slightly different from getting the server to bootstrap, but I think this should be sufficient to avoid any race conditions caused by bootstrapping?
In this split-screen approach we just reload one of the windows instead of refreshing the whole page:
https://github.com/user-attachments/assets/9e1b9f5c-9296-4c71-9dd2-c20f3764c825
We've also started seeing document corruption after the user does an 'undo' event, and then the editor crashes following any input. I wonder if this is the same issue.
That's interesting @ivailop7 – never seen the editor crashing after the 'undo' event, but we have seen that further edits can lead to very garbled and duplicated text.
In case it's helpful for others to repro, I've added a test (PR) that runs through the steps in my screen recordings above.
Fair enough. Thanks for the PR!
Hi @ivailop7 @etrepum, checking in on this one – just wondering if you know if/when Lexical maintainers might be able to look more deeply into this bug? It'd be helpful to get any indication as we're aiming to roll out our collaborative editor for production usage, and this bug is introducing a doubt on readiness. Thanks in advance.
(We're also planning to dive fully into debugging, we just know for us it will require a bit of extra time to get up to speed on the Yjs data structures.)
In rare cases during collaborative editing, we see some very strange behavior like the one in the screenshot, where a line of text is duplicated many times.
Could it be related to this undo bug?
We cannot reproduce it, but we added extensive logging on the server and everything looks fine on the Yjs side.
It seems that the undo breaks the Lexical tree and, in some cases, if I type a new letter, it triggers client-side error 94.
This is because the splice function cannot find the node in the tree, and the fallback ends up duplicating some paragraph instead of adding the letter I typed.
@acemtp that's interesting to see! Does this happen with Lexical 0.17.1 and above?
We use 0.18.0.
The same bug that Vianney Lecroart described (duplicated text and text loss in neighbouring paragraphs) happens to me on version 0.19.0. I can reproduce it quite often.
https://github.com/facebook/lexical/pull/6670 got merged yesterday. You can install the nightly version to check if things get better.
Sadly, the problem reproduced again, this time in a scenario with a single user accessing the editor with collaboration active.
I will monitor the issue closely and try to collect more information.
@ebengtso that's unfortunate, thanks for confirming. It'd be great to hear more about the conditions / repro steps where you can trigger this problem. Is it only after using undo? And does the text look like that immediately after the undo command, or only after further edits following a problematic undo?
Hi @ebengtso, just wondering if you've managed to find any reliable repro steps for this bug?
We just saw an instance of Error #94: splice: could not find collab element node in our application ([email protected]) but we aren't sure of the steps that led to it.
fwiw #6670 was released with 0.20.0 so if you're still using 0.19.0 then it doesn't have that fix which is possibly relevant
Very relevant! Thanks. We just realised this – we thought we had already picked up that release. Will upgrade and see if we still observe any of these errors.
@ebengtso Just checking in again to see if you're still having the issue with content duplication/corruption, and if so whether you have a better understanding of what causes the issue? We're still seeing it on 0.24.0, and are very keen to find and fix the root cause.
I was able to reproduce the issue again. I'm on version 0.24.0
The bug happened after doing the following:
- Copy a text from another internet page
- Paste the text into the Lexical editor
- Edit the text, add headings, move things here and there
- Make the Lexical editor invisible. On my website, the Lexical editor is in one panel, and I can switch to another panel, which makes the Lexical editor invisible
- Go back to the Lexical editor
- The bug reproduces
This sequence does not reproduce the problem systematically.
I will upgrade to the latest lexical.dev and keep an eye on it.
I've encountered the same bug where text gets duplicated. It consistently involves multiple instances of the error Error #94: splice: could not find collab element node. On the newer versions (tested on both 0.29.0 and 0.30.0), I also receive a warning of Invalid access: Add Yjs type to a document before reading data.
@gustason Do you have a set of steps that reliably reproduce the issue?
@james-atticus After cloning the repo and running a clean version of the playground (version 0.30.0) in dev, if I set ?isCollab=true, just about any update to the editor state triggers:
Invalid access: Add Yjs type to a document before reading data.
I believe the editor/state are updating correctly (in my own project I am able to persist the editor state normally through a websocket, though I haven't been able to resolve the warning)