rrweb perf(rrweb): optimize random shuffled addList

ref: https://github.com/rrweb-io/rrweb/pull/1300

Sep 09 '23 15:09 wfk007

⚠️ No Changeset found

Latest commit: b002f0e6431557e43f08bdcc48bd6a320682aa64

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

This PR includes no changesets

When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types

Click here to learn what changesets are, and how to add one.

Click here if you're a maintainer who wants to add a changeset to this PR

Sep 09 '23 15:09 changeset-bot[bot]

@wfk007 it unfortunately looks like there are some performance impacts for this change in certain situations. I recently encountered a slow mutation with an old version of the jstree library. Here's a plunker that demonstrates the issue. There's 2 rrweb builds in there, one with the "shuffle fix" and one without.

Here's the profile for the "without" build:

Here's the profile for the "with" build:

Looks like this particular mutation spends a lot more time in that while(addList) loop with the fix applied, but I'm not sure why.

Sidenote: by applying a couple more fixes that are in PR limbo, I got the processing time for the mutation in the plunker down to just over 100ms. Some of the proprietary recording solutions out there can handle it in 50ms...

Oct 27 '23 19:10 mdellanoce

@mdellanoce I considered that this PR was not the optimal solution, and the implementation was not very elegant, so I changed it to a draft. When I am not busy later, I can also take a look at this case. Thanks for your feedback.

Oct 30 '23 10:10 wfk007

another possibility for a fix:

diff --git a/packages/rrweb/src/record/mutation.ts b/packages/rrweb/src/record/mutation.ts
index 097d1a8..cdceaf5 100644
--- a/packages/rrweb/src/record/mutation.ts
+++ b/packages/rrweb/src/record/mutation.ts
@@ -425,7 +425,13 @@ export default class MutationBuffer {
         }
         break;
       }
-      candidate = node.previous;
+      if (node.value.previousSibling && isNodeInLinkedList(node.value.previousSibling)) {
+        candidate = node.value.previousSibling.__ln;
+      } else if (node.value.parentNode && isNodeInLinkedList(node.value.parentNode)) {
+        candidate = node.value.parentNode.__ln;
+      } else {
+        candidate = node.previous;
+      }
       addList.removeNode(node.value);
       pushAdd(node.value);
     }

Tries a little harder to find nodes that might have a serialized parent/sibling. Fixes the shuffle issue and performs well for the jstree issue as well. Unfortunately, it does slow down the "create 1000 DOM nodes and append into its previous looped node" benchmark. Feels like the worst case just keeps changing depending on the fix. The original "defragment" change in #1300 is the most general purpose I've found so far, but obviously has the downside of slowing down the average case.

Oct 31 '23 16:10 mdellanoce

@mdellanoce I know why "without shuffle fix" is faster than "with".

The above Dom structure can be created in many ways.

Add div1 as a whole to body

const div1 = document.createElement('div');
div1.setAttribute('id', 'div1')
const div2 = document.createElement('div');
div2.setAttribute('id', 'div2')
        
div1.append(div2)
document.body.append(div1)

Add div1 to body and add div2 to div1 separately

const div1 = document.createElement('div');
div1.setAttribute('id', 'div1')
document.body.append(div1)

const div2 = document.createElement('div');
div2.setAttribute('id', 'div2')
div1.append(div2)

These two approach will generate different MutationObserver by browser

means: div add to body

[
  {
    target: body,
    addNodes: [div1],
    type: "childList"
  }
]

means: div1 add to body and div2 add to div1

[
  {
    target: body,
    addNodes: [div1],
    type: "childList"
  },
  {
    target: div1,
    addNodes: [div2],
    type: "childList"
  },
]

In order to comprehensively consider the above two cases, when looping addNodes, we will travel all its descendant children. It means that in the second case, we will read div2 twice(one in index===0 and another in index===1), and the right order will be broken in the second loop.

The order will directly affect the performance in the emit phase.

jstree generates the MutationObserver similar to the second case.

What is currently known: The order of nodes in addedSet is important and will affect the performance of the emit phase.

We have two thinking directions to optimize this problem:

Guarantee the order of addedSet nodes: This is difficult, and various boundary cases need to be considered
Ensure the performance of emit even if addedSet is not in the right order: It is possible to increase the average time consumption when optimizing the performance of boundary cases.

This is a question worth thinking about!

Nov 06 '23 12:11 wfk007

Ensure the performance of emit even if addedSet is not in the right order: It is possible to increase the average time consumption when optimizing the performance of boundary cases.

@wfk007 I've been (slowly) working at it from this angle. I'm trying to ensure linear time processing for added nodes on both the record and replay sides. I hope to have something to share soon.

Nov 06 '23 22:11 mdellanoce

i updated my PR #1300 with another potential solution

Nov 13 '23 19:11 mdellanoce

i updated #1300 with a benchmark for the 2nd style of mutation mentioned above

Dec 12 '23 18:12 mdellanoce