rrweb icon indicating copy to clipboard operation
rrweb copied to clipboard

perf(rrweb): optimize random shuffled addList

Open wfk007 opened this issue 2 years ago • 9 comments

ref: https://github.com/rrweb-io/rrweb/pull/1300

wfk007 avatar Sep 09 '23 15:09 wfk007

⚠️ No Changeset found

Latest commit: b002f0e6431557e43f08bdcc48bd6a320682aa64

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

This PR includes no changesets

When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types

Click here to learn what changesets are, and how to add one.

Click here if you're a maintainer who wants to add a changeset to this PR

changeset-bot[bot] avatar Sep 09 '23 15:09 changeset-bot[bot]

@wfk007 it unfortunately looks like there are some performance impacts for this change in certain situations. I recently encountered a slow mutation with an old version of the jstree library. Here's a plunker that demonstrates the issue. There's 2 rrweb builds in there, one with the "shuffle fix" and one without.

Here's the profile for the "without" build: image

Here's the profile for the "with" build: image

Looks like this particular mutation spends a lot more time in that while(addList) loop with the fix applied, but I'm not sure why.

Sidenote: by applying a couple more fixes that are in PR limbo, I got the processing time for the mutation in the plunker down to just over 100ms. Some of the proprietary recording solutions out there can handle it in 50ms...

mdellanoce avatar Oct 27 '23 19:10 mdellanoce

@mdellanoce I considered that this PR was not the optimal solution, and the implementation was not very elegant, so I changed it to a draft. When I am not busy later, I can also take a look at this case. Thanks for your feedback.

wfk007 avatar Oct 30 '23 10:10 wfk007

another possibility for a fix:

diff --git a/packages/rrweb/src/record/mutation.ts b/packages/rrweb/src/record/mutation.ts
index 097d1a8..cdceaf5 100644
--- a/packages/rrweb/src/record/mutation.ts
+++ b/packages/rrweb/src/record/mutation.ts
@@ -425,7 +425,13 @@ export default class MutationBuffer {
         }
         break;
       }
-      candidate = node.previous;
+      if (node.value.previousSibling && isNodeInLinkedList(node.value.previousSibling)) {
+        candidate = node.value.previousSibling.__ln;
+      } else if (node.value.parentNode && isNodeInLinkedList(node.value.parentNode)) {
+        candidate = node.value.parentNode.__ln;
+      } else {
+        candidate = node.previous;
+      }
       addList.removeNode(node.value);
       pushAdd(node.value);
     }

Tries a little harder to find nodes that might have a serialized parent/sibling. Fixes the shuffle issue and performs well for the jstree issue as well. Unfortunately, it does slow down the "create 1000 DOM nodes and append into its previous looped node" benchmark. Feels like the worst case just keeps changing depending on the fix. The original "defragment" change in #1300 is the most general purpose I've found so far, but obviously has the downside of slowing down the average case.

mdellanoce avatar Oct 31 '23 16:10 mdellanoce

@mdellanoce I know why "without shuffle fix" is faster than "with".

image

The above Dom structure can be created in many ways.

  1. Add div1 as a whole to body
const div1 = document.createElement('div');
div1.setAttribute('id', 'div1')
const div2 = document.createElement('div');
div2.setAttribute('id', 'div2')
        
div1.append(div2)
document.body.append(div1)
  1. Add div1 to body and add div2 to div1 separately
const div1 = document.createElement('div');
div1.setAttribute('id', 'div1')
document.body.append(div1)

const div2 = document.createElement('div');
div2.setAttribute('id', 'div2')
div1.append(div2)

These two approach will generate different MutationObserver by browser

  1. means: div add to body
[
  {
    target: body,
    addNodes: [div1],
    type: "childList"
  }
]
  1. means: div1 add to body and div2 add to div1
[
  {
    target: body,
    addNodes: [div1],
    type: "childList"
  },
  {
    target: div1,
    addNodes: [div2],
    type: "childList"
  },
]

In order to comprehensively consider the above two cases, when looping addNodes, we will travel all its descendant children. It means that in the second case, we will read div2 twice(one in index===0 and another in index===1), and the right order will be broken in the second loop.

The order will directly affect the performance in the emit phase.

jstree generates the MutationObserver similar to the second case.

What is currently known: The order of nodes in addedSet is important and will affect the performance of the emit phase.

We have two thinking directions to optimize this problem:

  1. Guarantee the order of addedSet nodes: This is difficult, and various boundary cases need to be considered
  2. Ensure the performance of emit even if addedSet is not in the right order: It is possible to increase the average time consumption when optimizing the performance of boundary cases.

This is a question worth thinking about!

wfk007 avatar Nov 06 '23 12:11 wfk007

Ensure the performance of emit even if addedSet is not in the right order: It is possible to increase the average time consumption when optimizing the performance of boundary cases.

@wfk007 I've been (slowly) working at it from this angle. I'm trying to ensure linear time processing for added nodes on both the record and replay sides. I hope to have something to share soon.

mdellanoce avatar Nov 06 '23 22:11 mdellanoce

i updated my PR #1300 with another potential solution

mdellanoce avatar Nov 13 '23 19:11 mdellanoce

i updated #1300 with a benchmark for the 2nd style of mutation mentioned above

mdellanoce avatar Dec 12 '23 18:12 mdellanoce