
feat(labs/ssr): use RegExp.exec to escape HTML

Open 43081j opened this issue 2 months ago • 3 comments

Switches from using replace(pattern, fn) to using the exec method of a RegExp instead.

This is a draft until someone can decide if it's a sensible thing to do 😬

Basically, I noticed that one of the slowdowns compared to other libraries in SSR was this function. It seems replace(pattern, fn) is much slower than manually iterating over matches.

I've tried 3 different algorithms and loosely benchmarked them.

Algorithm 1 (current)

  1. Return str.replace(pattern, fn) where fn returns the escaped replacement for each matched character
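
For reference, a minimal sketch of what this looks like. The replacement table, the character set (`<`, `>`, `&`, `"`, `'`) and the name `escapeWithReplace` are assumptions for illustration, not necessarily what the package ships:

```ts
// Assumed replacement table; the exact set of escaped characters is not
// listed in this description.
const replaceTable: Record<string, string> = {
  '<': '&lt;',
  '>': '&gt;',
  '&': '&amp;',
  '"': '&quot;',
  "'": '&#39;',
};

// Hypothetical name. One call to replace(), with a callback invoked per match.
function escapeWithReplace(str: string): string {
  return str.replace(/[<>&"']/g, (char) => replaceTable[char] ?? char);
}
```

Every match goes through the callback, and the string is handed to `replace` even when it contains nothing to escape.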

Algorithm 2 (this PR)

  1. Match against the string with pattern.exec(str)
  2. If no match, return early (no special chars anywhere)
  3. Otherwise, consume the preceding text and the escaped character
  4. exec the pattern again and repeat this process until there are no matches
  5. If the last match wasn't the end of the string, consume the remaining text of the string
  6. Return the resulting new string
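
The steps above could be sketched roughly like this. The table, the character set and the name `escapeWithExec` are assumptions for illustration; the actual diff may differ in its details:

```ts
// Assumed replacement table (same assumption as for the current algorithm).
const execTable: Record<string, string> = {
  '<': '&lt;',
  '>': '&gt;',
  '&': '&amp;',
  '"': '&quot;',
  "'": '&#39;',
};

// Hypothetical name. Walks the matches of a global RegExp by hand.
function escapeWithExec(str: string): string {
  // A fresh global regex per call, so its lastIndex always starts at 0.
  const pattern = /[<>&"']/g;

  let match = pattern.exec(str);
  if (match === null) {
    return str; // early return: nothing to escape anywhere
  }

  let result = '';
  let lastIndex = 0;
  while (match !== null) {
    // Consume the text before the match, then the escaped character.
    result += str.slice(lastIndex, match.index) + (execTable[match[0]] ?? match[0]);
    lastIndex = match.index + 1;
    // exec resumes from pattern.lastIndex because of the g flag.
    match = pattern.exec(str);
  }

  // Consume whatever follows the last match.
  if (lastIndex < str.length) {
    result += str.slice(lastIndex);
  }
  return result;
}
```

Once `exec` returns null the loop ends, so a long tail with no special characters is copied with a single `slice`.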

Algorithm 3 (React)

  1. Match against the string with pattern.exec(str)
  2. If no match, return early (no special chars anywhere)
  3. Otherwise, consume the preceding text and the escaped character
  4. Iterate through the remaining characters until we reach another special character (not using regex)
  5. Repeat from step 3
  6. Return the resulting new string once we have reached the end
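
A sketch in the style of this approach, not a copy of React's source. The name `escapeWithScan` and the exact entities chosen are assumptions:

```ts
// Hypothetical name. One regex test up front, then a manual character scan.
function escapeWithScan(str: string): string {
  // Non-global regex: only used to find the first special character.
  const first = /[<>&"']/.exec(str);
  if (first === null) {
    return str; // early return: nothing to escape anywhere
  }

  let result = '';
  let lastIndex = 0;
  // From the first match onwards, inspect every character by code unit.
  for (let i = first.index; i < str.length; i++) {
    let escaped: string;
    switch (str.charCodeAt(i)) {
      case 34: escaped = '&quot;'; break; // "
      case 38: escaped = '&amp;'; break;  // &
      case 39: escaped = '&#39;'; break;  // '
      case 60: escaped = '&lt;'; break;   // <
      case 62: escaped = '&gt;'; break;   // >
      default: continue;                  // ordinary character, keep scanning
    }
    // Consume the text before this character, then its escaped form.
    result += str.slice(lastIndex, i) + escaped;
    lastIndex = i + 1;
  }

  // Consume whatever follows the last special character.
  return result + str.slice(lastIndex);
}
```

The loop always runs to the end of the string, even when no special characters remain after the first one.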

Performance

I loosely benchmarked it using 3 strings:

  • "foo <div> bar bar </div> baz"
  • "foo bar baz"
  • "foo <div> some very long string which does not contain any more special chars ... "
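
A loose timing loop along these lines is enough to reproduce the comparison. This is only a sketch: `timeEscape`, `baselineEscape` and the iteration count are hypothetical, and it is not the harness used for the observations below:

```ts
// The three inputs listed above; the third is kept truncated as written.
const inputs: string[] = [
  'foo <div> bar bar </div> baz',
  'foo bar baz',
  'foo <div> some very long string which does not contain any more special chars ... ',
];

// Hypothetical helper: runs `fn` over every input `iterations` times and
// returns the elapsed wall-clock time in milliseconds (Date.now() resolution).
function timeEscape(fn: (s: string) => string, iterations = 100000): number {
  let sink = 0; // accumulate output lengths so the calls have an observable effect
  const start = Date.now();
  for (let i = 0; i < iterations; i++) {
    for (const input of inputs) {
      sink += fn(input).length;
    }
  }
  const elapsed = Date.now() - start;
  if (sink < 0) throw new Error('unreachable'); // keeps `sink` in use
  return elapsed;
}

// Assumed baseline in the style of Algorithm 1, escaping only < > & here.
const baselineTable: Record<string, string> = {'<': '&lt;', '>': '&gt;', '&': '&amp;'};
const baselineEscape = (s: string): string =>
  s.replace(/[<>&]/g, (c) => baselineTable[c] ?? c);
```

Calling `timeEscape(baselineEscape)` and then the same with each alternative implementation gives a rough relative ordering; absolute numbers will vary with engine and machine.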

Algorithm 3:

  • Particularly terrible with the 3rd string, since it has to iterate through all the remaining characters for no reason, whereas algo 2 would have stopped long ago (since exec would have returned null)
  • Fastest on shorter strings which contain many special chars

Algorithm 2:

  • Fastest overall

Algorithm 1:

  • Slowest overall

The early return probably helps a lot since many strings do not contain special chars.

43081j · Apr 17 '24 23:04