svelte
svelte copied to clipboard
Slow parsing when blanks are continuous
Describe the bug
If there are many consecutive blanks, it will be very slow to parse.
I run the following code and found that it took 9000ms.
import { parse } from "svelte/compiler";
const fixture = `<div>${" ".repeat(100000)}</div>`;
const start = Date.now();
parse(fixture);
console.log(Date.now() - start);
Reproduction
- Open
https://stackblitz.com/edit/node-4tso3q?file=index.js - Run
npm test
Running the following code that uses the svelte parser takes 9000ms.
import { parse } from "svelte/compiler";
const fixture = `<div>${" ".repeat(100000)}</div>`;
const start = Date.now();
parse(fixture);
console.log(Date.now() - start);
However, the code for the following template, which has more characters than before, finishes in 50ms.
import { parse } from "svelte/compiler";
const fixture = `<div>${" A".repeat(100000)}</div>`;
const start = Date.now();
parse(fixture);
console.log(Date.now() - start);
I think most of the time for template parses is spent on the next regex.
https://github.com/sveltejs/svelte/blob/ccbf10c16326cffb1b38e721e2f74aad4b85dbf2/src/compiler/parse/index.ts#L37
Logs
No response
System Info
System:
OS: macOS 12.4
CPU: (4) x64 Intel(R) Core(TM) i5-6360U CPU @ 2.00GHz
Memory: 50.58 MB / 8.00 GB
Shell: 5.8.1 - /bin/zsh
Binaries:
Node: 18.4.0 - /usr/local/bin/node
Yarn: 1.22.11 - /usr/local/bin/yarn
npm: 8.12.1 - /usr/local/bin/npm
Browsers:
Chrome: 103.0.5060.114
Firefox: 79.0
Safari: 15.5
npmPackages:
svelte: ^3.46.1 => 3.48.0
Severity
annoyance
Looks like this might be an issue with catastrophic backtracking, where the regex engine tries to backtrack every single one of those 100k spaces to see if it might match the rest of the regex. So that instead of following all 100k spaces, seeing that they don't match the end of the line (because they have a </div>
after them) and backing off the whole 100k, it then tries 100k times to find an end-of-line marker.
I can think of a couple ways to work around the catastrophic backtracking in this case. First, if template
was expected to be a single-line string, trimEnd would likely be more performant as it wouldn't backtrack at all. String.trimEnd is not supported on IE, so the code might need to become something like:
if ('trimEnd' in String.prototype) {
this.template = template.trimEnd();
} else {
this.template = template.replace(/\s+$/, '');
}
(Unless Svelte plans to drop IE support entirely, in which case the if
isn't necessary).
Another approach might be to use the technique described in https://instanceof.me/post/52245507631/regex-emulate-atomic-grouping-with-lookahead to prevent the regex engine from backtracking. Rewrite the /\s+$/
regex as /(?=(\s+))\1$/
and it should (I haven't tested this yet) perform the same in non-pathological cases, yet avoid backtracking in pathological cases. (Because the entire series of spaces will be captured by the parentheses inside the lookahead expression, then when the parser reaches the \1$
marker it's not able to backtrack through the \s+
expression). However, this would need to be tested, as it could possibly cause slowdown in non-pathological cases by creating capture groups for every space found in the template, then throwing those groups away later when it's discovered they weren't needed. (Which would result in GC pressure).
BTW, I traced the origin of the template.replace(/\s+$/, '')
line back to https://github.com/sveltejs/svelte/commit/bac02481b781f4d2509a5361b3d4bfdd47482ab2 (from #153), meaning that that line has been in the parser for a long, long time.
We had to work around this issue as well in language tools by inserting non-whitespacs characters at the end of each line to prevent this from slowing down.
This is a parser issue, so if we can run it on Node.js version 8, the minimum supported version of Node.js, I think it's okay if we can't run it in IE.
However, Node.js v8 does not support trimEnd
. But Node.js v8 seems to be able to use trimRight
.
trimRight
is currently an alias for trimEnd
and is deprecated, so people don't like it very much.
Do you think we can use trimRight
?
https://node.green/#ES2019-features-string-trimming https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/String/trimEnd
I agree to use trimRight
. It seems to be available in the latest Node 18.6.0 also.
(So trimRight
can use from our minimum supported version Node 8 to the latest version Node18)
Also, we can notice by CI if some environment droptrimRight
in the future.
We only can use built in functions which are part of the node versions Svelte currently supports
The performance should be improved in 3.50.0.