minify
minify copied to clipboard
CSS minification logic for @license and @preserve is wrong, and sometimes very slow
This regexp is both wrong, and can be very, very slow on large input.
'/\n?\/\*(!|.*?@license|.*?@preserve).*?\*\/\n?/s'
https://github.com/matthiasmullie/minify/blob/master/src/CSS.php#L639
The error is that the .*? before the @licence will match anything, including */. Therefore, if your CSS contains:
/* First comment */
.lots .of .css .rules {}
/* @preserve */
Then all the .lots .of .css .rules {} don't get minified.
Also, in the context of Moodle (https://moodle.org/), this regexp (which was being matched only 2 times) was taking 15 seconds to do that!
I have found a way to improve it (time down to a fraction of a second, and all input minified). Pull request coming shortly.
Pull request #318 should fix this. (Don't know why github did no auto-link them.)
And for reference, the Moodle tracker issue for this is https://tracker.moodle.org/browse/MDL-68191.
I tested the test script from #222 and this change for that particular file is actually slower, but that's possibly a side-effect of the output being more correct.
The 1.5M JS file originally got minified to 1.2M and with this PR it got down to 950K, a diff of the output shows that a big chunk of the file previously wasn't getting minified, which probably explains the time difference since more work is being done:
Before/not-fully-minified:
real 0m3.711s user 0m2.867s sys 0m0.825s
After/fully-minified:
real 0m12.541s user 0m7.871s sys 0m4.615s
I was hoping this might fix #222 and #280 since those test files do have these kind of comments but that weird issue still remains when run on certain servers. The times above are from my Macbook, the same test script on the server that was showing the issue for #222 gives times that are more like 1 minute vs 7 minute, with most of the extra time spent in sys.
I'm now working around the issue in #280 by avoiding minifying such large files anyway, so it's not as vital that it is solved.
Hi @davidscotson, this patch https://gist.github.com/timhunt/f969f46f60756197f320e03ed0d27a36 is what I was using in the Moodle context for debugging the performance. There is Hackery in theme/styles.php, so that it does not use the cache, but instead re-minimises every time, and then there are changes in minify to track the number of times each regex is matched, and how long is spent matching it, this is output as a comment in front on the minimised CSS.
This means that you can to to the URL like .../theme/styles.php/themename/1584454172_2/all, and keep reloading, to test changes, and see which regexes are slow. Would be interesting to find what is being slow in your case.
Thanks for that @timhunt, I've run the JS test from #222 with that timing code added and these are the results:
On my Macbook (normal, expected behaviour):
real 0m3.684s
user 0m2.754s
sys 0m0.879s
16663 calls, 0.049189567565918 seconds
/(['"`])(.*?(?<!\\)(\\\\)*+)\1/sS
-------------------------
2 calls, 0.016233921051025 seconds
/\n?\/\*(!|.*?@license|.*?@preserve).*?\*\/\n?/sS
-------------------------
655 calls, 0.0021066665649414 seconds
/\/\*.*?\*\//sS
-------------------------
1063 calls, 0.0024473667144775 seconds
/\/\/.*$/mS
-------------------------
536 calls, 0.14910626411438 seconds
/([=:,;\+\-\*\/\}\(\{\[&\|!]|^|do|in|new|else|throw|yield|delete|return|typeof)\s*\K\/(?!\*)(?:[^\[\/\\\n\r]++|(?:\\.)++|(?:\[(?:[^\]\\\n\r]++|(?:\\.)++)++\])++)++\/[gimuy]*(?=\s*([\.,;\)\}&\|+]|\/\/|$|\.(constructor|flags|global|ignoreCase|multiline|source|sticky|unicode|compile\(|exec\(|test\(|toSource\(|toString\()))/S
-------------------------
1 calls, 0.0024127960205078 seconds
/\)\s*\K\/(?!\*)(?:[^\[\/\\\n\r]++|(?:\\.)++|(?:\[(?:[^\]\\\n\r]++|(?:\\.)++)++\])++)++\/[gimuy]*(?=\s*\.(constructor|flags|global|ignoreCase|multiline|source|sticky|unicode|compile\(|exec\(|test\(|toSource\(|toString\())/S
-------------------------
3 calls, 0.13619804382324 seconds
/\/(?!\*)(?:[^\[\/\\\n\r]++|(?:\\.)++|(?:\[(?:[^\]\\\n\r]++|(?:\\.)++)++\])++)++\/[gimuy]*(?=\s*\n\s*(\*|\/|%|(?<![\+\-\*\\\=\<\>%&\|])\=|\+\=|\-\=|\*\=|\/\=|%\=|\<\<\=|\>\>\=|\>\>\>\=|&\=|\^\=|\|\=|&|\||\^|~|\<\<|\>\>|\>\>\>|\=\=|\=\=\=|\!\=|\!\=\=|\>|\<|\>\=|\<\=|&&|\|\||\!|(?<![0-9]\s)\.|\[|\?|\:|,|;|\(|\{|do|if|in|for|let|new|try|var|case|else|enum|eval|null|this|true|void|with|break|catch|class|const|false|super|throw|while|yield|delete|export|import|public|return|static|switch|typeof|default|extends|finally|package|private|continue|debugger|function|arguments|interface|protected|implements|instanceof|abstract|boolean|byte|char|double|final|float|goto|int|long|native|short|synchronized|throws|transient|volatile))/S
-------------------------
On weird server, with Tim's patch:
real 6m15.683s
user 0m6.048s
sys 6m8.712s
19127 calls, 0.092156887054443 seconds
/(['"`])(.*?(?<!\\)(\\\\)*+)\1/sS
-------------------------
8 calls, 0.001939058303833 seconds
/
# optional newline
\n?
# start comment
\/\*
# comment content
(?:
# either starts with an !
!
|
# or, after some number of characters which do not end the comment
(?:(?!\*\/).)*?
# there is either a @license or @preserve tag
@(?:license|preserve)
)
# then match to the end of the comment
.*?\*\/\n?
/ixsS
-------------------------
787 calls, 0.0035738945007324 seconds
/\/\*.*?\*\//sS
-------------------------
2908 calls, 0.021705627441406 seconds
/\/\/.*$/mS
-------------------------
654 calls, 0.027463912963867 seconds
/([=:,;\+\-\*\/\}\(\{\[&\|!]|^|do|in|new|else|throw|yield|delete|return|typeof)\s*\K\/(?!\*)(?:[^\[\/\\\n\r]++|(?:\\.)++
-------------------------
1 calls, 0.00085091590881348 seconds
/\)\s*\K\/(?!\*)(?:[^\[\/\\\n\r]++|(?:\\.)++|(?:\[(?:[^\]\\\n\r]++|(?:\\.)++)++\])++)++\/[gimuy]*(?=\s*\.(constructor|fl
-------------------------
9 calls, 0.039934635162354 seconds
/\/(?!\*)(?:[^\[\/\\\n\r]++|(?:\\.)++|(?:\[(?:[^\]\\\n\r]++|(?:\\.)++)++\])++)++\/[gimuy]*(?=\s*\n\s*(\*|\/|%|(?<![\+\-\
-------------------------
So, there's slight changes in the numbers due to the improved behaviour, but overall these regexes are about the same speed on the server than on my macbook, but the overall time is much slower, so I'm thinking it's something outside the regexes that's causing the slowness on that server.
@davidscotson it would be easier to interpret the effect of the patch if you could run the two different versions of the code on the same server/laptop.
Looking at the differences, clearly all the time is going into "sys 6m8.712s". That could well be a sign of what was suggested in #222: excessive use of swap. Or, it could be that your server has terrible disc performance, or something.
Anyway, as you say it certainly does not look like the performance of the preg calls has got worse, and that is all this patch changes.