pywb
pywb copied to clipboard
eval() rewriting breaks function-scoped variables
Describe the bug
When pywb rewrites eval it wraps it in a function call. Unfortunately this breaks code which declares function-scoped variables using var and then accesses them outside the eval call. For example this code works in the browser but fails once rewritten by pywb:
eval('var x = 5');
document.write('x is ' + x);
This is a simplified reproducer based on an issue discovered in the wild when attempting to archive https://www.cbns.org.au/ which makes use of this technique as part of the minification in superfly-menu.js (var $M is declared inside an eval() call and then later accessed outside it).
Steps to reproduce the bug
$ cat >test.html <<EOF
<script>
eval('var x = 5');
document.write('x is ' + x);
</script>
EOF
$ warcit http://example.org/ test.html
[INFO] Wrote 1 resources to test.html.warc.gz
$ wb-manager init test
$ wb-manager add test test.html.warc.gz
$ pywb &
$ firefox http://localhost:8080/test/20210524065556/http://example.org/test.html
Replayed page is blank and js console logs error:
Uncaught ReferenceError: x is not defined
<anonymous> http://localhost:8080/test/20210524065556mp_/http://example.org/test.html:85
Expected behavior
Page should display:
x is 5
Environment
- OS: Linux
- Browser Firefox 88.0.1 and Chrome 90.0.4430.93
- Version pywb 2.5.0
@ikreymer noticed that the original site (https://www.cbns.org.au/) replays in archiveweb.page. It looks like the reason for this is that the implementation in jsrewriter.js is slightly different.
In wabac/archiveweb.page when eval is preceded by a comma (,) it's rewritten to WB_wombat_eval instead of the WB_wombat_runEval.
I was going to open a PR copying this change back from wabac.js to pywb but on considering it further I think this is an accidental fix and not the correct solution. eval() really is a direct invocation in this case and the fact it's preceded by a comma in the minified version public.min.js is purely incidental due to minification process as in the original input file superfly-menu.js there is no preceding comma.
I think the correct solution must be for the original eval() to be called in the exact same scope as the original code, i.e. not wrapped in the _____evalIsEvil function. Trouble is I'm uncertain why the wrapping function was introduced in the first place.
Naively, I feel like a direct invocation:
eval(s);
which is currently rewritten to:
WB_wombat_runEval(function _____evalIsEvil(_______eval_arg$$) { return eval(_______eval_arg$$); }.bind(this)).eval(s);
should instead be rewritten to something like:
eval(WB_wombat_rewriteEvalString(s));
but actually implementing that requires implementing a javascript parser as s is a arbitrarily complex expression and so finding where it ends is not feasible with a regex. Thus explaining why the existing implementation is so complicated.