pywb icon indicating copy to clipboard operation
pywb copied to clipboard

eval() rewriting breaks function-scoped variables

Open ato opened this issue 4 years ago • 1 comments

Describe the bug

When pywb rewrites eval it wraps it in a function call. Unfortunately this breaks code which declares function-scoped variables using var and then accesses them outside the eval call. For example this code works in the browser but fails once rewritten by pywb:

eval('var x = 5');
document.write('x is ' + x);

This is a simplified reproducer based on an issue discovered in the wild when attempting to archive https://www.cbns.org.au/ which makes use of this technique as part of the minification in superfly-menu.js (var $M is declared inside an eval() call and then later accessed outside it).

Steps to reproduce the bug

$ cat >test.html <<EOF
<script>
eval('var x = 5');
document.write('x is ' + x);
</script>
EOF
$ warcit http://example.org/ test.html
[INFO] Wrote 1 resources to test.html.warc.gz
$ wb-manager init test
$ wb-manager add test test.html.warc.gz
$ pywb &
$ firefox http://localhost:8080/test/20210524065556/http://example.org/test.html

Replayed page is blank and js console logs error:

Uncaught ReferenceError: x is not defined
    <anonymous> http://localhost:8080/test/20210524065556mp_/http://example.org/test.html:85

Expected behavior

Page should display:

x is 5

Environment

  • OS: Linux
  • Browser Firefox 88.0.1 and Chrome 90.0.4430.93
  • Version pywb 2.5.0

ato avatar May 24 '21 07:05 ato

@ikreymer noticed that the original site (https://www.cbns.org.au/) replays in archiveweb.page. It looks like the reason for this is that the implementation in jsrewriter.js is slightly different.

In wabac/archiveweb.page when eval is preceded by a comma (,) it's rewritten to WB_wombat_eval instead of the WB_wombat_runEval.

I was going to open a PR copying this change back from wabac.js to pywb but on considering it further I think this is an accidental fix and not the correct solution. eval() really is a direct invocation in this case and the fact it's preceded by a comma in the minified version public.min.js is purely incidental due to minification process as in the original input file superfly-menu.js there is no preceding comma.

I think the correct solution must be for the original eval() to be called in the exact same scope as the original code, i.e. not wrapped in the _____evalIsEvil function. Trouble is I'm uncertain why the wrapping function was introduced in the first place.

Naively, I feel like a direct invocation:

eval(s);

which is currently rewritten to:

WB_wombat_runEval(function _____evalIsEvil(_______eval_arg$$) { return eval(_______eval_arg$$); }.bind(this)).eval(s);

should instead be rewritten to something like:

    eval(WB_wombat_rewriteEvalString(s));

but actually implementing that requires implementing a javascript parser as s is a arbitrarily complex expression and so finding where it ends is not feasible with a regex. Thus explaining why the existing implementation is so complicated.

ato avatar May 26 '21 05:05 ato