jalangi2 ES5 strict mode

I happened to meet in a situation that I could only run instrumented JS file in ES5 strict mode or ES6. Since ES6 is currently unsupported, I use the former configuration. However, assigning arguments is not allowed in strict mode but quite common in the instrumented JS file. Is it possible to support ES5 strict mode? If yes, what changes are needed against Jalangi2 source code?

Nov 28 '16 04:11 liz425

We cannot support ES5 strict mode because Jalangi 2 uses arguments.callee which cannot be accessed in the strict mode.

Dec 01 '16 16:12 ksen007

@ksen007 would a solution be to strip any use of "use strict" from an instrumented file?

Dec 01 '16 16:12 msridhar

Stripping "use strict" is still problematic in my case, because I am using Jalangi to instrument code in a webpage's

Nov 22 '20 18:11 gogo9th

@gogo9th yes that is a tricky scenario. The only thing I can think of would be to make Jalangi's use of arguments.callee optional. It's not too obvious to me how to do this. You could try to pass the function explicitly as a separate argument in instrumented code, but that could break code that uses the arguments array. So maybe we'd have to maintain a separate stack inside Jalangi? I'm open to other ideas.

Nov 22 '20 20:11 msridhar

Thanks for your ideas. I think making arguments.callee usable is not sufficient, because even if we forcibly modify <script type='module'> to <script type='text/javascript'> to make arguments.callee works, there is another problem: the functions exported from the original <script type='text/javascript'> (which was previously module-typed) cannot be accessed from another <script type='text/javascript'> (which was previously module-typed). If a script's type is module, it has complex import & export rules: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Guide/Modules, and exported files cannot be accessed the same way as NodeJS, because a webpage's Javascript is sandboxed by the browser and thus cannot directly access its machine's file system. So I think Jalangi needs a special way of handling import/export if this.window is defined (i.e., if it is a browser).

Nov 23 '20 06:11 gogo9th

I think J$ should have a hierarchical scope management object based on import & export declarations for each sid.

Nov 23 '20 08:11 gogo9th

Hi @gogo9th, to clarify, I was proposing to (optionally?) remove usage of arguments.callee so that strict mode could be enabled. The main use of arguments.callee is to provide the function value as an argument to the functionEnter callback, which in my experience is rather important. As a partial alternative, Jalangi could maintain its own call stack, and pass in the appropriate function value to functionEnter based on that. But, that still won't work for callbacks from native functions. A simpler solution would be to just say that the function value argument in functionEnter will be null if the function is in a strict mode script, and leave it to the analysis to deal with it.

I think J$ should have a hierarchical scope management object based on import & export declarations for each sid.

What problem is this solving? If it's not towards fixing handling of strict mode, I'd prefer to leave this to the analysis.

Nov 23 '20 18:11 msridhar

Before you go too deep down this rabbit hole.. I know that @algobardo and @christofferqa also fought with Jalangi and modern EcmaScript features several years ago - perhaps they can drop a quick line about their endeavours?

Nov 23 '20 19:11 esbena

As a partial alternative, Jalangi could maintain its own call stack, and pass in the appropriate function value to functionEnter based on that. But, that still won't work for callbacks from native functions.

I wonder what's an example of this. If we meet a native function, we will not track inside it anyway, so there will be no additional problem we need to worry about, isn't it...?

Dec 19 '20 15:12 gogo9th

I wonder what's an example of this. If we meet a native function, we will not track inside it anyway, so there will be no additional problem we need to worry about, isn't it...?

The problem is a native function calling back into a source-code function, e.g., anytime Function.prototype.call is used to invoke a source-code function.

Dec 19 '20 21:12 msridhar

I came across a possible solution to this problem. In JS you can access the value of a function via its name. What if in Jalangi instrumentation, we introduced a generated name for any anonymous function, and then just always used the function name instead of arguments.callee? @gogo9th do you see any problems with this idea?

Mar 19 '21 15:03 msridhar

I am very sorry for my late reply, I've found several more bugs/problems in Jalangi2 and have been fixing them. It took me about 100 hours, and I think I fixed all of them. I will organize them and tell you in the other thread we have been discussing.

For your suggestion above, are you suggesting that whenever an anonymous function appears in the source code, we instrument the code such that we store the function's value in a generated name variable (in the global scope), and we replace arguments.callee in each anonymous function with this generated name variable? If so, I think this will work.

Apr 05 '21 00:04 gogo9th

One similar issue I was fixing was about with() statements.

with() statement error: esnstrument.js sets 'this = base' for M(method) and F(function) calls and passes it to callFun()'s Function.prototype.apply.call(targetFn, this, arguments). But the base (this) could be some object in the with() bracket. If a function/method is called within a with() block and if the function/method internally uses 'this', then its result will be computed based on a wrong 'this' variable value. To solve this, we create a stack of with() bracket's objects, push into the stack whenever we enter a new with() block, pop them when we exit, and each callFun() should check the stacked values. There is a single stack for each scope (either global or function scope).

Also, read & write callback should check if the variable belongs to some object in the with() bracket.

Is using stacks for with() the correct was of solving the problem? Would there be any issue or corner cases?

Apr 05 '21 00:04 gogo9th

I wonder what's an example of this. If we meet a native function, we will not track inside it anyway, so there will be no additional problem we need to worry about, isn't it...?

The problem is a native function calling back into a source-code function, e.g., anytime Function.prototype.call is used to invoke a source-code function.

I'm sorry, but I was not clear why this would be a problem.. You're right that we cannot install a callback on a native function itself. Yet, we can simply skip the native function and install the callback in the source-code function. For example, if we have code like

function SomeParentFunction () {
Function.prototype.call(this, MyFunction, args)```,
}

, then we can instrument the contents of SomeParentFunction to push into our new global calleeStack the value of the 2nd argument of Function.prototype.call (i.e., MyFunction), and instrument the contents of MyFunction such that when when it begins it first regards the lastly pushed value in the calleeStack as the value for arguments.callee. Would there be an issue in this approach?

Meanwhile, I have a different concern about this calleeStack approach in general (even current Jalangi's returnStack). When we have an asynchronous function call by Promise, then Jalangi wouldn't have a chance to push into its calleeStack or returnStack the value of the callee or caller function.

Apr 05 '21 07:04 gogo9th

@gogo9th thanks so much for all these great comments. Just an FYI that I am busy today and possibly tomorrow but I will give feedback here as soon as I can

Apr 05 '21 16:04 msridhar

@gogo9th:

I am very sorry for my late reply, I've found several more bugs/problems in Jalangi2 and have been fixing them. It took me about 100 hours, and I think I fixed all of them. I will organize them and tell you in the other thread we have been discussing.

Sorry for all the bugs. Upstreaming your fixes would be great and very welcome!

For your suggestion above, are you suggesting that whenever an anonymous function appears in the source code, we instrument the code such that we store the function's value in a generated name variable (in the global scope), and we replace arguments.callee in each anonymous function with this generated name variable? If so, I think this will work.

I was actually suggesting adding a name for the function itself, rather than generating a variable. Do you think adding a name to anonymous functions is risky?

One similar issue I was fixing was about with() statements.

with() statement error: esnstrument.js sets 'this = base' for M(method) and F(function) calls and passes it to callFun()'s Function.prototype.apply.call(targetFn, this, arguments). But the base (this) could be some object in the with() bracket. If a function/method is called within a with() block and if the function/method internally uses 'this', then its result will be computed based on a wrong 'this' variable value. To solve this, we create a stack of with() bracket's objects, push into the stack whenever we enter a new with() block, pop them when we exit, and each callFun() should check the stacked values. There is a single stack for each scope (either global or function scope).

Also, read & write callback should check if the variable belongs to some object in the with() bracket.

Is using stacks for with() the correct was of solving the problem? Would there be any issue or corner cases?

At a high level, this sounds reasonable to me, but I don't really know about all the corner cases that could crop up here. I am not really sure if the decision to skip support of with was due to fundamental issues, or because it didn't seem worth the effort. One question:

There is a single stack for each scope (either global or function scope).

Why is this needed?

, then we can instrument the contents of SomeParentFunction to push into our new global calleeStack the value of the 2nd argument of Function.prototype.call (i.e., MyFunction), and instrument the contents of MyFunction such that when when it begins it first regards the lastly pushed value in the calleeStack as the value for arguments.callee. Would there be an issue in this approach?

I am not sure. But isn't the approach of introducing names for anonymous functions a better solution?

Meanwhile, I have a different concern about this calleeStack approach in general (even current Jalangi's returnStack). When we have an asynchronous function call by Promise, then Jalangi wouldn't have a chance to push into its calleeStack or returnStack the value of the callee or caller function.

Do you have an example here?

Apr 06 '21 22:04 msridhar

There is a single stack for each scope (either global or function scope).

Why is this needed?

We need this, because multiple with() statements can be nested. In each with() block, we need to identify the with() objects belonging to their own with() scope.

I am not sure. But isn't the approach of introducing names for anonymous functions a better solution? Okay, I will implement this.

Apr 07 '21 14:04 gogo9th

Jalangi has another issue with the strict mode. When we have a sample function:

function hello() {}

Jalangi instruments this as follows:

function hello() {
    jalangiLabel0:
        while (true) {
            try {
                JALANGI_$.Fe(9, arguments.callee, this, arguments);
                arguments = JALANGI_$.N(17, 'arguments', arguments, 4);
            } catch (JALANGI_$e) {
                JALANGI_$.Ex(49, JALANGI_$e);
            } finally { 
                if (JALANGI_$.Fr(57))
                    continue jalangiLabel0;
                else
                    return JALANGI_$.Ra();
            } 
        }
}

Here, the line arguments = JALANGI_$.N(17, 'arguments', arguments, 4); is problematic in a strict mode, because we are not supposed to overwrite the arguments variable in the strict mode:

SyntaxError: Unexpected eval or arguments in strict mode.

I am actually not sure if we really need to above line in the instrumented code. Can we just remove this line by commenting out the esnstrument.js file's following section:

  function syncDefuns(node, scope, isScript) {
    var ret = [], ident;
//    if (!isScript) {
//      if (!Config.INSTR_TRY_CATCH_ARGUMENTS || Config.INSTR_TRY_CATCH_ARGUMENTS(node)) {
//        if (!Config.INSTR_INIT || Config.INSTR_INIT(node)) {
//          ident = createIdentifierAst("arguments");
//          ret = origin_array_concat.call(ret, createCallInitAsStatement(node,
//            createLiteralAst("arguments"),
//            ident,
//            true,
//            ident, false, true));
//        }
//      }
//    }

If a callback function has to change the contents of arguments, it can do it from withing the function-enter callback:

JALANGI_$.Fe(9, arguments.callee, this, arguments);

Apr 07 '21 14:04 gogo9th

Tentatively I agree we can remove that callback that overwrites arguments. We should update the tests to pass if needed, and also document the change for the next release (and maybe do a bigger version bump since it's a change to the Jalangi API).

Apr 07 '21 17:04 msridhar

Thanks for the feedback. I am lastly working on making

I have one question. Why doe Jalangi wrap each script or function code with a while loop? What could go wrong if they are just wrapped by a try-catch block without a while loop?

Apr 10 '21 19:04 gogo9th

I have one question. Why doe Jalangi wrap each script or function code with a while loop? What could go wrong if they are just wrapped by a try-catch block without a while loop?

This is to enable backtracking functionality. See slides 40--43 here:

https://manu.sridharan.net/files/JalangiTutorial.pdf

I think it'd be reasonable for you to disable that while loop instrumentation on a branch, if your analyses don't use it. Maybe we should add a flag to optionally disable it.

Apr 14 '21 17:04 msridhar

For anyone still tracking this issue, I attempted to implement partial support for use strict here:

https://github.com/Samsung/jalangi2/pull/175

It will fail at runtime for some unusual cases involving getters / setters. And it's a real hack right now and needs cleanup. But if anyone has feedback let me know.

Aug 30 '22 00:08 msridhar

jalangi2 jalangi2 copied to clipboard

ES5 strict mode

jalangi2
jalangi2 copied to clipboard