jalangi2
jalangi2 copied to clipboard
ES5 strict mode
I happened to meet in a situation that I could only run instrumented JS file in ES5 strict mode or ES6. Since ES6 is currently unsupported, I use the former configuration. However, assigning arguments is not allowed in strict mode but quite common in the instrumented JS file. Is it possible to support ES5 strict mode? If yes, what changes are needed against Jalangi2 source code?
We cannot support ES5 strict mode because Jalangi 2 uses arguments.callee which cannot be accessed in the strict mode.
@ksen007 would a solution be to strip any use of "use strict"
from an instrumented file?
Stripping "use strict" is still problematic in my case, because I am using Jalangi to instrument code in a webpage's
@gogo9th yes that is a tricky scenario. The only thing I can think of would be to make Jalangi's use of arguments.callee
optional. It's not too obvious to me how to do this. You could try to pass the function explicitly as a separate argument in instrumented code, but that could break code that uses the arguments
array. So maybe we'd have to maintain a separate stack inside Jalangi? I'm open to other ideas.
Thanks for your ideas. I think making arguments.callee
usable is not sufficient, because even if we forcibly modify <script type='module'>
to <script type='text/javascript'>
to make arguments.callee
works, there is another problem: the functions exported from the original <script type='text/javascript'>
(which was previously module-typed) cannot be accessed from another <script type='text/javascript'>
(which was previously module-typed). If a script's type is module, it has complex import & export rules: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Guide/Modules, and exported files cannot be accessed the same way as NodeJS, because a webpage's Javascript is sandboxed by the browser and thus cannot directly access its machine's file system. So I think Jalangi needs a special way of handling import/export if this.window
is defined (i.e., if it is a browser).
I think J$ should have a hierarchical scope management object based on import & export declarations for each sid.
Hi @gogo9th, to clarify, I was proposing to (optionally?) remove usage of arguments.callee
so that strict mode could be enabled. The main use of arguments.callee
is to provide the function value as an argument to the functionEnter
callback, which in my experience is rather important. As a partial alternative, Jalangi could maintain its own call stack, and pass in the appropriate function value to functionEnter
based on that. But, that still won't work for callbacks from native functions. A simpler solution would be to just say that the function value argument in functionEnter
will be null
if the function is in a strict mode script, and leave it to the analysis to deal with it.
I think J$ should have a hierarchical scope management object based on import & export declarations for each sid.
What problem is this solving? If it's not towards fixing handling of strict mode, I'd prefer to leave this to the analysis.
Before you go too deep down this rabbit hole.. I know that @algobardo and @christofferqa also fought with Jalangi and modern EcmaScript features several years ago - perhaps they can drop a quick line about their endeavours?
As a partial alternative, Jalangi could maintain its own call stack, and pass in the appropriate function value to
functionEnter
based on that. But, that still won't work for callbacks from native functions.
I wonder what's an example of this. If we meet a native function, we will not track inside it anyway, so there will be no additional problem we need to worry about, isn't it...?
I wonder what's an example of this. If we meet a native function, we will not track inside it anyway, so there will be no additional problem we need to worry about, isn't it...?
The problem is a native function calling back into a source-code function, e.g., anytime Function.prototype.call
is used to invoke a source-code function.
I came across a possible solution to this problem. In JS you can access the value of a function via its name. What if in Jalangi instrumentation, we introduced a generated name for any anonymous function, and then just always used the function name instead of arguments.callee
? @gogo9th do you see any problems with this idea?
I am very sorry for my late reply, I've found several more bugs/problems in Jalangi2 and have been fixing them. It took me about 100 hours, and I think I fixed all of them. I will organize them and tell you in the other thread we have been discussing.
For your suggestion above, are you suggesting that whenever an anonymous function appears in the source code, we instrument the code such that we store the function's value in a generated name variable (in the global scope), and we replace arguments.callee
in each anonymous function with this generated name variable? If so, I think this will work.
One similar issue I was fixing was about with() statements.
with() statement error: esnstrument.js sets 'this = base'
for M
(method) and F
(function) calls and passes it to callFun()
's Function.prototype.apply.call(targetFn, this, arguments)
. But the base (this
) could be some object in the with()
bracket. If a function/method is called within a with()
block and if the function/method internally uses 'this
', then its result will be computed based on a wrong 'this
' variable value. To solve this, we create a stack of with()
bracket's objects, push into the stack whenever we enter a new with()
block, pop them when we exit, and each callFun()
should check the stacked values. There is a single stack for each scope (either global or function scope).
Also, read & write callback should check if the variable belongs to some object in the with()
bracket.
Is using stacks for with()
the correct was of solving the problem? Would there be any issue or corner cases?
I wonder what's an example of this. If we meet a native function, we will not track inside it anyway, so there will be no additional problem we need to worry about, isn't it...?
The problem is a native function calling back into a source-code function, e.g., anytime
Function.prototype.call
is used to invoke a source-code function.
I'm sorry, but I was not clear why this would be a problem.. You're right that we cannot install a callback on a native function itself. Yet, we can simply skip the native function and install the callback in the source-code function. For example, if we have code like
function SomeParentFunction () {
Function.prototype.call(this, MyFunction, args)```,
}
, then we can instrument the contents of SomeParentFunction
to push into our new global calleeStack
the value of the 2nd argument of Function.prototype.call (i.e., MyFunction
), and instrument the contents of MyFunction
such that when when it begins it first regards the lastly pushed value in the calleeStack
as the value for arguments.callee
. Would there be an issue in this approach?
Meanwhile, I have a different concern about this calleeStack
approach in general (even current Jalangi's returnStack
). When we have an asynchronous function call by Promise, then Jalangi wouldn't have a chance to push into its calleeStack
or returnStack
the value of the callee or caller function.
@gogo9th thanks so much for all these great comments. Just an FYI that I am busy today and possibly tomorrow but I will give feedback here as soon as I can
@gogo9th:
I am very sorry for my late reply, I've found several more bugs/problems in Jalangi2 and have been fixing them. It took me about 100 hours, and I think I fixed all of them. I will organize them and tell you in the other thread we have been discussing.
Sorry for all the bugs. Upstreaming your fixes would be great and very welcome!
For your suggestion above, are you suggesting that whenever an anonymous function appears in the source code, we instrument the code such that we store the function's value in a generated name variable (in the global scope), and we replace
arguments.callee
in each anonymous function with this generated name variable? If so, I think this will work.
I was actually suggesting adding a name for the function itself, rather than generating a variable. Do you think adding a name to anonymous functions is risky?
One similar issue I was fixing was about with() statements.
with() statement error: esnstrument.js sets
'this = base'
forM
(method) andF
(function) calls and passes it tocallFun()
's Function.prototype.apply.call(targetFn, this, arguments)
. But the base (this
) could be some object in thewith()
bracket. If a function/method is called within awith()
block and if the function/method internally uses 'this
', then its result will be computed based on a wrong 'this
' variable value. To solve this, we create a stack ofwith()
bracket's objects, push into the stack whenever we enter a newwith()
block, pop them when we exit, and eachcallFun()
should check the stacked values. There is a single stack for each scope (either global or function scope).Also, read & write callback should check if the variable belongs to some object in the
with()
bracket.Is using stacks for
with()
the correct was of solving the problem? Would there be any issue or corner cases?
At a high level, this sounds reasonable to me, but I don't really know about all the corner cases that could crop up here. I am not really sure if the decision to skip support of with
was due to fundamental issues, or because it didn't seem worth the effort. One question:
There is a single stack for each scope (either global or function scope).
Why is this needed?
, then we can instrument the contents of
SomeParentFunction
to push into our new globalcalleeStack
the value of the 2nd argument of Function.prototype.call (i.e.,MyFunction
), and instrument the contents ofMyFunction
such that when when it begins it first regards the lastly pushed value in thecalleeStack
as the value forarguments.callee
. Would there be an issue in this approach?
I am not sure. But isn't the approach of introducing names for anonymous functions a better solution?
Meanwhile, I have a different concern about this
calleeStack
approach in general (even current Jalangi'sreturnStack
). When we have an asynchronous function call by Promise, then Jalangi wouldn't have a chance to push into itscalleeStack
orreturnStack
the value of the callee or caller function.
Do you have an example here?
There is a single stack for each scope (either global or function scope).
Why is this needed?
We need this, because multiple with()
statements can be nested. In each with()
block, we need to identify the with()
objects belonging to their own with()
scope.
I am not sure. But isn't the approach of introducing names for anonymous functions a better solution? Okay, I will implement this.
Jalangi has another issue with the strict mode. When we have a sample function:
function hello() {}
Jalangi instruments this as follows:
function hello() {
jalangiLabel0:
while (true) {
try {
JALANGI_$.Fe(9, arguments.callee, this, arguments);
arguments = JALANGI_$.N(17, 'arguments', arguments, 4);
} catch (JALANGI_$e) {
JALANGI_$.Ex(49, JALANGI_$e);
} finally {
if (JALANGI_$.Fr(57))
continue jalangiLabel0;
else
return JALANGI_$.Ra();
}
}
}
Here, the line arguments = JALANGI_$.N(17, 'arguments', arguments, 4);
is problematic in a strict mode, because we are not supposed to overwrite the arguments
variable in the strict mode:
SyntaxError: Unexpected eval or arguments in strict mode.
I am actually not sure if we really need to above line in the instrumented code. Can we just remove this line by commenting out the esnstrument.js
file's following section:
function syncDefuns(node, scope, isScript) {
var ret = [], ident;
// if (!isScript) {
// if (!Config.INSTR_TRY_CATCH_ARGUMENTS || Config.INSTR_TRY_CATCH_ARGUMENTS(node)) {
// if (!Config.INSTR_INIT || Config.INSTR_INIT(node)) {
// ident = createIdentifierAst("arguments");
// ret = origin_array_concat.call(ret, createCallInitAsStatement(node,
// createLiteralAst("arguments"),
// ident,
// true,
// ident, false, true));
// }
// }
// }
If a callback function has to change the contents of arguments, it can do it from withing the function-enter callback:
JALANGI_$.Fe(9, arguments.callee, this, arguments);
Tentatively I agree we can remove that callback that overwrites arguments
. We should update the tests to pass if needed, and also document the change for the next release (and maybe do a bigger version bump since it's a change to the Jalangi API).
Thanks for the feedback. I am lastly working on making
I have one question. Why doe Jalangi wrap each script or function code with a while loop? What could go wrong if they are just wrapped by a try-catch block without a while loop?
I have one question. Why doe Jalangi wrap each script or function code with a while loop? What could go wrong if they are just wrapped by a try-catch block without a while loop?
This is to enable backtracking functionality. See slides 40--43 here:
https://manu.sridharan.net/files/JalangiTutorial.pdf
I think it'd be reasonable for you to disable that while loop instrumentation on a branch, if your analyses don't use it. Maybe we should add a flag to optionally disable it.
For anyone still tracking this issue, I attempted to implement partial support for use strict
here:
https://github.com/Samsung/jalangi2/pull/175
It will fail at runtime for some unusual cases involving getters / setters. And it's a real hack right now and needs cleanup. But if anyone has feedback let me know.