jint
jint copied to clipboard
The toString() of 'class' and 'function' return SourceText
I noticed that the previous reason for returning native code was that Esprima didn't support it (in the ToString comment for Function), but I found that Esprima now supports returning SourceText, so wanted to align with the behaviour of the mainstream js engines. Unfortunately, passing the Test262 test still requires waiting for Esprima to refine the functionality, but that's outside the scope of this pr and beyond my capabilities ( (I accidentally closed the old pr earlier and reset my repository, so I'm re-proposing it now...)
@lahma I just tested the branch and while there is still room for improvement as mentioned in by scgm0, this is already a great improvement over the current behaviour. Can we rebase the branch and merge it, or is there something blocking that?
See the build status.
sorry, took another look into the failed tests, all but the ones (90) testing for the native function toString() could be fixed by modifying the Regex used but I don't know how to fix these without Esprima returning the correct function name in the first place.
There are now only 90 toString tests that have failed. In fact, they cannot pass the toString test now. They pass the "NativeFunction" test in "assertToStringOrNativeFunction" in the main repository.
main jint: function f() { [native code] }
test262:
( /* a */ a /* b */ , /* c */ b /* d */ ) /* e */ => /* f */ { /* g */ ; /* h */ }
this pr:
(a, b) => { ; }
@lahma Should we skip related tests? Or should we wait for Esprima to update? Does Esprima have any relevant update plans?
I think skipping the 90 tests would be reasonable as these cover only function toString() which is wastly improved in this PR. Nit: is there a reason not to precompile the regex used instead of constructing it every time?
I think skipping the 90 tests would be reasonable as these cover only function toString() which is wastly improved in this PR. Nit: is there a reason not to precompile the regex used instead of constructing it every time?
This is a test code. I can change it anytime.
I think skipping the 90 tests would be reasonable as these cover only function toString() which is wastly improved in this PR. Nit: is there a reason not to precompile the regex used instead of constructing it every time?
How about now?
@scgm0 thanks @lahma could you have another look at this please? The 90 tests would need to be skipped.
Changing just the jint no longer made the results more correct, so I updated Test262Harness.settings.json to skip those 90 toString tests
I wonder whether we wait for Acornima with this one, as it should provide the information we need from the quick look I took - https://github.com/adams85/acornima
I wonder whether we wait for Acornima with this one, as it should provide the information we need from the quick look I took - https://github.com/adams85/acornima
It doesn't seem to work.
I wonder whether we wait for Acornima with this one, as it should provide the information we need from the quick look I took - https://github.com/adams85/acornima
It doesn't seem to work.
AST to JS was implemented in Acornima today though there are still some rough edges from my testing.
I wonder whether we wait for Acornima with this one, as it should provide the information we need from the quick look I took - https://github.com/adams85/acornima
It doesn't seem to work.
AST to JS was implemented in Acornima today though there are still some rough edges from my testing.
That fast? @lahma What do you think?
Chiming in with some additional thoughts, hope you don't mind.
In my opinion this (i.e. utilizing the AST to JS source gen feature of the parser) is not the right way to solve this problem. Neither Esprima, nor Acornima is able to reproduce the exact source text for reasons thoroughly discussed here.
Instead of the source gen way, I'd suggest a simpler one that neither needs more support from the parser side, nor ugly hacks on the runtime side: Jint just needs to keep around the original script/module code that was parsed. Then, when you need the source text of a function or class, you can get that as a slice of the script/module code, based on the Range
property of the related AST node.
The tricky part is where to store the original script/module code so it can be accessed wherever needed. At first glance, something like ExecutionContext.ScriptOrModule
looks like a good candidate. However, that doesn't seem to work with dynamic code (eval
, Function
ctor, shadow realms).
What's even worse, let's consider a script like this: eval('function fn() {}'); fn.toString()
. Here the direct eval defines a function in the caller's scope. But the source text of fn
must be extracted from the evaled code... 🤯
So, I can't really see a better way than using the OnNode(Created)
callback to store a reference to the parsed code string into the relevant node's AssociatedData
/UserData
property. This looks a bit cumbersome though, as that property is already used by Jint for other purposes. But probably, with some gymnastics, you could make it work.
But @lahma might have an even better idea on storing this piece of information.
Chiming in with some additional thoughts, hope you don't mind.
In my opinion this (i.e. utilizing the AST to JS source gen feature of the parser) is not the right way to solve this problem. Neither Esprima, nor Acornima is able to reproduce the exact source text for reasons thoroughly discussed here.
Instead of the source gen way, I'd suggest a simpler one that neither needs more support from the parser side, nor ugly hacks on the runtime side: Jint just needs to keep around the original script/module code that was parsed. Then, when you need the source text of a function or class, you can get that as a slice of the script/module code, based on the
Range
property of the related AST node.The tricky part is where to store the original script/module code so it can be accessed wherever needed. At first glance, something like
ExecutionContext.ScriptOrModule
looks like a good candidate. However, that doesn't seem to work with dynamic code (eval
,Function
ctor, shadow realms).What's even worse, let's consider a script like this:
eval('function fn() {}'); fn.toString()
. Here the direct eval defines a function in the caller's scope. But the source text offn
must be extracted from the evaled code... 🤯So, I can't really see a better way than using the
OnNode(Created)
callback to store a reference to the parsed code string into the relevant node'sAssociatedData
/UserData
property. This looks a bit cumbersome though, as that property is already used by Jint for other purposes. But probably, with some gymnastics, you could make it work.But @lahma might have an even better idea on storing this piece of information.
In fact, the method of intercepting the original text cannot pass the test262
test, because some tests require that the code returned by toString
be generated and formatted by AST
, and the original text cannot meet this requirement, so even it now passes AST
The generated code is not perfect, but it is at least a formatted result.
I'm not sure I follow you... So, are you saying that there are some tests which require the original source text of the function to be regenerated/reformatted? Can you show me such a test?
I'm not sure I follow you... So, are you saying that there are some tests which require the original source text of the function to be regenerated/reformatted? Can you show me such a test?
https://github.com/tc39/test262/blob/main/test/built-ins/Function/prototype/toString/AsyncFunction.js
Oh ok, I see now what you are getting at.
Apparently, functions not parsed directly but created dynamically via the Function
constructor will need special treatment. In that case the runtime (Jint) should generate the source text of the function. According to my tests, this is done by simply concatenating the parameters passed to the Function
constructor:
(async () => {
async function f() {}
var AsyncFunction = f.constructor;
var p1 = "a", p2 = "/*a */ b, c";
var g = /* before */AsyncFunction(p1, p2, "return Promise.resolve(a+b+c) /* d */ //")/* after */;
console.log(g.toString());
console.log(await g(1, 2, 3));
})()
If you want to verify that there is indeed a simple string concatenation under the hood (at least in V8 and Firefox), run this:
(async () => {
async function f() {}
var AsyncFunction = f.constructor;
var p1 = "a //", p2 = "/*a */ b, c";
var g = /* before */AsyncFunction(p1, p2, "return Promise.resolve(a+b+c) /* d */ //")/* after */;
console.log(g.toString());
console.log(await g(1, 2, 3));
})()
As far as I can tell, a template like this is used to generate the function source text:
`${isPrototypeAsync ? "async " : ""}function anonymous(${[...arguments].slice(0, arguments.length -1).join(",")}
) {
${arguments[arguments.length -1]}
}`
Probably generator functions are also supported, so it may be a bit more complicated than this.
EDIT: looks like this is already implemented here.