jint icon indicating copy to clipboard operation
jint copied to clipboard

The toString() of 'class' and 'function' return SourceText

Open scgm0 opened this issue 1 year ago • 20 comments

I noticed that the previous reason for returning native code was that Esprima didn't support it (in the ToString comment for Function), but I found that Esprima now supports returning SourceText, so wanted to align with the behaviour of the mainstream js engines. Unfortunately, passing the Test262 test still requires waiting for Esprima to refine the functionality, but that's outside the scope of this pr and beyond my capabilities ( (I accidentally closed the old pr earlier and reset my repository, so I'm re-proposing it now...)

scgm0 avatar Jan 05 '24 17:01 scgm0

@lahma I just tested the branch and while there is still room for improvement as mentioned in by scgm0, this is already a great improvement over the current behaviour. Can we rebase the branch and merge it, or is there something blocking that?

lofcz avatar Feb 07 '24 20:02 lofcz

See the build status.

lahma avatar Feb 07 '24 20:02 lahma

sorry, took another look into the failed tests, all but the ones (90) testing for the native function toString() could be fixed by modifying the Regex used but I don't know how to fix these without Esprima returning the correct function name in the first place.

lofcz avatar Feb 07 '24 21:02 lofcz

There are now only 90 toString tests that have failed. In fact, they cannot pass the toString test now. They pass the "NativeFunction" test in "assertToStringOrNativeFunction" in the main repository.

main jint: function f() { [native code] } 图片 test262: ( /* a */ a /* b */ , /* c */ b /* d */ ) /* e */ => /* f */ { /* g */ ; /* h */ } 图片 this pr: (a, b) => { ; } 图片

scgm0 avatar Feb 08 '24 03:02 scgm0

@lahma Should we skip related tests? Or should we wait for Esprima to update? Does Esprima have any relevant update plans?

scgm0 avatar Feb 08 '24 03:02 scgm0

I think skipping the 90 tests would be reasonable as these cover only function toString() which is wastly improved in this PR. Nit: is there a reason not to precompile the regex used instead of constructing it every time?

lofcz avatar Feb 08 '24 10:02 lofcz

I think skipping the 90 tests would be reasonable as these cover only function toString() which is wastly improved in this PR. Nit: is there a reason not to precompile the regex used instead of constructing it every time?

This is a test code. I can change it anytime.

scgm0 avatar Feb 08 '24 10:02 scgm0

I think skipping the 90 tests would be reasonable as these cover only function toString() which is wastly improved in this PR. Nit: is there a reason not to precompile the regex used instead of constructing it every time?

How about now?

scgm0 avatar Feb 08 '24 11:02 scgm0

@scgm0 thanks @lahma could you have another look at this please? The 90 tests would need to be skipped.

lofcz avatar Feb 08 '24 11:02 lofcz

Changing just the jint no longer made the results more correct, so I updated Test262Harness.settings.json to skip those 90 toString tests

scgm0 avatar Feb 09 '24 06:02 scgm0

I wonder whether we wait for Acornima with this one, as it should provide the information we need from the quick look I took - https://github.com/adams85/acornima

lofcz avatar Feb 18 '24 07:02 lofcz

I wonder whether we wait for Acornima with this one, as it should provide the information we need from the quick look I took - https://github.com/adams85/acornima

It doesn't seem to work. 图片

scgm0 avatar Feb 18 '24 13:02 scgm0

I wonder whether we wait for Acornima with this one, as it should provide the information we need from the quick look I took - https://github.com/adams85/acornima

It doesn't seem to work. 图片

AST to JS was implemented in Acornima today though there are still some rough edges from my testing.

lofcz avatar Feb 21 '24 03:02 lofcz

I wonder whether we wait for Acornima with this one, as it should provide the information we need from the quick look I took - https://github.com/adams85/acornima

It doesn't seem to work. 图片

AST to JS was implemented in Acornima today though there are still some rough edges from my testing.

That fast? @lahma What do you think?

scgm0 avatar Feb 21 '24 06:02 scgm0

Chiming in with some additional thoughts, hope you don't mind.

In my opinion this (i.e. utilizing the AST to JS source gen feature of the parser) is not the right way to solve this problem. Neither Esprima, nor Acornima is able to reproduce the exact source text for reasons thoroughly discussed here.

Instead of the source gen way, I'd suggest a simpler one that neither needs more support from the parser side, nor ugly hacks on the runtime side: Jint just needs to keep around the original script/module code that was parsed. Then, when you need the source text of a function or class, you can get that as a slice of the script/module code, based on the Range property of the related AST node.

The tricky part is where to store the original script/module code so it can be accessed wherever needed. At first glance, something like ExecutionContext.ScriptOrModule looks like a good candidate. However, that doesn't seem to work with dynamic code (eval, Function ctor, shadow realms).

What's even worse, let's consider a script like this: eval('function fn() {}'); fn.toString(). Here the direct eval defines a function in the caller's scope. But the source text of fn must be extracted from the evaled code... 🤯

So, I can't really see a better way than using the OnNode(Created) callback to store a reference to the parsed code string into the relevant node's AssociatedData/UserData property. This looks a bit cumbersome though, as that property is already used by Jint for other purposes. But probably, with some gymnastics, you could make it work.

But @lahma might have an even better idea on storing this piece of information.

adams85 avatar Apr 24 '24 18:04 adams85

Chiming in with some additional thoughts, hope you don't mind.

In my opinion this (i.e. utilizing the AST to JS source gen feature of the parser) is not the right way to solve this problem. Neither Esprima, nor Acornima is able to reproduce the exact source text for reasons thoroughly discussed here.

Instead of the source gen way, I'd suggest a simpler one that neither needs more support from the parser side, nor ugly hacks on the runtime side: Jint just needs to keep around the original script/module code that was parsed. Then, when you need the source text of a function or class, you can get that as a slice of the script/module code, based on the Range property of the related AST node.

The tricky part is where to store the original script/module code so it can be accessed wherever needed. At first glance, something like ExecutionContext.ScriptOrModule looks like a good candidate. However, that doesn't seem to work with dynamic code (eval, Function ctor, shadow realms).

What's even worse, let's consider a script like this: eval('function fn() {}'); fn.toString(). Here the direct eval defines a function in the caller's scope. But the source text of fn must be extracted from the evaled code... 🤯

So, I can't really see a better way than using the OnNode(Created) callback to store a reference to the parsed code string into the relevant node's AssociatedData/UserData property. This looks a bit cumbersome though, as that property is already used by Jint for other purposes. But probably, with some gymnastics, you could make it work.

But @lahma might have an even better idea on storing this piece of information.

In fact, the method of intercepting the original text cannot pass the test262 test, because some tests require that the code returned by toString be generated and formatted by AST, and the original text cannot meet this requirement, so even it now passes AST The generated code is not perfect, but it is at least a formatted result.

scgm0 avatar Sep 15 '24 13:09 scgm0

I'm not sure I follow you... So, are you saying that there are some tests which require the original source text of the function to be regenerated/reformatted? Can you show me such a test?

adams85 avatar Sep 15 '24 17:09 adams85

I'm not sure I follow you... So, are you saying that there are some tests which require the original source text of the function to be regenerated/reformatted? Can you show me such a test?

https://github.com/tc39/test262/blob/main/test/built-ins/Function/prototype/toString/AsyncFunction.js

scgm0 avatar Sep 15 '24 23:09 scgm0

Oh ok, I see now what you are getting at.

Apparently, functions not parsed directly but created dynamically via the Function constructor will need special treatment. In that case the runtime (Jint) should generate the source text of the function. According to my tests, this is done by simply concatenating the parameters passed to the Function constructor:

(async () => {
  async function f() {}
  var AsyncFunction = f.constructor;
  var p1 = "a", p2 = "/*a */ b, c";
  var g = /* before */AsyncFunction(p1, p2, "return Promise.resolve(a+b+c) /* d */ //")/* after */; 
  console.log(g.toString());
  console.log(await g(1, 2, 3));
})()

If you want to verify that there is indeed a simple string concatenation under the hood (at least in V8 and Firefox), run this:

(async () => {
  async function f() {}
  var AsyncFunction = f.constructor;
  var p1 = "a //", p2 = "/*a */ b, c";
  var g = /* before */AsyncFunction(p1, p2, "return Promise.resolve(a+b+c) /* d */ //")/* after */; 
  console.log(g.toString());
  console.log(await g(1, 2, 3));
})()

adams85 avatar Sep 16 '24 18:09 adams85

As far as I can tell, a template like this is used to generate the function source text:

`${isPrototypeAsync ? "async " : ""}function anonymous(${[...arguments].slice(0, arguments.length -1).join(",")}
) {
${arguments[arguments.length -1]}
}`

Probably generator functions are also supported, so it may be a bit more complicated than this.

EDIT: looks like this is already implemented here.

adams85 avatar Sep 16 '24 18:09 adams85