ecma262 icon indicating copy to clipboard operation
ecma262 copied to clipboard

Normative: specify web reality for Array.prototype.join with cyclic values

Open ljharb opened this issue 6 years ago • 14 comments

Fixes #289.

This is an attempt, based on my reading of #289, to specify web reality for Array.prototype.join when the array is cyclic.

I'm using a slot on the current realm to keep a "seen" List, which I hope will avoid the reentrancy issues discussed here.

I have no strong opinion about this PR implementation, so if anyone has any better suggestions I am more than happy to accommodate them! Reviews are welcome.

ljharb avatar Apr 26 '19 00:04 ljharb

can this be annex b?

also can [[Seen]] be an internal slot on %Array.prototype.join% or renamed to like [[ArrayJoinSeen]] (preferably the first one)

devsnek avatar Apr 26 '19 01:04 devsnek

Sure, i like the idea of storing it on the function instead of the realm. I’ll update that.

As for annex B, i suppose, but generally that’s not just for every web reality change, but instead for the ones that are undesired in the language. What’s the argument for that?

ljharb avatar Apr 26 '19 01:04 ljharb

I've updated to store the slot on the function itself; i've used the intrinsic notation in #1376 (if this is ready to land before that one, I'll update to add a new intrinsic for join)

ljharb avatar Apr 26 '19 06:04 ljharb

for annex b reasoning, I think it throwing is more reasonable (like JSON.stringify and such) although it was brought up that it's not explicitly set to throw on a circular, but all the implementations would stack overflow if they didn't drop the value to an empty string.

devsnek avatar Apr 26 '19 11:04 devsnek

Are we sure that this Seen Set cannot be used as a global communications channel?

erights avatar Apr 26 '19 15:04 erights

I'd prefer that this not be in Annex B. Separation of parts of the specification into Annex B makes it harder to read, and it doesn't make it any more web-compatible to not ship the semantics.

littledan avatar Apr 26 '19 17:04 littledan

Strong +1 to working on this; thank you @ljharb for taking up this important work.

Also strong +1 to keeping this in the main spec.

The main question here seem to be how well this matches web reality. @ljharb, do you have a test suite perchance, with a comparison of different engines plus this spec?

domenic avatar Apr 26 '19 17:04 domenic

Not quite yet - I wanted to first get a rough consensus on the needed spec changes, before attempting to write test262 tests (and then gathering their results with eshost).

Quite certainly the only way this PR would be feasible is if it matches web reality; if the tests indicate it doesn't, I'll iterate on both until I've got something that does.

ljharb avatar Apr 26 '19 17:04 ljharb

The proposed change doesn't match web-reality (*) and is missing abrupt completion handling. Was there anything wrong with the patch linked in https://github.com/tc39/ecma262/issues/289#issuecomment-172105022 that it can't be used? (Apart from the potential issues mentioned in https://gist.github.com/anba/4a154bd7143bf2bab3ef#gistcomment-1671398. :smile: )

And as mentioned in https://github.com/tc39/ecma262/issues/289#issuecomment-172112177, there are a couple of small differences between all engines when to check for cycles resp. which objects are eligible for the cycle check.

So I'd say the following questions need to be answered first:

  • Is there a single list for Array.prototype.join and Array.prototype.toLocaleString or separate ones? (I think all engines use a single shared list for join and toLocaleString.)
  • When to apply the cycle check (before or after ToString(separator))? (**)
  • Do we want to have a per Realm or per Agent list? (***)

(*) For example it'll print pre,pre,,,post,,post for the following test case, whereas all engines print pre,,,post for that test.

var a = ["pre", null, null, "post"];
a[1] = a;
a[2] = a;
print(a.join());

(**) Prints "ToString(separator)" in V8/Chakra, but nothing in JSC/SpiderMonkey.

Array.prototype.toString = function() {
    return this.join({
        toString() {
            print("ToString(separator)");
            return ",";
        }
    });
};
var a = []; a.push(a); a.join();

(***) "pre1,pre2,pre1,,post1,post2,post1" in Chakra/V8, but "pre1,pre2,,post2,post1" in JSC/SpiderMonkey.

if (typeof newGlobal !== "function") {
    if (typeof createGlobalObject === "function") {
        var newGlobal = createGlobalObject;
    } else if (typeof WScript === "object") {
        var newGlobal = function() {
            return WScript.LoadScript("this", "samethread");
        };
    } else if (typeof Realm === "object") {
        var newGlobal = function() {
            return Realm.global(Realm.createAllowCrossRealmAccess());
        };
    }
}

var g = newGlobal();

var a = [];
var a2 = g.eval("[]");

a.push("pre1", a2, "post1");
a2.push("pre2", a, "post2");

print(g.Array.prototype.join.call(a));

anba avatar Apr 26 '19 18:04 anba

@anba i just missed it :-) do you have interest in taking over this PR? You clearly have much more context than i do.

A shared list seems fine to me; I’d assume per realm and not per agent; we could apply the check either before or after, but if the result will be ignored I’d assume before is preferable.

ljharb avatar Apr 26 '19 18:04 ljharb

This diff should work I think (lightly tested in engine262)

diff --git a/spec.html b/spec.html
index 507111a..3325d1f 100644
--- a/spec.html
+++ b/spec.html
@@ -33879,22 +33879,36 @@ THH:mm:ss.sss
         <p>The `join` method takes one argument, _separator_, and performs the following steps:</p>
         <emu-alg>
           1. Let _O_ be ? ToObject(*this* value).
-          1. Let _len_ be ? LengthOfArrayLike(_O_).
-          1. If _separator_ is *undefined*, let _sep_ be the single-element String *","*.
-          1. Else, let _sep_ be ? ToString(_separator_).
-          1. Let _R_ be the empty String.
-          1. Let _k_ be 0.
-          1. Repeat, while _k_ &lt; _len_,
-            1. If _k_ &gt; 0, set _R_ to the string-concatenation of _R_ and _sep_.
-            1. Let _element_ be ? Get(_O_, ! ToString(_k_)).
-            1. If _element_ is *undefined* or *null*, let _next_ be the empty String; otherwise, let _next_ be ? ToString(_element_).
-            1. Set _R_ to the string-concatenation of _R_ and _next_.
-            1. Set _k_ to _k_ + 1.
+          1. Let _realm_ be the current Realm Record.
+          1. If _O_ is in _realm_.[[ArrayJoinCycleSet]], then
+            1. Return *""*.
+          1. Append _O_ to _realm_.[[ArrayJoinCycleSet]].
+          1. Let _R_ be ArrayJoin(_O_, _separator_).
+          1. Let _last_ be the last element of _realm_.[[ArrayJoinCycleSet]].
+          1. Assert: _last_ is _O_.
+          1. Remove the last element of _realm_.[[ArrayJoinCycleSet]].
           1. Return _R_.
         </emu-alg>
         <emu-note>
           <p>The `join` function is intentionally generic; it does not require that its *this* value be an Array object. Therefore, it can be transferred to other kinds of objects for use as a method.</p>
         </emu-note>
+
+        <emu-clause id="ArrayJoin" aoid="ArrayJoin">
+          <h1>ArrayJoin ( _O_ , _separator_ )</h1>
+          <emu-alg>
+            1. Let _len_ be ? LengthOfArrayLike(_O_).
+            1. If _separator_ is *undefined*, let _sep_ be the single-element String *","*.
+            1. Else, let _sep_ be ? ToString(_separator_).
+            1. Let _R_ be the empty String.
+            1. Let _k_ be 0.
+            1. Repeat, while _k_ &lt; _len_,
+              1. If _k_ &gt; 0, set _R_ to the string-concatenation of _R_ and _sep_.
+              1. Let _element_ be ? Get(_O_, ! ToString(_k_)).
+              1. If _element_ is *undefined* or *null*, let _next_ be the empty String; otherwise, let _next_ be ? ToString(_element_).
+              1. Set _R_ to the string-concatenation of _R_ and _next_.
+              1. Set _k_ to _k_ + 1.
+            1. Return _R_.
+          </emu-alg>
+        </emu-clause>
       </emu-clause>
 
       <emu-clause id="sec-array.prototype.keys">
@@ -34464,18 +34478,13 @@ THH:mm:ss.sss
         <p>The following steps are taken:</p>
         <emu-alg>
           1. Let _array_ be ? ToObject(*this* value).
-          1. Let _len_ be ? LengthOfArrayLike(_array_).
-          1. Let _separator_ be the String value for the list-separator String appropriate for the host environment's current locale (this is derived in an implementation-defined way).
-          1. Let _R_ be the empty String.
-          1. Let _k_ be 0.
-          1. Repeat, while _k_ &lt; _len_,
-            1. If _k_ &gt; 0, then
-              1. Set _R_ to the string-concatenation of _R_ and _separator_.
-            1. Let _nextElement_ be ? Get(_array_, ! ToString(_k_)).
-            1. If _nextElement_ is not *undefined* or *null*, then
-              1. Let _S_ be ? ToString(? Invoke(_nextElement_, *"toLocaleString"*)).
-              1. Set _R_ to the string-concatenation of _R_ and _S_.
-            1. Set _k_ to _k_ + 1.
+          1. Let _realm_ be the current Realm Record.
+          1. If _array_ is in _realm_.[[ArrayJoinCycleSet]], then
+            1. Return *""*.
+          1. Append _array_ to _realm_.[[ArrayJoinCycleSet]].
+          1. Let _R_ be ArrayToLocaleString(_array_).
+          1. Let _last_ be the last element of _realm_.[[ArrayJoinCycleSet]].
+          1. Assert: _last_ is _array_.
+          1. Remove the last element of _realm_.[[ArrayJoinCycleSet]].
           1. Return _R_.
         </emu-alg>
         <emu-note>
@@ -34484,6 +34493,25 @@ THH:mm:ss.sss
         <emu-note>
           <p>The `toLocaleString` function is intentionally generic; it does not require that its *this* value be an Array object. Therefore it can be transferred to other kinds of objects for use as a method.</p>
         </emu-note>
+
+        <emu-clause id="ArrayToLocaleString" aoid="ArrayToLocaleString">
+          <h1>ArrayToLocaleString ( _array_ )</h1>
+          <emu-alg>
+            1. Let _len_ be ? LengthOfArrayLike(_array_).
+            1. Let _separator_ be the String value for the list-separator String appropriate for the host environment's current locale (this is derived in an implementation-defined way).
+            1. Let _R_ be the empty String.
+            1. Let _k_ be 0.
+            1. Repeat, while _k_ &lt; _len_,
+              1. If _k_ &gt; 0, then
+                1. Set _R_ to the string-concatenation of _R_ and _separator_.
+              1. Let _nextElement_ be ? Get(_array_, ! ToString(_k_)).
+              1. If _nextElement_ is not *undefined* or *null*, then
+                1. Let _S_ be ? ToString(? Invoke(_nextElement_, *"toLocaleString"*)).
+                1. Set _R_ to the string-concatenation of _R_ and _S_.
+              1. Set _k_ to _k_ + 1.
+            1. Return _R_.
+          </emu-alg>
+        </emu-clause>
       </emu-clause>
 
       <emu-clause id="sec-array.prototype.tostring">

devsnek avatar Jun 29 '20 17:06 devsnek

How is that diff different from https://github.com/tc39/ecma262/pull/1518#issuecomment-487159471?

anba avatar Jun 29 '20 17:06 anba

@anba i think it's pretty much the same as the diff you posted. Aside from the differences between engines that you mention in your comment above, it seems to match how engines actually behave from my testing (for example, storing the cyclic stack per-realm).

devsnek avatar Jun 29 '20 17:06 devsnek

Instead of adding a slot for a seen list on the intrinsic, it should be on the agent. It cannot be on the intrinsic because that would not catch cycles across realms (but they can obviously exist since objects can be shared across realms).

michaelficarra avatar Dec 20 '23 22:12 michaelficarra