ecmascript_simd icon indicating copy to clipboard operation
ecmascript_simd copied to clipboard

Polyfill load index only accepts Int32 values, spec accepts 0 - 2^53-1

Open stoklund opened this issue 8 years ago • 13 comments

The polyfill requires the index argument to load and store functions be an Int32

function simdLoad(type, tarray, index, count) {
  if (!isTypedArray(tarray))
    throw new TypeError("The 1st argument must be a typed array.");
  if (!isInt32(index))
    throw new TypeError("The 2nd argument must be an Int32.");

But the spec language seems to require that the index is a Number with an integer value in the range 0-2^53-1. This affects which values throw a TypeError and which values throw a RangeError. (And of course the largest indexable TypedArray).

stoklund avatar Feb 28 '16 18:02 stoklund

The spec language is a bit surprising:

  1. If index ≠ ToLength(index), throw a TypeError exception.
  2. Let elementLength be tarray.[[ByteLength]] ÷ tarray.[[ArrayLength]].
  3. Let byteIndex be index × elementLength.
  4. If byteIndex + descriptor.[[ElementSize]] × length > tarray.[[ByteLength]] or byteIndex < 0, throw a RangeError exception.

This language implies that a negative integer index causes a TypeError while a positive out-of-bounds index causes a RangeError, and the check for byteIndex < 0 will never be true. Compare the DataView functions https://tc39.github.io/ecma262/#sec-getviewvalue which throw a RangeError in both cases.

Would it be more consistent to say:

  1. If index ≠ ToInteger(index), throw a TypeError exception.

stoklund avatar Feb 29 '16 22:02 stoklund

Agreed. This seems like a bug in the spec text @littledan

johnmccutchan avatar Feb 29 '16 22:02 johnmccutchan

FWIW, the Shared Memory and Atomics spec does this differently and it might be a good idea for these candidate specs to agree on how to handle bad arguments. Specifically, all the accessors on the Atomics object throw a TypeError if the first (TypedArray) argument is not a TypedArray of an acceptable type, and otherwise throw a RangeError if the index is wrong, spec here for the latter case. Specifically it throws a RangeError even if the index is some gibberish.

What I have in the atomics spec is modeled on existing ES6 machinery, as far as I could make it out, with the whole PropertyKey machinery.

(I'm not claiming to have the right answer, I just note the discrepancy and think it would be good to agree.)

lars-t-hansen avatar Mar 01 '16 16:03 lars-t-hansen

Note that ToNumber() will throw a TypeError if passed a Symbol or SIMD value.

It looks like we have several options:

  1. Require that Type(index) is Number, throw a TypeError if not.
  2. Coerce with ToNumber(index), throw a RangeError if the coerced value is not an integer or out of bounds. This is what DataView and the ArrayBuffer constructor does.
  3. Coerce with ToPropertyKey() and CanonicalNumericIndexString(), throw a RangeError if the coerced value is not an integer or out of bounds. This is what an "Integer Indexed exotic object" and the shared memory spec does.

The two coercion methods produce different exception types when passed a Symbol or SIMD value.

In my opinion, we should not throw a TypeError when Type(index) is Number. A RangeError is more appropriate in that case.

stoklund avatar Mar 02 '16 00:03 stoklund

I think 2) above makes the most sense. Throwing a TypeError when the value cannot be coerced to a Number seems like the right thing to do, and there's precedence for that with DataView.

PeterJensen avatar Mar 02 '16 00:03 PeterJensen

@lars-t-hansen, would 2) above make sense for the shared memory spec? Something like this:

  1. Let numberIndex be ToNumber(requestIndex).
  2. Let getIndex be ToInteger(numberIndex).
  3. If numberIndex ≠ getIndex or getIndex < 0, throw a RangeError exception.

See also https://tc39.github.io/ecma262/#sec-typedarray-length.

stoklund avatar Mar 02 '16 01:03 stoklund

That would probably not make sense, because it allows many strings that are not appropriate indexed property names to be interpreted as property names, which is not what we want. I assume this is why the machinery with CanonicalNumericIndexString was introduced in the first place.

Consider the property name "5e3". ToNumber converts this to 5000, and it passes. But it is not canonical because 5000 does not convert back into "5e3". Ditto names with leading and trailing blanks, and even the empty string, not to mention hex, octal, and infinities. ToNumber is very lenient.

ToNumber turns "-0" correctly into -0, but that should also be handled properly by the range checking, there should be a RangeError.

Are there technical reasons why the SIMD spec could not use the ToPropertyKey / CanonicalNumericIndexedString machinery? In the event of an int32 input there's no overhead there, it's only annoying for exotic inputs, but those would bail out from jitted code anyway. (As far as Firefox is concerned all the machinery is already packaged up and ready to use, see eg https://dxr.mozilla.org/mozilla-central/source/js/src/builtin/AtomicsObject.cpp#109.)

Despite asserting earlier that I am not claiming to have the right answer, I'm finding myself drifting toward the position that I have the right answer :) At least I think that predictably distinguishing strings that could be interpreted as numbers or non-numbers is a hard requirement; ES has always done that properly.

lars-t-hansen avatar Mar 02 '16 11:03 lars-t-hansen

One more note: in case somebody missed it because of the strange name, integer indexed exotic objects (IIEOs) aren't actually all that exotic. Instances of Array are not IIEOs but they use the same definition as CanonicalNumericIndexedString to recognize numeric indexed values, the prose around this is just poorly written (https://tc39.github.io/ecma262/#sec-object-type, third paragraph, note the link). TypedArrays are IIEOs, of course.

I think that what the shared memory spec does now is just standard index laundering -- at least it aims to be standard. I think the SIMD spec should probably follow suit.

lars-t-hansen avatar Mar 02 '16 11:03 lars-t-hansen

Thanks, @lars-t-hansen

I agree with your performance argument. No matter what we do, integer indexes will be fast, and everything else will bail out to the slow path. It doesn't matter performance-wise if the slow path throws unconditionally, or if it goes through various string conversions.

We should do what makes sense for the JavaScript language. @littledan, @jswalden, any opinions?

stoklund avatar Mar 02 '16 16:03 stoklund

I think we should be consistent with the rest of the ECMAScript spec here. For property accesses, like TypedArrays have, that means that things are converted to strings, and 5e3 would not get property 5000. For functions, that means calling ToNumber. The ECMAScript spec mentions this strict checking against non-integers, but I'm not sure if browsers implement that, so it's a wild card, but there definitely aren't many places in the spec that expect Numbers and throw otherwise. This came up in a couple TC39 meetings where it was discussed and agreed on that we need to cast to Numbers and we cannot throw if passed a String.

So this does look like a bug in the spec text. Option 2 looks like the right fix. The SharedArrayBuffer spec should probably follow suit in casting to Number through one conversion function or another

littledan avatar Mar 02 '16 21:03 littledan

cc @binji

My objection is that ta[n] and Atomics.load(ta, n) will treat n differently, which is a real shame, especially since all sorts of confused representations of n can be used in the latter case; after all the intent of the latter is not to be a function call but to be the application of a built-in operator using function-call syntax.

I'll look at the ES spec some more -- I'm sympathetic to the distinction that is being made here -- and get back to you this week. I don't think making a change is going to be a big problem for the shmem spec or the existing implementations, so it's not a practical consideration whether to agree or oppose.

lars-t-hansen avatar Mar 03 '16 09:03 lars-t-hansen

OK, considering what the DataView operations do, this seems like a reasonable change. I'll file a bug against the shared memory spec.

lars-t-hansen avatar Mar 03 '16 17:03 lars-t-hansen

My heart is with requiring Type(index) be Number, but I suppose that no-coercion ship sailed twenty years ago. :-( Consistency with DataView et al. seems best to me. Consistency with ToPropertyKey seems inapposite. The arguments to this function are not property keys; they should not be expected to work that way.

jswalden avatar Mar 05 '16 00:03 jswalden