
String operations should be documented and recognized as fallible

Open gibson042 opened this issue 3 years ago • 3 comments

cf. https://github.com/tc39/ecma402/pull/625/files#r784355422

ECMAScript String values are bounded to a maximum of 2^53 - 1 code units, but AFAICT no such limitation applies to specification types such as Lists or lexical input elements. As a result, it is possible to create internal values that cannot be represented as strings, but the specification defines no error when encountering situations that require doing just that. Some examples:

  • "x".repeat(2**53)
  • "ü".repeat(2**53 - 1).normalize("NFD")
  • "x".repeat(2**53 - 1).replace("x", "~~")
  • "ß".repeat(2**53 - 1).toLocaleUpperCase("en")
  • Function("//" + "x".repeat(2**53 - 3)).toString()
  • RegExp("x".repeat(2**53 - 1)).toString()
  • "x".repeat(2**53 - 1) + "!"
  • { printf 'let '; head -c 9007199254740992 /dev/zero | tr '\0' 'x'; } | eshost /dev/stdin (evaluation of LexicalBinding : BindingIdentifier invokes StringValue of BindingIdentifier, which calls CodePointsToString on its code points)
  • { head -c 9007199254740992 /dev/zero | tr '\0' 'x'; printf ': 0'; } | eshost /dev/stdin (evaluation of LabelledStatement : LabelIdentifier : LabelledItem invokes StringValue of LabelIdentifier, which calls CodePointsToString on its code points)

It seems like either string-concatenation, CodePointsToString, and other operations that concatenate strings should explicitly check bounds and throw a RangeError when they are exceeded, or some general text in The String Type should document that the length limit always applies.
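The first option could look roughly like the sketch below. It is not spec text: `checkedConcat` and `MAX_STRING_LENGTH` are hypothetical names, and the optional `maxLength` parameter exists only so the failure can be seen without allocating anything large.

```javascript
// Hypothetical helper, not spec text: string concatenation with an explicit
// bounds check that throws a RangeError, as suggested above.
const MAX_STRING_LENGTH = 2 ** 53 - 1; // spec-imposed limit, in code units

function checkedConcat(a, b, maxLength = MAX_STRING_LENGTH) {
  // Compare the combined length against the limit before concatenating.
  if (a.length + b.length > maxLength) {
    throw new RangeError(`string length exceeds ${maxLength} code units`);
  }
  return a + b;
}

// A tiny limit (an assumption for demonstration) makes the check observable.
console.log(checkedConcat("ab", "cd", 4));
try {
  checkedConcat("ab", "cde", 4);
} catch (e) {
  console.log(e.name);
}
```

The same check would presumably have to sit in every operation that builds a string from parts, which is why a single general statement in The String Type may be the simpler alternative.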

The good news is a lack of urgency here, because AFAICT every current implementation is incapable of dealing with anything near the limit.
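This can be spot-checked with a small probe like the one below. `probe` is a hypothetical helper; which error an engine reports, and with what message, is implementation-specific, so no particular output is assumed here.

```javascript
// Hypothetical helper: run a thunk and report the name of whatever it throws,
// or "no error" if it completes normally.
function probe(thunk) {
  try {
    thunk();
    return "no error";
  } catch (e) {
    return e && e.name ? e.name : String(e);
  }
}

// First example from the list above; engines are expected to fail here
// long before reaching the spec-imposed limit.
console.log(probe(() => "x".repeat(2 ** 53)));

// A small repeat count for comparison.
console.log(probe(() => "x".repeat(3)));
```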

gibson042, Jan 13 '22 23:01

Perhaps relevant, https://github.com/tc39/ecma262/pull/641

ljharb, Jan 13 '22 23:01

Yup, that's definitely wrong. Editors share your lack of urgency about this. That said, documenting the fact that these and other operations can throw because of real-world resource limitations probably merits attention before dealing with the spec-imposed limitations.

bakkot, Jan 19 '22 23:01

String-concatenation etc. are invoked both statically and dynamically, so in addition to throwing at 'runtime', you'd probably want early errors (or handwaving in that direction) for tokens that are too long.
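A rough sketch of the static side, under stated assumptions: tokens are modeled as plain strings, `findOverlongTokens` and `assertNoOverlongTokens` are hypothetical names, and reporting the problem as a SyntaxError (as for other early errors) is a guess rather than anything the specification currently says.

```javascript
// Hypothetical static check, not spec text: find tokens whose length in
// code units exceeds the limit, before any evaluation takes place.
function findOverlongTokens(tokens, maxLength) {
  return tokens.filter((token) => token.length > maxLength);
}

function assertNoOverlongTokens(tokens, maxLength = 2 ** 53 - 1) {
  const overlong = findOverlongTokens(tokens, maxLength);
  if (overlong.length > 0) {
    // Assumption: an early error would surface as a SyntaxError.
    throw new SyntaxError(
      `${overlong.length} token(s) exceed ${maxLength} code units`
    );
  }
}

// A tiny limit (an assumption for demonstration) stands in for 2^53 - 1.
try {
  assertNoOverlongTokens(["let", "xxxxxxxxxx", "=", "0"], 8);
} catch (e) {
  console.log(e.name);
}
```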

jmdyck, Sep 15 '25 14:09