proposal-record-tuple
Status of value semantics?
According to @littledan on JS Party Episode 305, value semantics for records and tuples are being reconsidered by the committee. Could you expand a little more on the concerns here?
Hi @AprilArcus, I'm hoping something more complete can be written up on the status of this soon.
In lieu of that, a short summary is that we have received feedback from multiple major JS engine implementation teams that their investigations around === equality for R&T suggest this will be more complicated than we initially hoped, and even if implemented there is doubt that the performance of the operation would match developer expectations. Additionally, due to the way most JS engines are architected, it would add an extra load instruction to existing usages of ===, so existing apps that haven't adopted R&T could still be negatively impacted.
This was not the only feedback, but it is perhaps the most significant factor in the proposal now looking at other potential designs for how equality could work.
I'll be sure to post an update here when there is a link to more information.
@acutmore thanks for the update! does this preclude different R&T instances with equivalent values being SameValue equals? my immediate expectation would be that if strict equality isn’t possible, that they could be compared using Object.is.
What about using these types as keys in Map and Set? Using a tuple such as [x, y] as a key currently is a bit awkward since you have to convert it to a primitive first. Being able to represent composite keys more easily was a great benefit of this proposal.
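For example (a rough sketch of the workaround; the string encoding is ad hoc, just for illustration):
// Map compares object keys by identity, so an array key only matches the
// exact same array object:
const m = new Map();
m.set([1, 2], "treasure");
m.get([1, 2]); // undefined (a different array with the same contents)

// Common workaround: convert the tuple to a primitive first.
const key = (x, y) => `${x},${y}`;
m.set(key(1, 2), "treasure");
m.get(key(1, 2)); // "treasure"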
does this preclude different R&T instances with equivalent values being SameValue equals? my immediate expectation would be that if strict equality isn’t possible, that they could be compared using Object.is.
@acusti most likely it does preclude that. SameValue and Object.is are very exact comparisons. Ignoring NaN canonicalisation, when two values are considered "same" then there is no way to tell them apart. They are literally the same value from the language's perspective.
That said, there could be a new API, Something.equals, that would be a new form of equality that includes checking R&T.
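To illustrate the rough shape such an API could take (a sketch only; the name Something.equals above is a placeholder, and the stand-in helpers below are hypothetical, not proposed APIs):
// Sketch: structural equality layered on top of SameValue for everything else.
// Since R&T literals don't exist yet, plain arrays/objects stand in for
// tuples/records here.
const isTuple = (v) => Array.isArray(v);
const isRecord = (v) => typeof v === "object" && v !== null && !Array.isArray(v);

function structuralEquals(a, b) {
  if (Object.is(a, b)) return true; // same value (or same reference)
  if (isTuple(a) && isTuple(b)) {
    return a.length === b.length && a.every((v, i) => structuralEquals(v, b[i]));
  }
  if (isRecord(a) && isRecord(b)) {
    const ka = Object.keys(a);
    const kb = Object.keys(b);
    return ka.length === kb.length && ka.every((k) => kb.includes(k) && structuralEquals(a[k], b[k]));
  }
  return false; // everything else keeps its normal reference semantics
}

structuralEquals([1, [2, 3]], [1, [2, 3]]); // true
[1, [2, 3]] === [1, [2, 3]]; // false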
@demurgos, similar to above. It's likely that by default R&T would behave the same as any other object in Map and Set, but there could be either new collection types or a flag passed to the constructor which provides a type of Map or Set that would allow R&T to be used as a more descriptive key.
Thank you for your reply. It feels like R&T will end up being syntax sugar for Object.freeze (and no prototype). If so, it will still be useful, but it's disappointing that perf concerns caused such a scope reduction :frowning_face:
@demurgos the R&T champions and other delegates are still very much interested in supporting the use cases that R&T would solve. For example, composite keys as you mention is definitely something I would like to be better supported natively as it's extremely painful to build in userland. However for various compatibility and performance reasons it likely will require some kind of explicit opt-in.
If === is too performance-sensitive to be extended to cover a special case for Records & Tuples, could a useful set of semantics be reclaimed from the already slower and more complicated == operator? Or could a novel operator be introduced, such as =#=?
Hi @AprilArcus, are there particular situations where you would find having an operator useful over having to use a function call?
Adding a new overload to == is, I feel, also unlikely. While it does have slow paths, comparing two objects is not one of them. There is also a general theme in the ecosystem to prefer === over == (though == null is sometimes seen as an OK exception), so adding new functionality to it might muddy the water in terms of advice on which to use.
Adding a new operator, like any new syntax, tends to need a stronger justification compared to adding new API. This is likely because the remaining design space is somewhat limited (only so many non-letter ASCII characters to choose from), and because it can be harder to look up documentation for syntax (IDEs currently tend to expose docs for APIs but not syntax). A new operator could also be researched as a separate follow-on proposal.
If the problem (interning) already exists with strings (does it?), why is it not acceptable for R&T? Shouldn't the problem be addressed in the same way? I.e. instead of introducing e.g. an RnT.sometimesSlowCompare(), have === be the sometimes-slow one -- like it already is for strings -- and introduce a constantTimeCompare() function for interned strings & R&T?
I'm sure there's something(s) I'm missing though, because I didn't follow the bit about why that would "add an extra load instruction to existing usages of ===".
I'd rather not have R&T at all than have them without value comparison semantics.
Hi @gregmartyn, I'll try and respond as best I can!
If the problem (interning) already exists with strings,
You are correct that comparing strings with === has a similar risk of also being a 'slow' (non-constant-time) comparison. The concerns are a mix of:
- strings already having this risk does not necessarily make it OK to increase the number of types with this risk
- applying the same string lazy-interning optimisations to R&T makes it much harder for R&T to re-use the existing optimisations that exist for Objects and Arrays, as the memory layouts won't match.
I didn't follow the bit about why that would "add an extra load instruction to existing usages of ===".
This does not apply to all JS engines; it depends on the information that comes with their 'tagged pointers'. In some engines the pointer itself encodes that it points to a JS object or a JS string. This means that the instructions for === have a fast path where, if the two pointers point to JS objects, they can immediately return the comparison of the two addresses. If R&T are built re-using most of the existing mechanisms for JS objects, so they can benefit from the existing optimisations and GC, then whenever two objects are compared with === the instructions lose that fast path and need to follow the two addresses to further inspect whether the two objects are regular objects or R/T, to know if they can be compared by address or by content. This extra load instruction, while small on its own, could add observable overhead when used in a hot path of an application.
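A small illustration of the point (the #[...] line is the proposal's syntax, shown as a comment since it is not implemented anywhere):
// Today two distinct heap objects are never ===, so for the object/object case
// an engine can answer by comparing the two tagged pointers, without loading
// either object from memory:
console.log({} === {}); // false: different addresses
const o = { x: 1 };
console.log(o === o); // true: same address

// Under the original value-semantics design that shortcut is no longer valid:
//   #[1, 2] === #[1, 2] // would be true despite different addresses
// so the engine must first load each operand to learn whether it is a
// Record/Tuple (compare by content) or an ordinary object (compare by address),
// and that extra load also lands on the existing object/object comparisons.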
What about using these types as keys in Map and Set? Using a tuple such as [x, y] as a key currently is a bit awkward since you have to convert it to a primitive first. Being able to represent composite keys more easily was a great benefit of this proposal.
I'd rather not have R&T at all than have them without value comparison semantics.
@demurgos
Myself, I'd still prefer this to go forward, even without them.
fwiw, even though I agree it'd be ideal to have strict equality, a Map[Symbol.keyEquality] would more or less address this issue. Less than ideal since it moves the problem to userland and will probably have worse performance, but it could work. Or a new, native RecordTupleMap that implements a new key equality algorithm, provided by js engines, which would be more performant.
None of these options are ideal, but moving forward with R&Ts without strict equality doesn't close these doors, and decoupling it from this requirement seems to be an acceptable path forward.
A future, separate proposal could address === comparing by value rather than reference for records and tuples, which may have much less friction if records and tuples are already implemented in engines, or at least the proposal is in stage 3 or 4.
Also, I'm not sure changing === would affect Same-value-zero equality — isn't that an internal engine-specific algorithm that is not necessarily using === internally? If so, even if this proposal did change how this operator works, that might still not affect how Maps and Sets compare keys. These problems might be decoupled from each other — proposing that Same-value-zero equality compares records and tuples by value, regardless of what ===, Object.is, etc. do, might be a more addressable proposal.
A future proposal would never ever be capable of changing how any operator behaves with a specific kind of object. Once R&T ships as a non-primitive, that is forever an unavailable path for them.
Myself, I'd still prefer this to go forward, even without them.
If we take a step back, what is even the goal that this proposal is trying to achieve? Is it a new feature or new syntax? In a reply above I was disappointed in the scope reduction but felt the sugar may still be useful. Since then I had some time to think more about it, and I don't want to give up the value semantics. Please keep them, they are the core of this proposal.
The first paragraphs of the README are (emphasis mine):
Records and Tuples can only contain primitives and other Records and Tuples. You could think of Records and Tuples as "compound primitives". By being thoroughly based on primitives, not objects, Records and Tuples are deeply immutable.
Records and Tuples support comfortable idioms for construction, manipulation and use, similar to working with objects and Arrays. They are compared deeply by their contents, rather than by their identity.
The expectation from the start was that the goal was to get compound primitives, aka value semantics. These are an incredibly powerful feature because they fully integrate with regular operators and methods. ===, Object.is, Array#indexOf, etc. just work, and it avoids round-tripping to a string serialization. Compound values become regular values with R&T.
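Concretely, under the semantics the README originally described (shown as comments, since none of this is implemented):
// #[1, 2] === #[1, 2]                // true: compared by contents
// Object.is(#{ a: 1 }, #{ a: 1 })    // true
// [#[1, 2]].indexOf(#[1, 2])         // 0, because indexOf uses ===
// new Set([#[1, 2]]).has(#[1, 2])    // true, because Set keys use SameValueZero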
I don't want to be too harsh on all the people working on this who want to drop value semantics, but it's also one of the most important ongoing proposals for JS. Syntax sugar can be great; for example, the class keyword did not unlock new features initially but made an awful lot of patterns more convenient. Without value semantics, #[] and #{} could be an Object.deepFreeze method, saving a few keystrokes. But is it really what the proposal intended to achieve?
I understand that compound primitives are harder to support, but it may lead to a way better language in the long run. The JS module system is one of the best of any language I ever used, it required huge amounts of effort but the result was well worth it. Native async support was also great and deeply transformed JS for the better. R&T with value semantics is one of those landmark proposals that should be fully realized IMO. JS does not have stuff like operator overloading, so R&T is the best shot to get first class language support for composite values without stringifying the world. Compound primitives are also a solid foundation to expand on in the future; so it's better to take a bit more time and have the best version of them that we can.
The most common case I run into where I feel like records and tuples would be a godsend is when I am trying to use structured data as a map key, without having to encode it as string or resorting to any other hacks. However, this doesn't work at all if records / tuples are just frozen objects instead of primitives. Although I appreciate the technical difficulty of adding them, I feel like it isn't worth it without this key feature.
That use case - being able to control the "key" of Maps and Sets - is valuable to solve, and doesn't need R&T to do so.
That use case - being able to control the "key" of Maps and Sets - is valuable to solve, and doesn't need R&T to do so.
Whilst you are right that R&T is not needed for this use case, it feels natural to use an immutable, compared-by-value data structure for a Map key, doesn't it? It is simply my opinion that the real value of R&T isn't so much the immutability as the possibilities that are opened up by a complex data primitive.
Immutability is a requirement for any stable keying (it's also the reason the value must not be key-comparable before being stable). What is not automatically needed is for the composite key value to expose the values composing it.
The original R&T proposal solved this by making values born deeply immutable, which made it itself usable as a composite key. What the champions are now exploring is the problem space if the immutable R/T and the composite key are not the same thing.
Having key semantics different from equality leads to many inconsistencies. It's possible to build many examples, but they all boil down to expecting that a Map/Set having a key implies that there exists an item equal to the key. In particular, I expect all three functions below to be equivalent*.
function setHas1(s: Set<unknown>, x: unknown): boolean {
  return s.has(x);
}

function setHas2(s: Set<unknown>, x: unknown): boolean {
  for (const v of s) {
    if (v === x) {
      return true;
    }
  }
  return false;
}

function setHas3(s: Set<unknown>, x: unknown): boolean {
  return [...s].indexOf(x) >= 0;
}
The strength of value semantics is that it's not a special case for Map and Set, but that it fully integrates with the language at all levels.
*: Yes, I'm aware of NaN; it does not mean that we should add more edge cases.
@demurgos that's not an assumption that holds for subclasses, so it's a false assumption. Only the first is valid.
boil down to expecting that a Map/Set having a key implies that there exists an item equal to the key
I don't think anyone has suggested changing the default key semantics of Map/Set. However one line being explored is that Map/Set could be configured at construction with a keying function to derive the keying value from the provided "key" (This is roughly equivalent to an alternative Map/Set implementation). The keying value would have normal equality / SameValue semantics.
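To make that concrete, a rough userland approximation of the idea (every name here, including the keyBy parameter, is purely illustrative; nothing is a settled design):
// Sketch: a Map-like wrapper that derives a keying value from each provided key.
// Entries are stored under the derived key, which the backing Map compares with
// its normal SameValueZero semantics.
class KeyedMap {
  #keyBy;
  #entries = new Map(); // derived key -> { key, value }
  constructor(keyBy) {
    this.#keyBy = keyBy;
  }
  set(key, value) {
    this.#entries.set(this.#keyBy(key), { key, value });
    return this;
  }
  get(key) {
    return this.#entries.get(this.#keyBy(key))?.value;
  }
  has(key) {
    return this.#entries.has(this.#keyBy(key));
  }
}

const byPoint = new KeyedMap(([x, y]) => `${x},${y}`);
byPoint.set([1, 2], "treasure");
byPoint.get([1, 2]); // "treasure": equal contents produce the same derived key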
I expect all three functions below to be equivalent*.
I don't believe a program can assume the behavior of dynamically invoked methods on parameters. Anyone today can already provide a custom map/set/array instance with overridden methods.
The program can only trust the behavior of instances it itself creates (modulo prototype pollution, which is an orthogonal concern). Since no one is proposing to change the default behavior of existing Map/Set, I don't believe there is a problem.
I understand that JS is dynamic by nature, everything can be patched, etc. But I would still argue that any subclass or method override where iteration disagrees with membership no longer provides a consistent Set interface.
JS already has multiple ways to check for equality (Object.is, ==, ===), so I get why parametrizing a map/set so it uses a different behavior is a reasonable idea. It's actually a good idea, but I don't think it's the best for R&T. I'll wait to see what this exploration will lead to.
The keying function idea seems a bit awkward to me, because it's possible that the keying function will return different values for the same original key over time (eg, k => JSON.stringify(k) will vary if someone modifies the original k object, or an object referred to by it, etc).
This means that either the mapped key needs to be stored indefinitely in the Map, or else the Map would be able to get into an undefined state due to inconsistencies in derived key hashing (eg, if the implementation uses a hash map or similar) or key ordering (eg, if the implementation uses a tree map or similar).
const map = new Map(k => JSON.stringify(k));
const k = { foo: 42 };
map.set(k, "hello");
k.foo++;
map.set(k, "world");
Presumably the Map above now contains two entries with the same original key value k, but if it's backed by a hash map and the mapped values ("{\"foo\":42}", "{\"foo\":43}") are not stored, there's a chance that both set operations hash to the same bucket so it only ends up with one entry, thus exposing the behaviour of the underlying hashing/bucketing.
I can imagine there might be some use for it, but I think if it becomes the main way to implement "compound keys", it could be a big footgun.
Having a separate constructor for compound keys just seems more sound, but it does indeed just sound like a less useful version of tuples (since they're basically just tuples that can't be read).
The keying function idea seems a bit awkward to me, because it's possible that the keying function will return different values for the same original key over time (eg, k => JSON.stringify(k) will vary if someone modifies the original k object, or an object referred to by it, etc).
It's the responsibility of the creator of the Map/Set to use a keying function that's stable. Relying on the data in the key itself with the key being mutable is obviously not compatible with stability, hence why I mentioned that immutability is a requirement for any stable keying
the mapped key needs to be stored indefinitely in the Map
Yes, the Map implementation would not rely on the keying function for existing entries, not just because of stability, but because otherwise it would expose the internal implementation of the Map to the keying function.
Having a separate constructor for compound keys just seems more sound
Right, a composite key is a useful building block for custom collection keying: the keyBy option can just return a composite key. I personally believe it actually should be the only accepted return value of keyBy, to avoid potential footguns like you describe.
I personally believe it actually should be the only accepted return value of keyBy, to avoid potential footguns like you describe.
How does that avoid the footgun? Presumably k => Map.key(k.foo) isn't any more stable than k => JSON.stringify(k) in the example I gave. Even if it returns an object, I would expect the underlying equality (and hash) operation to continue to be stable (eg, SameValue or SameValueZero isn't going to give different results for the same pair of values, even if the values are mutable objects).
My expectation is that a keying function would only ever be called once for any given value, and the result memoized.
While a dedicated key-creating API does not prevent mutated objects being unstable sources of key generation, it does still have benefits compared to JSON.stringify - it could help avoid worrying about key order (JSON.stringify({ a: 1, b: 2 }) !== JSON.stringify({ b: 2, a: 1 })), and support more values than JSON (JSON.stringify([1n]) // throws).
While the primitive, value-semantics R&T design guarantees stability, the same indirection issues can arise because R&T could only store other primitives. Indirection (via symbols in WeakMaps) was required to store non-primitive values (e.g. Temporal.ZonedDateTime), so just as someone can provide a non-stable keyBy function to a Map or Set, someone could have a non-stable lookup for the objects referenced by their R&T.
So to me the question is less "is it possible to create an odd scenario with the API?" and more "do the APIs encourage good patterns by default, and do less safe usages of the API clearly stand out when reading the code?"
My expectation is that a keying function would only ever be called once for any given value, and the result memoized.
If that's taken literally, it means it would need to remember multiple original key values, eg:
const map = new Map(k => {
  console.log("called");
  return JSON.stringify(k);
});
const k0 = { foo: 42 };
const k1 = { foo: 42 };
map.set(k0, "hello"); // "called"
map.set(k1, "world"); // "called"
map.get(k0); // ??
map.get(k1); // ??
Either both distinct key objects (k0 and k1) would need to be stored in the map, or at least one of the get operations will need to call the keying function again.
The former sounds very weird to me, since repeated sets of the "same" key will grow the map. Perhaps the implementation can mitigate this by having an implicit weak map associated (so temporary objects at least are not maintained), but this seems like a lot of undesirable overhead.
yes, i'd expect k0 and k1 to both be stored somewhere, mapping to the json stringification of the key. Object keys would indeed need to be held weakly in the "somewhere".
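a rough sketch of what that "somewhere" could look like in userland (purely illustrative):
// Memoize the keying function per object key in a WeakMap, so it runs at most
// once per object and object keys are held weakly.
function memoizedKeyBy(keyBy) {
  const cache = new WeakMap();
  return (key) => {
    const isObjectLike = (typeof key === "object" && key !== null) || typeof key === "function";
    if (!isObjectLike) return keyBy(key); // primitives: nothing to memoize against
    if (!cache.has(key)) cache.set(key, keyBy(key));
    return cache.get(key);
  };
}

const toKey = memoizedKeyBy((k) => JSON.stringify(k));
const k2 = { foo: 42 };
toKey(k2); // keying function runs once
k2.foo++;
toKey(k2); // memoized: still the original derived key, despite the mutation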
I'm just a JS developer, and not part of the standards body, so this is just my two cents, but my preference would be to either retain the value semantics or abandon the entire proposal.
Records and tuples with value semantics would be transformative. As I'm sure everyone posting here already knows, many popular libraries and frameworks in the JavaScript ecosystem (React, Redux, RxJS, etc) are "reactive" in the sense that they involve computing the difference between a new state and a prior state and taking some action based on the changes. Any programmer who works professionally with these libraries always needs to be aware of the reference semantics of objects and arrays; carelessly replacing one object or array with another, identical object or array can trigger a large amount of unnecessary downstream work or even an infinite loop. Records and tuples with value semantics would make it much easier to write correct, performant code in these situations.
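A tiny sketch of the kind of guard I mean (generic, not tied to any particular library; the value-semantics lines are comments because that syntax is not implemented):
// A typical "did my input change?" check used throughout reactive code:
let previous;
function onState(next) {
  if (next === previous) return; // skip downstream work
  previous = next;
  console.log("recomputing for", next); // stand-in for expensive downstream work
}

onState({ x: 1, y: 2 }); // recomputes
onState({ x: 1, y: 2 }); // recomputes again: a new object with identical contents

// Under the original value-semantics design the same guard would just work:
//   onState(#{ x: 1, y: 2 }); // recomputes
//   onState(#{ x: 1, y: 2 }); // skipped, because #{ x: 1, y: 2 } === #{ x: 1, y: 2 }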
Record and tuples without value semantics, by contrast, don't really help with this problem at all. It doesn't seem to me that they would offer much value in terms of solving my day-to-day problems programming in the JS ecosystem. Since there's a limited amount of nice syntax available in the language, if value semantics are not on offer, my preference would be to save the syntax for something more impactful.