
Status of value semantics?

Open AprilArcus opened this issue 1 year ago • 57 comments

According to @littledan on JS Party Episode 305, value semantics for records and tuples are being reconsidered by the committee. Could you expand a little more on the concerns here?

AprilArcus avatar Dec 24 '23 01:12 AprilArcus

Hi @AprilArcus, I'm hoping something more complete can be written up on the status of this soon.

In lieu of that, a short summary is that we have received feedback from multiple major JS engine implementation teams that their investigations into === equality for R&T suggest it will be more complicated to implement than we initially hoped, and that even if it were implemented there is doubt that the performance of the operation would match developer expectations. Additionally, due to the way most JS engines are architected, it would add an extra load instruction to existing usages of ===, so existing apps that haven't adopted R&T could still be negatively impacted.

This was not the only feedback but is maybe the most significant in terms of the proposal now looking at other potential designs for how the equality could work.

I'll be sure to post an update here when there is a link to more information.

acutmore avatar Jan 08 '24 18:01 acutmore

@acutmore thanks for the update! does this preclude different R&T instances with equivalent values being SameValue-equal? my immediate expectation would be that if strict equality isn't possible, they could be compared using Object.is.

acusti avatar Jan 11 '24 06:01 acusti

What about using these types as keys in Map and Set? Using a tuple such as [x, y] as a key currently is a bit awkward since you have to convert it to a primitive first. Being able to represent composite keys more easily was a great benefit of this proposal.

demurgos avatar Jan 15 '24 16:01 demurgos

does this preclude different R&T instances with equivalent values being SameValue-equal? my immediate expectation would be that if strict equality isn't possible, they could be compared using Object.is.

@acusti most likely it does preclude that. SameValue and Object.is are very exact comparisons. Ignoring NaN canonicalisation, when two values are considered "same" then there is no way to tell them apart. They are literally the same value from the language's perspective.
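
For illustration, a quick sketch of how exact those comparisons already are today:

Object.is({}, {});      // false - two distinct objects, even with identical contents
Object.is(NaN, NaN);    // true  - the NaN exception, unlike NaN === NaN
Object.is(0, -0);       // false - SameValue even distinguishes the two zeros

If two separately-created R&T values were Object.is-equal, they would have to be indistinguishable in every other way as well - i.e. they would be one value.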

That said, there could be a new API, Something.equals, providing a new form of equality that includes checking R&T.

@demurgos, similar to above. It's likely that by default R&T would behave the same as any other object in Map and Set, but there could be either new collection types, or a flag passed to the constructor, providing a kind of Map or Set in which R&T can be used as a more descriptive key.

acutmore avatar Jan 15 '24 18:01 acutmore

Thank you for your reply. It feels like R&T will end up being syntax sugar for Object.freeze (and no prototype). If so, it will still be useful but it's disappointing that perf concerns caused such a scope reduction :frowning_face:

demurgos avatar Jan 17 '24 01:01 demurgos

@demurgos the R&T champions and other delegates are still very much interested in supporting the use cases that R&T would solve. For example, composite keys as you mention is definitely something I would like to be better supported natively as it's extremely painful to build in userland. However for various compatibility and performance reasons it likely will require some kind of explicit opt-in.
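
To illustrate the userland pain, here is a minimal composite-key sketch (the compositeKey name is hypothetical; a correct version also needs weak references so entries can be collected):

const keyTrie = new Map();

function compositeKey(...parts) {
  // Walk a trie of Maps, one level per part, to find a node unique
  // to this exact sequence of parts.
  let node = keyTrie;
  for (const part of parts) {
    if (!node.has(part)) node.set(part, new Map());
    node = node.get(part);
  }
  // One stable, frozen object per distinct sequence of parts.
  if (!node.leaf) node.leaf = Object.freeze({});
  return node.leaf;
}

compositeKey(1, 2) === compositeKey(1, 2); // true - usable as a Map/Set key

The catch: every part is retained strongly forever, so this leaks; doing it correctly (WeakMaps for object parts, finalization) is where it gets painful.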

mhofman avatar Jan 17 '24 03:01 mhofman

If === is too performance-sensitive to be extended to cover a special case for Records & Tuples, could a useful set of semantics be reclaimed from the already slower and more complicated == operator? Or could a novel operator be introduced such as =#=?

AprilArcus avatar Jan 17 '24 05:01 AprilArcus

Hi @AprilArcus, are there particular situations where you would find having an operator useful over having to use a function call?

Adding a new overload to == is, I feel, also unlikely. While it does have slow paths, comparing two objects is not one of them. There is also a general tendency in the ecosystem to prefer === over == (though == null is sometimes seen as an acceptable exception), so adding new functionality to == might muddy the waters in terms of advice on which to use.

Adding a new operator, like any new syntax, tends to need a stronger justification than adding a new API. This is likely because the remaining design space is somewhat limited (there are only so many non-letter ASCII characters to choose from), and documentation for syntax can be harder to look up (IDEs currently tend to expose docs for APIs but not for syntax). A new operator could also be researched as a separate follow-on proposal.

acutmore avatar Jan 17 '24 09:01 acutmore

If the problem (interning) already exists with strings, (does it?) why is it not acceptable for R&T? Shouldn't the problem be addressed in the same way? I.e. instead of introducing an e.g. RnT.sometimesSlowCompare(), have === be the sometimes-slow one -- like it already is for strings -- and introduce a constantTimeCompare() function for interned strings & R&T?

I'm sure there's something(s) I'm missing though because I didn't follow the bit about why that would "add an extra load instruction to existing usages of ===".

gregmartyn avatar Jan 17 '24 20:01 gregmartyn

I'd rather not have R&T at all than have them without value comparison semantics.

errorx666 avatar Jan 18 '24 15:01 errorx666

Hi @gregmartyn, I'll try and respond as best I can!

If the problem (interning) already exists with strings,

You are correct that comparing strings with === carries a similar risk of being a 'slow' (non-constant-time) comparison. The concerns are a mix of:

  • strings already having this risk does not necessarily make it OK to increase the number of types with this risk
  • applying the same string lazy-interning optimisations to R&T makes it much harder for R&T to re-use the existing optimisations for Objects and Arrays - as the memory layouts won't match.

I didn't follow the bit about why that would "add an extra load instruction to existing usages of ===".

This does not apply to all JS engines; it depends on the information that comes with their 'tagged pointers'. In some engines the pointer itself encodes whether it points to a JS object or a JS string. This gives the instructions for === a fast path: if the two pointers both point to JS objects, they can immediately return the comparison of the two addresses. If R&T are built re-using most of the existing mechanisms for JS objects, so they can benefit from the existing optimisations and GC, then whenever two objects are compared with === the instructions lose that fast path: they need to follow the two pointers and inspect whether the objects are regular objects or R/T, to know whether to compare by address or by content. This extra load instruction, while small on its own, could add observable overhead in a hot path of an application.
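
As a toy model of that dispatch change (plain JS standing in for engine internals; isRecordOrTuple and contentEqual are stand-ins for the per-object type load and the recursive comparison):

// Today (conceptually): two object-tagged pointers compare by address alone,
// with no need to dereference either pointer.
function strictEqualObjectsToday(a, b) {
  return a === b;
}

// With value-semantics R&T sharing object memory layouts, the type check
// requires loading from each object before the addresses can be compared.
function strictEqualObjectsWithRT(a, b, isRecordOrTuple, contentEqual) {
  if (isRecordOrTuple(a) && isRecordOrTuple(b)) {
    return contentEqual(a, b); // potentially O(size of contents)
  }
  return a === b; // regular objects: address comparison, as before
}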

acutmore avatar Jan 22 '24 15:01 acutmore

What about using these types as keys in Map and Set? Using a tuple such as [x, y] as a key currently is a bit awkward since you have to convert it to a primitive first. Being able to represent composite keys more easily was a great benefit of this proposal.

I'd rather not have R&T at all than have them without value comparison semantics.

@demurgos

Myself, I'd still prefer this to go forward, even without them.

fwiw, even though I agree it'd be ideal to have strict equality, a Map[Symbol.keyEquality] would more or less address this issue. Less than ideal since it moves the problem to userland and will probably perform worse, but it could work. Or a new, native RecordTupleMap that implements a new key-equality algorithm, provided by JS engines, which would be more performant.

None of these options are ideal, but moving forward with R&Ts without strict equality doesn't close these doors, and decoupling it from this requirement seems to be an acceptable path forward.

A future, separate proposal could address === comparing by value rather than reference for records and tuples, which may have much less friction if records and tuples are already implemented in engines, or at least the proposal is in stage 3 or 4.

Also, I'm not sure changing === would affect SameValueZero equality — isn't that an internal spec algorithm that doesn't necessarily use === internally? If so, even if this proposal did change how the operator works, that might still not affect how Maps and Sets compare keys. These problems might be decoupled from each other — proposing that SameValueZero equality compare records and tuples by value, regardless of what ===, Object.is, etc. do, might be a more tractable proposal.
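
For reference, SameValueZero - the algorithm Map and Set keying is specified in terms of - already diverges from both === and Object.is today:

NaN === NaN;             // false
new Set([NaN]).has(NaN); // true  - SameValueZero treats NaN as equal to NaN
Object.is(0, -0);        // false
new Set([0]).has(-0);    // true  - SameValueZero treats +0 and -0 as equal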

lautarodragan avatar Feb 02 '24 15:02 lautarodragan

A future proposal would never ever be capable of changing how any operator behaves with a specific kind of object. Once R&T ships as a non-primitive, that is forever an unavailable path for them.

ljharb avatar Feb 02 '24 16:02 ljharb

Myself, I'd still prefer this to go forward, even without them.

If we take a step back, what is even the goal that this proposal is trying to achieve? Is it a new feature or new syntax? In a reply above I was disappointed in the scope reduction but felt the sugar may still be useful. Since then I had some time to think more about it, and I don't want to give up the value semantics. Please keep them, they are the core of this proposal.

The first paragraphs of the README are (emphasis mine):

Records and Tuples can only contain primitives and other Records and Tuples. You could think of Records and Tuples as "compound primitives". By being thoroughly based on primitives, not objects, Records and Tuples are deeply immutable.

Records and Tuples support comfortable idioms for construction, manipulation and use, similar to working with objects and Arrays. They are compared deeply by their contents, rather than by their identity.

The expectation from the start was that the goal was to get compound primitives, aka value semantics. These are an incredibly powerful feature because they fully integrate with regular operators and methods: ===, Object.is, Array#indexOf, etc. just work, and they avoid round-tripping through a string serialization. Compound values become regular values with R&T.
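
Concretely, under the original value-semantics design (the first two lines are README examples; the last extends them to Array#indexOf — no engine runs this today):

assert(#[1, 2, 3] === #[1, 2, 3]);        // tuples compared by content
assert(#{ a: 1 } === #{ a: 1 });          // records too
assert([#[1, 2]].indexOf(#[1, 2]) === 0); // so Array#indexOf just works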

I don't want to be too harsh on all the people working on this who want to drop value semantics, but it's also one of the most important ongoing proposals for JS. Syntax sugar can be great; for example, the class keyword did not unlock new features initially, but it made an awful lot of patterns more convenient. Without value semantics, #[] and #{} could just be an Object.deepFreeze method, saving a few keystrokes. But is that really what the proposal intended to achieve?

I understand that compound primitives are harder to support, but they may lead to a much better language in the long run. The JS module system is one of the best of any language I ever used; it required huge amounts of effort, but the result was well worth it. Native async support was also great and deeply transformed JS for the better. R&T with value semantics is one of those landmark proposals that should be fully realized IMO. JS does not have features like operator overloading, so R&T is the best shot at first-class language support for composite values without stringifying the world. Compound primitives are also a solid foundation to expand on in the future, so it's better to take a bit more time and have the best version of them that we can.

demurgos avatar Feb 19 '24 23:02 demurgos

The most common case I run into where I feel like records and tuples would be a godsend is when I am trying to use structured data as a map key, without having to encode it as string or resorting to any other hacks. However, this doesn't work at all if records / tuples are just frozen objects instead of primitives. Although I appreciate the technical difficulty of adding them, I feel like it isn't worth it without this key feature.

Samuel-Martineau avatar Feb 20 '24 02:02 Samuel-Martineau

That use case - being able to control the "key" of Maps and Sets - is valuable to solve, and doesn't need R&T to do so.

ljharb avatar Feb 20 '24 02:02 ljharb

That use case - being able to control the "key" of Maps and Sets - is valuable to solve, and doesn't need R&T to do so.

Whilst you are right that R&T is not needed for this use case, it feels natural to use an immutable, compared-by-value data structure as a Map key, doesn't it? It is simply my opinion that the real value of R&T isn't so much the immutability as the possibilities opened up by a complex data primitive.

Samuel-Martineau avatar Feb 20 '24 04:02 Samuel-Martineau

Immutability is a requirement for any stable keying (it's also the reason the value must not be key-comparable before being stable). What is not automatically needed is for the composite key value to expose the values composing it.

The original R&T proposal solved this by making values born deeply immutable, which made the value itself usable as a composite key. What the champions are now exploring is the problem space where the immutable R/T and the composite key are not the same thing.

mhofman avatar Feb 20 '24 05:02 mhofman

Having key semantics different from equality leads to many inconsistencies. It's possible to build many examples, but they all boil down to expecting that a Map/Set having a key implies that there exists an item equal to the key. In particular, I expect all three functions below to be equivalent*.

function setHas1(s: Set<unknown>, x: unknown): boolean {
  return s.has(x)
}

function setHas2(s: Set<unknown>, x: unknown): boolean {
  for (const v of s) {
    if (v === x) {
      return true;
    }
  }
  return false;
}

function setHas3(s: Set<unknown>, x: unknown): boolean {
  return [...s].indexOf(x) >= 0;
}

The strength of value semantics is that it's not a special case for Map and Set, but that it fully integrates with the language at all levels.

*: Yes, I'm aware of NaN, it does not mean that we should add more edge cases

demurgos avatar Feb 20 '24 12:02 demurgos

@demurgos that's not an assumption that holds for subclasses, so it's a false assumption. Only the first is valid.

ljharb avatar Feb 20 '24 14:02 ljharb

boil down to expecting that a Map/Set having a key implies that there exists an item equal to the key

I don't think anyone has suggested changing the default key semantics of Map/Set. However, one direction being explored is that Map/Set could be configured at construction with a keying function to derive the keying value from the provided "key" (this is roughly equivalent to an alternative Map/Set implementation). The keying value would have normal equality / SameValue semantics.

I expect all three functions below to be equivalent*.

I don't believe a program can assume the behavior of dynamically invoked methods on its parameters. Anyone today can already provide a custom map/set/array instance with overridden methods.

The program can only trust the behavior of instances it itself creates (modulo prototype pollution, which is an orthogonal concern). Since no one is proposing to change the default behavior of existing Map/Set, I don't believe there is a problem.
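
To make that concrete, here is a userland sketch of the configured-keying idea (KeyedMap and its constructor option are hypothetical, not a settled design):

class KeyedMap {
  #keyBy;
  #inner = new Map();
  constructor(keyBy) {
    this.#keyBy = keyBy; // derives the keying value from the provided "key"
  }
  set(key, value) {
    this.#inner.set(this.#keyBy(key), value);
    return this;
  }
  get(key) {
    return this.#inner.get(this.#keyBy(key));
  }
  has(key) {
    return this.#inner.has(this.#keyBy(key));
  }
}

const positions = new KeyedMap(([x, y]) => `${x},${y}`);
positions.set([1, 2], "treasure");
positions.has([1, 2]); // true - two distinct arrays, same derived keying value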

mhofman avatar Feb 20 '24 14:02 mhofman

I understand that JS is dynamic by nature, everything can be patched, etc. But I would still argue that any subclass or method override where iteration disagrees with membership no longer provides a consistent Set interface.

JS already has multiple ways to check for equality: Object.is, ==, ===; so I get why parametrizing a map/set so it uses a different behavior is a reasonable idea. It's actually a good idea, but I don't think it's the best fit for R&T. I'll wait to see what this exploration leads to.

demurgos avatar Feb 20 '24 15:02 demurgos

The keying function idea seems a bit awkward to me, because it's possible that the keying function will return different values for the same original key over time (eg, k => JSON.stringify(k) will vary if someone modifies the original k object, or an object referred to by it, etc).

This means that either the mapped key needs to be stored indefinitely in the Map, or else the Map could get into an undefined state due to inconsistencies in derived key hashing (eg, if the implementation uses a hash map or similar) or key ordering (eg, if the implementation uses a tree map or similar).

const map = new Map(k => JSON.stringify(k));
const k = { foo: 42 };
map.set(k, "hello");
k.foo++;
map.set(k, "world");

Presumably the Map above now contains two entries with the same original key value k, but if it's backed by a hash map and the mapped values ("{\"foo\":42}", "{\"foo\":43}") are not stored, there's a chance that both set operations hash to the same bucket so it only ends up with one entry, thus exposing the behaviour of the underlying hashing/bucketing.

I can imagine there might be some use for it, but I think if it becomes the main way to implement "compound keys", it could be a big footgun.

Having a separate constructor for compound keys just seems more sound, but it does indeed just sound like a less useful version of tuples (since they're basically just tuples that can't be read).

Maxdamantus avatar Feb 20 '24 20:02 Maxdamantus

The keying function idea seems a bit awkward to me, because it's possible that the keying function will return different values for the same original key over time (eg, k => JSON.stringify(k) will vary if someone modifies the original k object, or an object referred to by it, etc).

It's the responsibility of the creator of the Map/Set to use a keying function that's stable. Relying on the data in the key itself while the key is mutable is obviously not compatible with stability, hence why I mentioned that immutability is a requirement for any stable keying.

the mapped key needs to be stored indefinitely in the Map

Yes, the Map implementation would not rely on the keying function for existing entries, not just because of stability, but because otherwise it would expose the internal implementation of the Map to the keying function.

Having a separate constructor for compound keys just seems more sound

Right, a composite key is a useful building block for custom collection keying: the keyBy option can just return a composite key. I personally believe it actually should be the only accepted return value of keyBy, to avoid potential footguns like you describe.

mhofman avatar Feb 20 '24 20:02 mhofman

I personally believe it actually should be the only accepted return value of keyBy, to avoid potential footguns like you describe.

How does that avoid the footgun? Presumably k => Map.key(k.foo) isn't any more stable than k => JSON.stringify(k) in the example I gave. Even if it returns an object, I would expect the underlying equality (and hash) operation to continue to be stable (eg, SameValue or SameValueZero isn't going to give different results for the same pair of values, even if the values are mutable objects).

Maxdamantus avatar Feb 20 '24 20:02 Maxdamantus

My expectation is that a keying function would only ever be called once for any given value, and the result memoized.

ljharb avatar Feb 20 '24 21:02 ljharb

While a dedicated key-creating API does not prevent mutated objects from being unstable sources of key generation, it still has benefits compared to JSON.stringify - it could help avoid worrying about key order (JSON.stringify({ a: 1, b: 2 }) !== JSON.stringify({ b: 2, a: 1 })), and it could support more values than JSON (JSON.stringify([1n]) // throws).
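
For example, a key-creating helper could canonicalise property order, which JSON.stringify does not (a sketch covering the key-order point only; BigInt and other non-JSON values would still need native support):

function stableStringify(value) {
  if (value === null || typeof value !== "object") return JSON.stringify(value);
  if (Array.isArray(value)) {
    return `[${value.map(stableStringify).join(",")}]`;
  }
  // Sort own enumerable keys so property order no longer matters.
  const entries = Object.keys(value).sort()
    .map((k) => `${JSON.stringify(k)}:${stableStringify(value[k])}`);
  return `{${entries.join(",")}}`;
}

stableStringify({ a: 1, b: 2 }) === stableStringify({ b: 2, a: 1 }); // true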

While the primitive, value-semantics R&T design guarantees stability, the same indirection issues can arise because R&T could only store other primitives. Indirection (via symbols as WeakMap keys) was required to store non-primitive values (e.g. Temporal.ZonedDateTime). So just as someone can provide a non-stable keyBy function to a Map or Set, someone could have a non-stable lookup for the objects referenced by their R&T.

So to me the question is less "is it possible to create an odd scenario with the API?" and more "does the API encourage good patterns by default, and do less safe usages of it clearly stand out when reading the code?"

acutmore avatar Feb 20 '24 21:02 acutmore

My expectation is that a keying function would only ever be called once for any given value, and the result memoized.

If that's taken literally, it means it would need to remember multiple original key values, eg:

const map = new Map(k => {
    console.log("called");
    return JSON.stringify(k);
});
const k0 = { foo: 42 };
const k1 = { foo: 42 };
map.set(k0, "hello"); // "called"
map.set(k1, "world"); // "called"
map.get(k0); // ??
map.get(k1); // ??

Either both distinct key objects (k0 and k1) would need to be stored in the map, or at least one of the get operations will need to call the keying function again.

The former sounds very weird to me, since repeated sets of the "same" key will grow the map. Perhaps the implementation can mitigate this by having an implicit weak map associated (so at least temporary objects are not retained), but this seems like a lot of undesirable overhead.

Maxdamantus avatar Feb 20 '24 21:02 Maxdamantus

yes, i'd expect k0 and k1 to both be stored somewhere, mapping to the json stringification of the key. Object keys would indeed need to be held weakly in the "somewhere".
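
A sketch of that memoization (assuming one cached result per original object key, held weakly; primitive keys pass straight through):

function memoizedKeyBy(keyBy) {
  const cache = new WeakMap(); // object keys held weakly, so they can be collected
  return (k) => {
    const isObject = k !== null && (typeof k === "object" || typeof k === "function");
    if (!isObject) return keyBy(k); // nothing to memoize against
    if (!cache.has(k)) cache.set(k, keyBy(k));
    return cache.get(k);
  };
}

With this, k0 and k1 above are each stringified exactly once, and later map.get(k0) / map.get(k1) reuse the cached derived keys even if the objects have been mutated in between.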

ljharb avatar Feb 20 '24 21:02 ljharb

I'm just a JS developer, and not part of the standards body, so this is just my two cents, but my preference would be to either retain the value semantics or abandon the entire proposal.

Records and tuples with value semantics would be transformative. As I'm sure everyone posting here already knows, many popular libraries and frameworks in the JavaScript ecosystem (React, Redux, RxJS, etc) are "reactive" in the sense that they involve computing the difference between a new state and a prior state and taking some action based on the changes. Any programmer who works professionally with these libraries always needs to be aware of the reference semantics of objects and arrays; carelessly replacing one object or array with another, identical object or array can trigger a large amount of unnecessary downstream work or even an infinite loop. Records and tuples with value semantics would make it much easier to write correct, performant code in these situations.
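
A familiar shape of the hazard, with React as one illustration (fetchResults is a placeholder):

import { useEffect } from "react";

function Search({ filters, fetchResults }) {
  // A brand-new object is created on every render, even when its
  // contents are identical to the previous render's...
  const options = { ...filters, limit: 10 };

  useEffect(() => {
    fetchResults(options);
  }, [options]); // ...so this reference-compared dependency "changes"
                 // every time, and the effect re-fires on every render.

  return null;
}

With value semantics, #{ ...filters, limit: 10 } would compare equal across renders, and the effect would run only when the contents actually change.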

Records and tuples without value semantics, by contrast, don't really help with this problem at all. It doesn't seem to me that they would offer much value in terms of solving my day-to-day problems programming in the JS ecosystem. Since there's a limited amount of nice syntax available in the language, if value semantics are not on offer, my preference would be to save the syntax for something more impactful.

sethfowler avatar Mar 28 '24 14:03 sethfowler