Equality definition rough plan?
What is the issue with the Infra Standard?
See: https://github.com/whatwg/infra/issues/230, https://github.com/whatwg/infra/issues/643, https://github.com/whatwg/infra/pull/660, https://github.com/whatwg/html/pull/10924, https://github.com/whatwg/html/issues/10315
I think the following would work:
- Define "equals" for all of Infra's primitive data types. Most are "obvious". Special notes:
- Infra doesn't really define numbers, but we should call out that we intend mathematical equality by default
- We already have a definition of "is" for strings
- Disallow using "equals" on lists and maps, since it is too confusing.
- Not 100% sure on the right path for structs, but my initial proposal:
- Say that it works by default by comparing itemwise
- Specifications can override the definition of "equals" for their particular struct (e.g. as URL does)
- Note this means that structs containing lists do not work by default, hmm.
- Say that every other type we use in specs defaults to "reference equality", but can define their custom equals overload if they want.
- Examples of such types that are not just structs in disguise: algorithm steps, IDL values, JS values
- With "equals" now defined for all types except lists and maps and things containing them, define "deep equals" for lists and maps, which recursively uses "deep equals" on keys/values/items
- This can replace https://infra.spec.whatwg.org/#set-equal
- If we think it's OK, per the above, for "equals" to not work on structs containing lists and maps, then we should define "deep equals" for structs.
- It kind of sucks that you have to check if your struct recursively contains list and maps, before determining which operator to use? But maybe it's good for clarity?
- Optionally define "shallow equals" for sets and maps, although I'm not sure we have any use cases
- Go through existing places in Infra that use ambiguous notions of equality, like our definition of "list contains", and replace them to use "equals".
Do we alias "equals" and "is"?
Can we make equals on a struct assert if:
- It doesn't have its own equals.
- One of the items holds a list or map.
And then maybe define an easy way to set its own equals to "struct deep equals" or some such. I guess it's still on callers to know whether it makes sense for the structs to be compared for quality, but at least they don't have to figure out the comparison call based on the struct's structure.
Looking at the use cases linked, seems like shallow equality of maps/lists/structs/sets is more needed/feasible than deep equality? I don't get what makes it confusing.
It can use is (ref/primitive equality) as the default, which as per the plan we should define better, and pass an item equality predicate - eg to determine whether URLs are compared with fragments excluded, to decide if a string should be compared with ignore-case, or to allow something that's equivalent to a deep-comparison without building that into infra.
This would provide value already, and then we can look at defining the equivalent of "operator overload" for other data types.
The issue with full-fledged deep equality is that: A) you have to deal with re-entrancy. Doable but not always the most efficient, and necessary only if it's a generic thing B) not always what you want. Sometimes you want to ref-compare something along the way. Only the caller knows, or whoever specified the particular data type.
FYI - the WebNN spec ended up using equality tests for lists (e.g. "If weight’s shape is not equal to « numDirections, 3 * hiddenSize, inputSize », then ...") e.g. and maps (most specifically with MLOperandDescriptor - a dictionary)
I filed https://github.com/webmachinelearning/webnn/issues/792 to track adding equality definitions for these to the spec, with the idea that maybe this could be upstreamed to Infra.
@noamr
It can use is (ref/primitive equality) as the default, [...] The issue with full-fledged deep equality is that: A) you have to deal with re-entrancy.
These are infra types, not JS types; they live in specs rather than in the page. There is no reference equality, and no re-entrancy.
It can use is (ref/primitive equality) as the default, [...] The issue with full-fledged deep equality is that: A) you have to deal with re-entrancy.
These are infra types, not JS types; they live in specs rather than in the page. There is no reference equality, and no re-entrancy.
I know that; But we use "is" for reference equality all over the place. I'm suggesting that we can define primitive equality as well. By re-entrancy I mean that a list can contain a struct where one of its elements references the list again, creating an infinite loop. In most cases this is a non-issue, but a generic deep-equality algorithm needs to handle this. perhaps it's not the right term.
Ah yeah, "cyclic references". Re-entrancy in the web context usually refers to accidentally triggering more JS while you're in the middle of an algorithm, which can potentially invalidate state in your algo.
Since maps are really ordered maps, do we need to consider the ordering in equality? I notice we don't do that for (ordered) sets. I suspect that in many cases (e.g. dictionaries) the processing from JS via IDL to Infra types gives a deterministic order so I can ignore the problem, but there may be other cases where it's relevant?
I'm not sure if that was intentional for sets. If it is we should call it out. cc @OrKoN
I'm not sure if that was intentional for sets. If it is we should call it out. cc @OrKoN
for https://infra.spec.whatwg.org/#set-equal it was intentional that it does not take ordering into account.
Great discussion everyone!
Do we alias "equals" and "is"?
That feels bad to me, for non-primitives.
Can we make equals on a struct assert if:
That was the intent of my proposal.
And then maybe define an easy way to set its own equals to "struct deep equals" or some such.
That's a pretty good idea, I think. "$Structname's [equality] is [deep struct equality]."
Looking at the use cases linked, seems like shallow equality of maps/lists/structs/sets is more needed/feasible than deep equality? I don't get what makes it confusing.
We might not be using the terms in the same way. Let me illustrate with JavaScript.
const struct1 = { a: "b" };
const struct2 = { a: "b" };
is(struct1, struct2); // false
shallowEquals(struct1, struct2); // true
deepEquals(struct1, struct2); // true
const struct3 = { x: struct1 };
const struct4 = { x: struct1 };
is(struct3, struct4); // false
shallowEquals(struct3, struct4); // false: item x is not "is" equals in struct3 vs. struct4
deepEquals(struct3, struct4); // true: item x is deepEquals in struct3 vs. struct4
I think this notion of shallowEquals is pretty bad as it breaks once you have any level of nesting. Even if it works in 80% of the cases since nesting is relatively rare, it's a very sharp edge for the remaining 20% of cases.
pass an item equality predicate
I think this could be necessary in rare cases, e.g. deciding whether to check fragment equality for URLs. But such cases might be better done via explicit iteration...
These are infra types, not JS types; they live in specs rather than in the page. There is no reference equality
I think we write specs assuming there is reference equality, sometimes. Probably even for structs.
This is especially the case if you take the view that many things in the specs, which are not currently defined as structs, should be defined as structs. For example, if I create two requests with all their values left as default, I intuitively feel I should be able to put both of them into a set, without them collapsing into a single value. However I admit this intuition isn't grounded in a real use case.
What do others thing of just not allowing reference equality for structs? What about for lists? Maye my JavaScript instincts are misleading me, and I should be thinking more like a C++ developer.
Cyclic references
I acknowledge this complicates the spec algorithm for sure. But I think encapsulating that in Infra is reasonable?
Since maps are really ordered maps, do we need to consider the ordering in equality?
This is a great point. I think this is another argument for disallowing the generic "equals" predicates and forcing people to choose between "deep equals" and "unordered deep equals". (Or maybe "ordered deep equals" and "unordered deep equals"? Or even "ordered deep equals" and "deep equals"?)
I agree with your intuition of how structs work. We probably need to expand on object identity and reference equality as we do also make those kind of comparisons, e.g., between nodes. I also think you want to compare structs using reference equality at times. Say you iterate over a list of them and want a condition to vary based on a pre-selected struct.
I think "unordered" needs to be explicit as it's a mismatch with the type. I would be okay with "ordered" also being explicit, but I also think it's fine if we don't label that "ordered". (Structs are not ordered however so they don't need the qualifier.)
Though here again it would make sense if you could define that for your set "equals" means "unordered deep equals" so callers don't have to know the particulars of your set and can always rely on "equals" (that will also make recursion work better where I suppose we branch on primitive/non-primitive and then use the default "equals").
So primitive types don't have "equals", they only have "is". Non-primitive types have "equals" (possibly plus "deep" and "unordered" qualifiers), but they also have "is" for identity comparison. That seems quite nice.