`SortedSet` invariants can be broken if `Element` violates `Comparable` requirements
SortedSet incorrectly reports that the element is not contained in it.
Information
- Package version: Branch
feature/SortedCollections - Platform version: macOS 12.2 Beta (21D5025f)
- Swift version:
swift-driver version: 1.26.21 Apple Swift version 5.5.2 (swiftlang-1300.0.47.5 clang-1300.0.29.30) Target: arm64-apple-macosx12.0
Checklist
- [x] I've searched for existing GitHub issues.
Steps to Reproduce
The issue happens when two elements are "NOT EQUAL" when compared via "==" operator, but "<" always returns false for them.
Here's a sample code:
struct Element: Comparable {
var id: UUID
var value: Int
static func < (lhs: Element, rhs: Element) -> Bool {
lhs.value < rhs.value
}
}
func test() {
var set = SortedSet<Element>()
let a = Element(id: UUID(), value: 0)
let b = Element(id: UUID(), value: 0)
// Inserts A
set.insert(a)
assert(set.contains(a))
// Inserts B
set.insert(b)
print(set.count) // ---> "2"
assert(set.contains(b)) // ---> CRASH
}
Expected behavior
set.contains returns true for the element that was inserted.
Actual behavior
set.contains returns false for the element that was already inserted.
This implementation of Comparable conformance violates the requirements of the protocol. Namely, two elements can both compare not less than each other but also not equal to each other, which is not semantically permissible—see the documentation for Comparable.
When the requirements of Comparable conformance are violated, SortedSet and indeed any sort is not required to do anything sensible. There is indeed a bug here, but it does not lie with SortedSet, rather with the user’s conformance to Comparable.
I concur! We can't say anything about the order of items in a SortedSet in this case -- because the concept of "sorting" isn't well-defined for types that don't correctly conform to Comparable.
Sadly this is a case of garbage in, garbage out -- like is the case with Set/Dictionary and Hashable, if Comparable requirements are violated, then almost all bets are off. All SortedSet can guarantee is that this will not lead to undefined behavior (like an out of bounds memory access). The invariants of the data structures will no longer apply -- methods/properties can and will return unexpected/nonsensical values.
As long as broken element types don't trigger invalid memory accesses or other undefined behavior in these data structures, the issue is entirely the fault of the broken type, not the data structure implementation.
Ideally code that relies on protocol requirements (like SortedSet in this case) would be able to verify at runtime that the supplied type correctly satisfies the requirements -- unfortunately though, doing that with a level of reliability would usually be overwhelmingly expensive, so the best we can do is to add cheap opportunistic checks that may trigger very inconsistently. (The hashed collections do have a few such tests (on debug-mode insertions and when the hash table is resized) -- however, these tests only trigger in a tiny fraction of cases, so (while they're still extremely useful) they are mostly a last resort fallback, not something that can be relied upon.)
There are probably ways to add similar opportunistic tests to these sorted data structures, too. Unfortunately, we actually have some types in the stdlib that intentionally violate Comparable requirements: as of Swift 5.6, not-a-number values of the standard floating point types aren't correctly ordered. This means that the order of items (or the results of lookup operations etc) in, say, a SortedSet<Double> becomes unspecified if the set ever includes one of these NaN values. Note that it may be preferable to allow this to happen rather than randomly trapping, so perhaps such tests are best added once we figure out how to resolve the floating point conformance issue.
Thanks folks. I know see how my Comparable implementation lead to logical errors.
I can't blame SortedSet here for that undefined behaviour. Safety checks would definitely be helpful but I agree their overhead cost might not be worth it.