STL `flat_set`'s transparent insertion requirements are deficient

The precondition of flat_set's transparent insertion is specified as follows:

https://eel.is/c++draft/flat.set#modifiers-2 Preconditions: The conversion from x into value_type constructs an object u, for which find(x) == find(u) is true.

... which is highly problematic. #4084 tried to add the following test. As fs.find(2) == fs.end(), and fs.find(2-10) == fs.end(), this is currently allowed by the standard.

void test_insert_transparent_partially_inconsistent() {
    struct broken_key {
        int key;
        explicit operator int() const {
            return key - 10;
        }
    };

    flat_set<int, key_comparer/*comparator able to extract ".key" if present*/> fs{ 0, 3, 5 };

    // fs.find(2) == fs.end(), and fs.find(2-10) == fs.end(), so this is currently allowed by the standard:
    fs.insert(broken_key{ 2 });
    // ... (assert fs has "{-8, 0, 3, 5}")
    // ...
}

The problem is that the precondition cannot ensure the lower_bound(transparent-key) be the same as lower_bound(convert-result). If we must adhere to the current specification, after deciding that the transparent-key is not contained (by finding its lower_bound and doing the comparision), we still need to verify that the bound is also the real lower_bound for the convert-result; and if not, we need to fall back to another (non-transparent) insert function.

The condition above is unlikely useful but is generally undetectable, and the solution will have considerable performance impact. As commented in #4084, the real solution might be making the precondition more restrictive in the standard.

The current implementation will work fine if the precondition is specified as:

The conversion from x into value_type constructs an object u, for which "contains(x) == contains(u) && lower_bound(x) == lower_bound(u)" is true.

Oct 19 '23 02:10 achabense

I don't understand the issue, possibly because the example is incomplete. What's key_comparer? What requirement are we unable to implement?

The precondition seems sufficient to guarantee that find(broken_key{2}) will locate int(broken_key{2}) (i.e., -8) if and only if it exists in the flat_set; what else does it need to do?

Oct 19 '23 03:10 CaseyCarter

What's key_comparer?

struct key_comparer {
    const auto& extract_key(const auto& obj) const {
        if constexpr (requires { obj.key; }) {
            return obj.key;
        } else {
            return obj;
        }
    }

    bool operator()(const auto& lhs, const auto& rhs) const {
        return extract_key(lhs) < extract_key(rhs);
    }

    using is_transparent = int;
};

To support fs.insert(broken_key{ 2 });, the flat_set should firstly find lower_bound(broken_key{2}), which is after 0 and before 3. Then flat_set decides it is not contained, so convert it to int(-8), and try to do the insertion. However, that lower_bound is unusable as insert position. The current impl is assuming lower_bound should be insertable here. Otherwise, flat_set have to make another common insert(int(-8)) call. See https://github.com/microsoft/STL/pull/4084/commits/0296416a531f367075ffd0b324754bc0085118ae.

Oct 19 '23 03:10 achabense

The current impl is assuming lower_bound should be insertable here.

This seems like a bug in the implementation.

Otherwise, flat_set have to make another common insert(int(-8)) call.

Is that not conforming? insert must have linear complexity, we "have time" to perform two logarithmic searches. Ideally we'd perform another binary search for -8 starting at the position we found looking for broken_key{2} - with the expectation that it will often be close to the correct insertion position - but even that is an optimization, not a requirement.

Heck, we could perform two linear searches and it would be poor Quality of Implementation, but still conforming.

Oct 19 '23 03:10 CaseyCarter

This issue is motivated by https://github.com/microsoft/STL/pull/4084#discussion_r1364691294. #4084 tried to make the insertion "conformant". However, I believe the current specification is a defect in the standard, and will put transparent insertion in a logically unsound state.

As long as transparent comparision and conversion don't represent the same thing, under current precondition (find(transparent-key) == find(converted)), the insertion is valid on a value-dependent basis, relying on both the value of the key and the contents in flat_map. Take the sample code for example again, the expression is initially valid by the current standard (1); however, the same expression (2) will become invalid immediately after (1). This is generally true for such inconsistent types.

flat_set<int, key_comparer/*comparator able to extract ".key" if present*/> fs{ 0, 3, 5 };

// [searching: fs.find(2) == fs.end()], and [converted: fs.find(2-10) == fs.end()], so this is currently
// ↓ (1) allowed by the standard:
fs.insert(broken_key{ 2 }); // ... (assert fs has "{-8, 0, 3, 5}")
// ↓ (2) now precondition violation: [searching: fs.find(2) == fs.end()], but [converted: fs.find(2-10) != fs.end()]
// fs.insert(broken_key{ 2 });

And generally there is no way to decide whether such a transparent insertion will break invariants without doing the real conversion. For the first expression (1), without knowing fs has { 0, 3, 5 }, we have no way to decide whether that expression will be valid in advance. We can only know fs contains broken_key{ 2 } by find it. However if the transparent key doen't represent the same thing in comparision and conversion, we can say nothing about what will be the result of find(conversion-result) without doing the real conversion.

I believe allowing inconsistent types on a value-dependent basis makes no benefit. Worse, for user-provided transparent-key&comparator, we cannot decide whether inconsistency can happen - only the standard is able to rule out inconsistent types.

Oct 19 '23 13:10 achabense

We talked about this at the weekly maintainer meeting, although we didn't have time to track down all of the Standardese. (The main question in our minds is whether this affects the existing ordered/unordered associative containers too.)

We strongly believe that the Standard's requirements for transparent insertion should say that the freshly constructed element should sort into the same position as the transparent comparison indicates, otherwise the Standard is asking for an additional comparison after the element has been constructed (which largely defeats the purpose of the transparent machinery), to avoid violating container invariants.

Oct 25 '23 21:10 StephanTLavavej

This wording was copied from an older revision of WG21-P2363. LWG review of that paper led to wording changes (see WG21-P2363R5) but we'll need an issue to apply the same changes to all the flat_meows.

Oct 27 '23 05:10 timsong-cpp

LWG-4048 is filed.

Jan 28 '24 10:01 frederick-vs-ja