STL icon indicating copy to clipboard operation
STL copied to clipboard

`<xutility>`: Evil member `element_type` can break non-ranges algorithms since C++20

Open frederick-vs-ja opened this issue 1 year ago • 4 comments

Describe the bug

Currently, MSVC STL defines _Iter_value_t in term of iter_value_t since C++20 and uses it in non-ranges algorithms.

https://github.com/microsoft/STL/blob/8b081e26ba016970ce3338cb483ff10f2ade30a5/stl/inc/xutility#L1167-L1168 https://github.com/microsoft/STL/blob/8b081e26ba016970ce3338cb483ff10f2ade30a5/stl/inc/xutility#L1177-L1178

Per [readable.traits], if a type I has mismatched member types value_type and element_type, then iter_value_t<I> is ill-formed by default (when there's no program-defined indirectly_readable_traits or iterator_traits specializations), and thus I doesn't satisfy any C++20 meow_iterator concept. However, I can still meet Cpp17MeowIterator requirements because the legacy requirements don't consider element_type, so such strategy sometimes rejects valid programs.

For example, this example should be valid since C++17. But MSVC STL starts to reject it since C++20 while other implementations don't (Godbolt link).

#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <type_traits>

template <class T>
struct evil_contiguous_iterator {
    T* ptr;

    template <class CT = const T, std::enable_if_t<!std::is_const_v<T>, int> = 0>
    constexpr operator evil_contiguous_iterator<CT>() const noexcept {
        return {ptr};
    }

    using value_type        = std::remove_cv_t<T>;
    using reference         = T&;
    using pointer           = T*;
    using difference_type   = std::ptrdiff_t;
    using iterator_category = std::random_access_iterator_tag;

    using element_type = void; // This is EVIL but shouldn't affect non-ranges algorithms.

    friend constexpr bool operator==(evil_contiguous_iterator i, evil_contiguous_iterator j) noexcept {
        return i.ptr == j.ptr;
    }
    friend constexpr bool operator!=(evil_contiguous_iterator i, evil_contiguous_iterator j) noexcept {
        return i.ptr != j.ptr;
    }
    friend constexpr bool operator<(evil_contiguous_iterator i, evil_contiguous_iterator j) noexcept {
        return i.ptr < j.ptr;
    }
    friend constexpr bool operator>(evil_contiguous_iterator i, evil_contiguous_iterator j) noexcept {
        return i.ptr > j.ptr;
    }
    friend constexpr bool operator<=(evil_contiguous_iterator i, evil_contiguous_iterator j) noexcept {
        return i.ptr <= j.ptr;
    }
    friend constexpr bool operator>=(evil_contiguous_iterator i, evil_contiguous_iterator j) noexcept {
        return i.ptr >= j.ptr;
    }

    constexpr evil_contiguous_iterator& operator++() noexcept {
        ++ptr;
        return *this;
    }
    constexpr evil_contiguous_iterator operator++(int) noexcept {
        auto that = *this;
        ++*this;
        return that;
    }
    constexpr evil_contiguous_iterator& operator--() noexcept {
        --ptr;
        return *this;
    }
    constexpr evil_contiguous_iterator operator--(int) noexcept {
        auto that = *this;
        --*this;
        return that;
    }

    constexpr evil_contiguous_iterator& operator+=(difference_type n) noexcept {
        ptr += n;
        return *this;
    }
    constexpr evil_contiguous_iterator& operator-=(difference_type n) noexcept {
        ptr -= n;
        return *this;
    }

    friend constexpr evil_contiguous_iterator operator+(evil_contiguous_iterator i, difference_type n) noexcept {
        return {i.ptr + n};
    }
    friend constexpr evil_contiguous_iterator operator+(difference_type n, evil_contiguous_iterator i) noexcept {
        return {i.ptr + n};
    }
    friend constexpr evil_contiguous_iterator operator-(evil_contiguous_iterator i, difference_type n) noexcept {
        return {i.ptr - n};
    }
    friend constexpr difference_type operator-(evil_contiguous_iterator i, evil_contiguous_iterator j) noexcept {
        return i.ptr - j.ptr;
    }

    constexpr T& operator*() const noexcept {
        return *ptr;
    }
    constexpr T& operator[](difference_type n) const noexcept {
        return ptr[n];
    }
    constexpr T* operator->() const noexcept {
        return ptr;
    }
};

int main() {
    int a[]{4, 2, 1, 7, 2, 9};
    std::sort(evil_contiguous_iterator<int>{std::begin(a)}, evil_contiguous_iterator<int>{std::end(a)});
}

Expected behavior

This example compiles in C++20/23 modes as in C++17.

STL version

Probably every version from 1e8b8d4eef4b2dddeb7533c5231c876383bd0ea6 to 8b081e26ba016970ce3338cb483ff10f2ade30a5.

Additional context

WG21-P2408R5 complicated the situation.

  • As of C++23, requirements on constant iterators for non-ranges algorithms were changed to meowiterator concepts, which would newly require well-formedness of iter_value_t in some cases. But std::sort wasn't affected.
  • However, as implied the 6.2. section of the paper, such breakage was unlikely found by the author, and thus was possibly unintended.
  • MSVC STL implemented the changes in C++20 mode and permitted both meowiterator and Cpp17MeowIterator.

frederick-vs-ja avatar Mar 03 '24 08:03 frederick-vs-ja

Incorrect attempt

IMO a "fix" could be implementing `_Iter_value_t` like the following since C++20.

template <class _Iter>
struct _Iter_value_fallback : indirectly_readable_traits<_Iter> {};

template <_Has_member_value_type _Iter>
struct _Iter_value_fallback<_Iter> : _Cond_value_type<typename _Iter::value_type> {};

template <_Has_member_element_type _Iter>
    requires (!_Has_member_value_type<_Iter>)
struct _Iter_value_fallback<_Iter> : _Cond_value_type<typename _Iter::element_type> {};

template <class _Iter>
using _Iter_value_t = conditional_t<_Is_from_primary<iterator_traits<remove_cvref_t<_Iter>>>,
    _Iter_value_fallback<remove_cvref_t<_Iter>>, iterator_traits<remove_cvref_t<_Iter>>>::value_type;

Perhaps we can add an internal typedef name to this specialization for workaround: https://github.com/microsoft/STL/blob/8b081e26ba016970ce3338cb483ff10f2ade30a5/stl/inc/__msvc_iter_core.hpp#L153-L155

frederick-vs-ja avatar Mar 03 '24 08:03 frederick-vs-ja

We talked about this at the weekly maintainer meeting, although @CaseyCarter was out sick so our analysis powers were limited. This was introduced by #329. Switching ugly _Iter_value_t to be C++20 iter_value_t with different semantics appears to be a bug for the reasons you've explained, but #329 said that this was to "enable unwrapping of C++20 move-only single-pass iterators". It appears that many uses of ugly _Iter_value_t can't possibly be handling such C++20 move-only single-pass iterators (because the uses are in stable_sort() etc. where stronger iterators are required).

We believe that a full audit of all ugly _Iter_value_t usage will be necessary, to determine which ones:

  • Should always expand to iterator_traits::value_type
  • Should have this behavior of conditionally switching between iterator_traits::value_type and C++20 iter_value_t
  • Should always be C++20 iter_value_t

In general, we should avoid having aliases that switch their semantics based on Standard mode, except when explicitly commented/known that their downlevel behavior is a "best effort" approximation with no harmful downsides (e.g. <xutility>'s _Iterator_is_contiguous is ok).

StephanTLavavej avatar Mar 06 '24 22:03 StephanTLavavej

We believe that a full audit of all ugly _Iter_value_t usage will be necessary, to determine which ones:

  • Should always expand to iterator_traits::value_type
  • Should have this behavior of conditionally switching between iterator_traits::value_type and C++20 iter_value_t
  • Should always be C++20 iter_value_t

Per WG21-P2248R7, it seems that the value type should be consistently iterator_traits::value_type for non-ranges algorithms. Perhaps conditionally switching is only applicable to move_iterator and reverse_iterator's value_type.

frederick-vs-ja avatar Apr 09 '24 05:04 frederick-vs-ja

Filed LWG-4080 for this.

frederick-vs-ja avatar Apr 27 '24 16:04 frederick-vs-ja