STL icon indicating copy to clipboard operation
STL copied to clipboard

STL: We should _STD qualify _Ugly function calls to avoid ADL

Open StephanTLavavej opened this issue 6 years ago • 17 comments

See "ADL can interfere even with uglified names" by Arthur O'Dwyer.

We've survived 20+ years with unqualified _Ugly function calls, but perhaps we should consider changing this. It's possible that this could improve throughput.

List of headers to process, as mentioned in https://github.com/microsoft/STL/issues/140#issuecomment-1247361313 below (cvt subtree intentionally omitted):

  • [ ] __msvc_all_public_headers.hpp
  • [ ] __msvc_chrono.hpp
  • [ ] __msvc_format_ucd_tables.hpp
  • [ ] __msvc_int128.hpp
  • [ ] __msvc_iter_core.hpp
  • [ ] __msvc_system_error_abi.hpp
  • [ ] __msvc_tzdb.hpp
  • [ ] __msvc_xlocinfo_types.hpp
  • [ ] algorithm
  • [ ] any
  • [ ] array
  • [ ] atomic
  • [ ] barrier
  • [ ] bit
  • [ ] bitset
  • [ ] cassert
  • [ ] ccomplex
  • [ ] cctype
  • [ ] cerrno
  • [ ] cfenv
  • [ ] cfloat
  • [ ] charconv
  • [ ] chrono
  • [ ] cinttypes
  • [ ] ciso646
  • [ ] climits
  • [ ] clocale
  • [ ] cmath
  • [ ] codecvt
  • [ ] compare
  • [ ] complex
  • [ ] concepts
  • [ ] condition_variable
  • [ ] coroutine
  • [ ] csetjmp
  • [ ] csignal
  • [ ] cstdalign
  • [ ] cstdarg
  • [ ] cstdbool
  • [ ] cstddef
  • [ ] cstdint
  • [ ] cstdio
  • [ ] cstdlib
  • [ ] cstring
  • [ ] ctgmath
  • [ ] ctime
  • [ ] cuchar
  • [ ] cwchar
  • [ ] cwctype
  • [ ] deque
  • [ ] exception
  • [ ] execution
  • [ ] expected
  • [ ] experimental/coroutine
  • [ ] experimental/deque
  • [ ] experimental/filesystem
  • [ ] experimental/forward_list
  • [ ] experimental/generator
  • [ ] experimental/list
  • [ ] experimental/map
  • [ ] experimental/resumable
  • [ ] experimental/set
  • [ ] experimental/string
  • [ ] experimental/unordered_map
  • [ ] experimental/unordered_set
  • [ ] experimental/vector
  • [ ] filesystem
  • [ ] format
  • [ ] forward_list
  • [ ] fstream
  • [ ] functional
  • [ ] future
  • [ ] hash_map
  • [ ] hash_set
  • [ ] initializer_list
  • [ ] iomanip
  • [ ] ios
  • [ ] iosfwd
  • [ ] iostream
  • [ ] iso646.h
  • [ ] istream
  • [ ] iterator
  • [ ] latch
  • [ ] limits
  • [ ] list
  • [ ] llll
  • [ ] locale
  • [ ] map
  • [ ] memory
  • [ ] memory_resource
  • [ ] mutex
  • [ ] new
  • [ ] numbers
  • [ ] numeric
  • [ ] optional
  • [ ] ostream
  • [ ] queue
  • [ ] random
  • [ ] ranges
  • [ ] ratio
  • [ ] regex
  • [ ] scoped_allocator
  • [ ] semaphore
  • [ ] set
  • [ ] shared_mutex
  • [ ] source_location
  • [ ] span
  • [ ] spanstream
  • [ ] sstream
  • [ ] stack
  • [ ] stacktrace
  • [ ] stdatomic.h
  • [ ] stdexcept
  • [ ] stop_token
  • [ ] streambuf
  • [ ] string
  • [ ] string_view
  • [ ] strstream
  • [ ] syncstream
  • [ ] system_error
  • [ ] thread
  • [ ] tuple
  • [ ] type_traits
  • [ ] typeindex
  • [ ] typeinfo
  • [ ] unordered_map
  • [ ] unordered_set
  • [ ] use_ansi.h
  • [ ] utility
  • [ ] valarray
  • [X] variant (#3148)
  • [ ] vector
  • [ ] version
  • [ ] xatomic.h
  • [ ] xatomic_wait.h
  • [ ] xbit_ops.h
  • [ ] xcall_once.h
  • [ ] xcharconv.h
  • [ ] xcharconv_ryu.h
  • [ ] xcharconv_ryu_tables.h
  • [ ] xcharconv_tables.h
  • [ ] xerrc.h
  • [ ] xfacet
  • [ ] xfilesystem_abi.h
  • [ ] xhash
  • [ ] xiosbase
  • [ ] xkeycheck.h
  • [ ] xlocale
  • [ ] xlocbuf
  • [ ] xlocinfo
  • [ ] xlocmes
  • [ ] xlocmon
  • [ ] xlocnum
  • [ ] xloctime
  • [ ] xmemory
  • [ ] xnode_handle.h
  • [ ] xpolymorphic_allocator.h
  • [ ] xsmf_control.h
  • [ ] xstddef
  • [ ] xstring
  • [ ] xthreads.h
  • [ ] xtimec.h
  • [ ] xtr1common
  • [ ] xtree
  • [ ] xutility
  • [ ] ymath.h
  • [ ] yvals.h
  • [ ] yvals_core.h

StephanTLavavej avatar Sep 26 '19 23:09 StephanTLavavej

Your std::distance doesn't trigger the issue; but anything that calls _Get_unwrapped does. For example, std::copy.

https://godbolt.org/z/q5TeEO

struct Incomplete;
template<class T> struct Holder { T t; };

int main() {
    using Ptr = Holder<Incomplete>*;
    Ptr a[100];
    Ptr b[100];
    std::copy(a, a+10, b);
}

Quuxplusone avatar Sep 27 '19 03:09 Quuxplusone

That example is compelling, although the thought of changing all _Ugly function calls fills me with despair (more so than operator& and operator,). It seems that there are a lot of places that could accept elements of type Holder<Incomplete>* - not just algorithms, but also containers and their member functions. We use tag dispatch less frequently these days, but the number of function calls susceptible to ADL hijacking must extend far beyond _Get_unwrapped if I understand this correctly.

I am aware of (apparently) conformant code that instantly destroys any library it comes into contact with, but it is very pathological (e.g. highly poisoned operator, overloads). The element type here is fairly reasonable.

StephanTLavavej avatar Sep 27 '19 03:09 StephanTLavavej

I think it's impossible to write a library that is immune to all variations on Holder<Incomplete>. For example, you can't sort my array a — https://godbolt.org/z/FsV8tt — because sort inherently depends on being able to ADL-swap the element type, which you definitely can't ever do if any associated type is incomplete.

I've been trying to come up with an example where the library somewhere has to use an infix operator and we supply a class type without making it too contrived. The least-contrived thing I've come up with so far is

using T = Holder<Incomplete>*;
T t;
std::optional<T> ot;

Here t == t is OK, but ot == ot is a hard error.

std::vector<T> vt;
std::vector<std::optional<T>> vo;

std::find(vt.begin(), vt.end(), nullptr);  // technically implementable, but nobody actually does
std::find(vo.begin(), vo.end(), nullptr);  // literally unimplementable?

I believe it is impossible to accept std::find(vo.begin(), vo.end(), nullptr), because it has to be able to find std::optional's operator== via ADL in order to do its job but if it does ADL and tries to complete Holder<Incomplete> then it blows up.

Quuxplusone avatar Sep 27 '19 04:09 Quuxplusone

In that case, I think I want the LWG to tell us what to do.

StephanTLavavej avatar Sep 27 '19 04:09 StephanTLavavej

Richard Smith has brought up similar issues in the past for places we can't fix with qualification, like operator overloads. I don't think there's anything reasonable the library can do to shut this problem down completely but qualifying our _Ugly things is probably still reasonable on the basis of improved diagnostics and/or throughput.

BillyONeal avatar Sep 30 '19 21:09 BillyONeal

Which option do we prefer:

  • WONTFIX
  • _STD _Meow(/*...*/) [longer/hard to miss in code review]
  • (_Meow)(/*...*/)[shorter/easier to miss in code review]

I don't list ::std::_Meow(/*...*/) as an option here because we recently relitigated "_STD vs. ::std::" when Nicole joined the team.

CaseyCarter avatar Sep 14 '22 08:09 CaseyCarter

Looking at the long list of things attached to this, and:

Richard Smith has brought up similar issues in the past for places we can't fix with qualification, like operator overloads. I don't think there's anything reasonable the library can do to shut this problem down completely but qualifying our _Ugly things is probably still reasonable on the basis of improved diagnostics and/or throughput.

is there customer value to be gained here given that we are seemingly unable to shut the problem down completely?

EDIT: To clarify, this is more about the linked issues being marked as 'duplicates' of this one rather than this change itself.

BillyONeal avatar Sep 14 '22 20:09 BillyONeal

We talked about this at the weekly maintainer meeting, and we've decided to qualify (with _STD, _RANGES, _CHRONO, etc.) every _Ugly function call, like what we currently do for non-_Ugly function calls. This will avoid ADL issues as much as possible, and (as bonuses) will improve throughput by some nonzero amount (potentially significant), and will sometimes avoid infinite recursion during lookup. We'll be able to enforce _STD etc. qualification the same way we do for non-_Ugly calls (where we have a near-perfect record of detecting issues during code review), and it doesn't impact code clarity too much.

However, I want to land the Standard Library Modules PR (which modifies many lines of every header) before accepting any PRs that attempt to qualify _Ugly function calls.

When we're ready to begin qualifying all such calls, this can be done in one or a handful of headers at a time - a single massive PR updating all headers atomically is not necessary (and would be a pain to review and keep up to date with main).

@CaseyCarter mentioned that we can make a checklist of headers in this issue, when we're ready to begin accepting PRs.

StephanTLavavej avatar Sep 14 '22 22:09 StephanTLavavej

It's worthwhile to notice that operator* involves ADL, and thus we should be more careful when using smart pointers as implementation details.😹

frederick-vs-ja avatar Sep 15 '22 01:09 frederick-vs-ja

It's worthwhile to notice that operator* involves ADL, and thus we should be more careful when using smart pointers as implementation details.😹

This is kind of the crux of my statement above. _STD qualifying all these things is not the absolute ADL defenses the bugs dupe'd to this one are asking for. Saying things like 'vector's destructor can't call _Destroy_range because that accepts an iterator pair' is a MUCH stronger statement than this issue is trying to deliver.

BillyONeal avatar Sep 15 '22 01:09 BillyONeal

~Should we also _CSTD- or ::-qualify _Ugly names (including intrinsics) in the global namespace?~

We should. Such qualifying is needed.

frederick-vs-ja avatar Oct 17 '22 15:10 frederick-vs-ja

This is failing even after #3148. The problem is that there are intentional ADL swaps. Is this a resolvable problem?

#include <variant>

struct incomplete;

template<class T>
struct wrapper { T a; };

using inp = wrapper<incomplete>*;

int main() {
    std::variant<inp> v;
    v.swap(v); // ...
}

achabense avatar Aug 16 '23 07:08 achabense

Is this a resolvable problem?

I don't think this is solvable, unless we change the standard wording to restrict ADL for swap to enumeration and class types (like ranges::swap).

frederick-vs-ja avatar Aug 17 '23 02:08 frederick-vs-ja

restrict ADL for swap to enumeration and class types

That would actually be philosophically safe (because non-enum/class types are primitive and therefore don't require any special swap semantics)... but at the same time it wouldn't be enough to solve the problem! Consider std::variant<std::vector<Holder<Incomplete>*>> v; v.swap(v), which would still try to swap std::vector<Holder<Incomplete>*> objects, do ADL because it's a class type, find Holder<Incomplete> as an associated type, and explode. So there are always some places that fundamentally can't be ADL-proofed. But FWLIW I'm still personally of the opinion that whatever can be ADL-proofed, should be. Whenever an STL facility explodes due to ADL, I should be able to point at a specific piece of it and say, "Look, there's literally no way to do this piece without ADL." It should never be a simple five-character fix.

I note that my first comment on this issue https://github.com/microsoft/STL/issues/140#issuecomment-535780331 was about optional<T>, and https://github.com/microsoft/STL/issues/140#issuecomment-1247455660 is about using types like this as implementation details; and then this month we have LWG 3969 which is precisely about using optional<T> as an implementation detail (such that it's ADL-proof to call init.value(), but not ADL-proof to call *init as currently specified in [alg.fold]/10).

Quuxplusone avatar Aug 17 '23 17:08 Quuxplusone

I think there should be a way to test Can unqualified call (so we can do constexpr dispatch for them). Unfortunately so far there is no way to do that.

achabense avatar Aug 17 '23 17:08 achabense

ranges::swap is *slightly* better than unqualified swap:

#include <iostream>
#include <ranges>
using namespace std;

struct inc;
template<class T> struct wrp { T t; };
using inp = wrp<inc>*;

int main()
{
    inp a{}, b{};
    _RANGES swap(a, b); // ok...

    using pr = std::pair<inp, inp>;
    pr c{}, d{};
    // _RANGES swap(c, d); // no...
}

I'm curious, what's the advantage of ADL-based swap over public swap method?

achabense avatar Sep 07 '23 04:09 achabense

Currently no implementation accepts this example due to ADL in the comparison of iterators (Godbolt link). However, the standard doesn't seem to state that this is UB/ill-formed.

#include <vector>

struct incomplete;

template<class T>
struct holder { T t; };

int main()
{
    std::vector<holder<incomplete>*> v{};
    v.emplace_back();
}

It seems that iterators of standard containers shouldn't be implemented as class templates, and we can't fix this without breaking ABI.

frederick-vs-ja avatar Oct 21 '23 08:10 frederick-vs-ja