stl2
stl2 copied to clipboard
Specify the domains of the required operations of the standard concepts
@CaseyCarter @asutton @sean-parent Trouble here since the standard already uses the term "well-formed" when describing a program that conforms to the syntax of the language. It's also used when describing certain expressions. Presumably a well-formed program consists of well-formed statements and expressions, so this usage seems internally consistent.
Using "well-formed" to describe the values of objects is novel, so the committee is right to flag its use. So we have two issues:
- What term to use to describe well-formed objects
- How to define that term
We could extend the existing term to include the values of objects; e.g. a "well-formed object". That's potentially confusing since for an object t
, the expression "t
" is well-formed while the object t
might not be well-formed
. (E.g., it could be a double
storing a NaN.) Perhaps choosing a different term would be better (although inconsistent with EoP).
Maybe:
"An object is well-formed if it does not have indeterminate value".
There are some exceptions about the validity of the use of indeterminate value for narrow character types:
http://en.cppreference.com/w/cpp/language/default_initialization
But I can't seem to find those in the standard. But I don't think we need to worry about those cases in templates.
I had thought to exclude moved-from objects, but if the state is valid and the type is equality comparable, then you could conceivably recover information after a move:
T x = move(y); if (y == T()) { // Do stuff with default y. }
I doubt that this is useful, and it's probably bad practice, but it would be valid.
Andrew Sutton
On Wed, Sep 9, 2015 at 2:29 PM, Eric Niebler [email protected] wrote:
@CaseyCarter https://github.com/CaseyCarter @asutton https://github.com/asutton @sean-parent https://github.com/sean-parent Trouble here since the standard already uses the term "well-formed" when describing a program that conforms to the syntax of the language. It's also used when describing certain expressions. Presumably a well-formed program consists of well-formed statements and expressions, so this usage seems internally consistent.
Using "well-formed" to describe the values of objects is novel, so the committee is right to flag its use. So we have two issues:
- What term to use to describe well-formed objects
- How to define that term
We could extend the existing term to include the values of objects; e.g. a "well-formed object". That's potentially confusing since for an object t, the expression "t" is well-formed while the object t might not be well-formed. (E.g., it could be a double storing a NaN.) Perhaps choosing a different term would be better (although inconsistent with EoP).
— Reply to this email directly or view it on GitHub https://github.com/ericniebler/stl2/issues/88#issuecomment-138998966.
Is a NaN an indeterminate value? In what sense?
On Wed, Sep 9, 2015, 12:02 PM Andrew Sutton [email protected] wrote:
Maybe:
"An object is well-formed if it does not have indeterminate value".
There are some exceptions about the validity of the use of indeterminate value for narrow character types:
http://en.cppreference.com/w/cpp/language/default_initialization
But I can't seem to find those in the standard. But I don't think we need to worry about those cases in templates.
I had thought to exclude moved-from objects, but if the state is valid and the type is equality comparable, then you could conceivably recover information after a move:
T x = move(y); if (y == T()) { // Do stuff with default y. }
I doubt that this is useful, and it's probably bad practice, but it would be valid.
Andrew Sutton
On Wed, Sep 9, 2015 at 2:29 PM, Eric Niebler [email protected] wrote:
@CaseyCarter https://github.com/CaseyCarter @asutton https://github.com/asutton @sean-parent < https://github.com/sean-parent> Trouble here since the standard already uses the term "well-formed" when describing a program that conforms to the syntax of the language. It's also used when describing certain expressions. Presumably a well-formed program consists of well-formed statements and expressions, so this usage seems internally consistent.
Using "well-formed" to describe the values of objects is novel, so the committee is right to flag its use. So we have two issues:
- What term to use to describe well-formed objects
- How to define that term
We could extend the existing term to include the values of objects; e.g. a "well-formed object". That's potentially confusing since for an object t, the expression "t" is well-formed while the object t might not be well-formed. (E.g., it could be a double storing a NaN.) Perhaps choosing a different term would be better (although inconsistent with EoP).
— Reply to this email directly or view it on GitHub https://github.com/ericniebler/stl2/issues/88#issuecomment-138998966.
— Reply to this email directly or view it on GitHub https://github.com/ericniebler/stl2/issues/88#issuecomment-139010369.
Is a NaN an indeterminate value? In what sense?
It isn't, and trying to exclude it will get you into a long discussion with Lawrence Crowl. There are (probably) some generic libraries designed that take that NaN representation into account.
My interpretation is that it's okay to have NaN values, if you're writing algorithms on IEEE 754 values. If your algorithm works on the set of Real values, and you get NaN's, then you have undefined behavior.
On Wed, Sep 9, 2015, 12:02 PM Andrew Sutton [email protected] wrote:
Maybe:
"An object is well-formed if it does not have indeterminate value".
There are some exceptions about the validity of the use of indeterminate value for narrow character types:
http://en.cppreference.com/w/cpp/language/default_initialization
But I can't seem to find those in the standard. But I don't think we need to worry about those cases in templates.
I had thought to exclude moved-from objects, but if the state is valid and the type is equality comparable, then you could conceivably recover information after a move:
T x = move(y); if (y == T()) { // Do stuff with default y. }
I doubt that this is useful, and it's probably bad practice, but it would be valid.
Andrew Sutton
On Wed, Sep 9, 2015 at 2:29 PM, Eric Niebler [email protected] wrote:
@CaseyCarter https://github.com/CaseyCarter @asutton https://github.com/asutton @sean-parent < https://github.com/sean-parent> Trouble here since the standard already uses the term "well-formed" when describing a program that conforms to the syntax of the language. It's also used when describing certain expressions. Presumably a well-formed program consists of well-formed statements and expressions, so this usage seems internally consistent.
Using "well-formed" to describe the values of objects is novel, so the committee is right to flag its use. So we have two issues:
- What term to use to describe well-formed objects
- How to define that term
We could extend the existing term to include the values of objects; e.g. a "well-formed object". That's potentially confusing since for an object t, the expression "t" is well-formed while the object t might not be well-formed. (E.g., it could be a double storing a NaN.) Perhaps choosing a different term would be better (although inconsistent with EoP).
— Reply to this email directly or view it on GitHub <https://github.com/ericniebler/stl2/issues/88#issuecomment-138998966 .
— Reply to this email directly or view it on GitHub https://github.com/ericniebler/stl2/issues/88#issuecomment-139010369.
— Reply to this email directly or view it on GitHub https://github.com/ericniebler/stl2/issues/88#issuecomment-139011385.
So, double does not satisfy EqualityComparable or TotallyOrdered?
On Wed, Sep 9, 2015, 12:10 PM Andrew Sutton [email protected] wrote:
Is a NaN an indeterminate value? In what sense?
It isn't, and trying to exclude it will get you into a long discussion with Lawrence Crowl. There are (probably) some generic libraries designed that take that NaN representation into account.
My interpretation is that it's okay to have NaN values, if you're writing algorithms on IEEE 754 values. If your algorithm works on the set of Real values, and you get NaN's, then you have undefined behavior.
On Wed, Sep 9, 2015, 12:02 PM Andrew Sutton [email protected] wrote:
Maybe:
"An object is well-formed if it does not have indeterminate value".
There are some exceptions about the validity of the use of indeterminate value for narrow character types:
http://en.cppreference.com/w/cpp/language/default_initialization
But I can't seem to find those in the standard. But I don't think we need to worry about those cases in templates.
I had thought to exclude moved-from objects, but if the state is valid and the type is equality comparable, then you could conceivably recover information after a move:
T x = move(y); if (y == T()) { // Do stuff with default y. }
I doubt that this is useful, and it's probably bad practice, but it would be valid.
Andrew Sutton
On Wed, Sep 9, 2015 at 2:29 PM, Eric Niebler <[email protected]
wrote:
@CaseyCarter https://github.com/CaseyCarter @asutton https://github.com/asutton @sean-parent < https://github.com/sean-parent> Trouble here since the standard already uses the term "well-formed" when describing a program that conforms to the syntax of the language. It's also used when describing certain expressions. Presumably a well-formed program consists of well-formed statements and expressions, so this usage seems internally consistent.
Using "well-formed" to describe the values of objects is novel, so the committee is right to flag its use. So we have two issues:
- What term to use to describe well-formed objects
- How to define that term
We could extend the existing term to include the values of objects; e.g. a "well-formed object". That's potentially confusing since for an object t, the expression "t" is well-formed while the object t might not be well-formed. (E.g., it could be a double storing a NaN.) Perhaps choosing a different term would be better (although inconsistent with EoP).
— Reply to this email directly or view it on GitHub < https://github.com/ericniebler/stl2/issues/88#issuecomment-138998966 .
— Reply to this email directly or view it on GitHub <https://github.com/ericniebler/stl2/issues/88#issuecomment-139010369 .
— Reply to this email directly or view it on GitHub https://github.com/ericniebler/stl2/issues/88#issuecomment-139011385.
— Reply to this email directly or view it on GitHub https://github.com/ericniebler/stl2/issues/88#issuecomment-139015819.
FYI, this is what the Palo Alto report has to say about it:
Obviously, not all arguments will be well formed for any given type. For example
NaN
is not well-formed data and does not satisfy any of the axioms of the equivalence relation above. Similarly, the expression*p
is not well formed whenp == nullptr
. Because it results in undefined behavior, it would be inappropriate to consider that value when describing the general semantics of equality. Note that these exceptional cases do not contradict the axioms of the concept. This is not proof, for example, that IEEE 754 floating point types are not EqualityComparable. The fact that some representations of the format can be interpreted as non-values, does not imply any properties of valid interpretations.
That is what I've been trying to capture in the ranges proposal.
So, double does not satisfy EqualityComparable or TotallyOrdered?
Can you pass NaNs to sort? No. Can you pass them to an algorithm designed to deal with them while comparing other values using == or <? It seems like a reasonable thing to do.
There are people that will throw a fit if you make NaNs ill-formed as a general rule. I got that feedback after publishing the Palo Alto report.
There are some exceptions about the validity of the use of indeterminate value for narrow character types ... but I can't seem to find those in the standard.
[dcl.init]/12 says that evaluating an indeterminate value results in UB, and makes several exceptions for "unsigned narrow character types" such that evaluating an indeterminate value results in an indeterminate value.
EoP defines "well-formed" in section 1.2:
A datum is well-formed with respect to a value type if and only if that datum represents an abstract entity. ...an IEEE 754 floating-point NaN is not well-formed when interpreted as a real number.
Since we don't have definitions of "datum," "value type," or "abstract entity," I think we should avoid using that definition of "well-formed" even ignoring the existing overloads of the term in the Standard. Our goal in defining requirements for operations is to require the operations to have the specified semantics where they are well-defined - the definition space of the operations need not be total. ==
is not well-defined for double
when at least one of the arguments is a NaN, and <
is only defined for pointers into the same array, etc.
It might be simplest to just remove the usage of "well-formed" in the concept requirements and put a paragraph in the concepts.lib intro specifying that the operations need not be total, but must have the required semantics where they are defined.
It is a bit more complex than that - I can have two well-formed iterators that are not equality comparable but that doesn't mean that iterators are not equality comparable. Rather, every operation has a domain within which it is valid. So doubles are comparable within the domain where they can be compared. NaN is not in that domain.
On Wed, Sep 9, 2015 at 12:38 PM, Eric Niebler [email protected] wrote:
FYI, this is what the Palo Alto report has to say about it:
Obviously, not all arguments will be well formed for any given type. For example NaN is not well-formed data and does not satisfy any of the axioms of the equivalence relation above. Similarly, the expression *p is not well formed when p == nullptr. Because it results in undefined behavior, it would be inappropriate to consider that value when describing the general semantics of equality. Note that these exceptional cases do not contradict the axioms of the concept. This is not proof, for example, that IEEE 754 floating point types are not EqualityComparable. The fact that some representations of the format can be interpreted as non-values, does not imply any properties of valid interpretations.
That is what I've been trying to capture in the ranges proposal.
— Reply to this email directly or view it on GitHub https://github.com/ericniebler/stl2/issues/88#issuecomment-139024126.
It is a bit more complex than that - I can have two well-formed iterators that are not equality comparable but that doesn't mean that iterators are not equality comparable. Rather, every operation has a domain within which it is valid. So doubles are comparable within the domain where they can be compared. NaN is not in that domain.
And the same for subtracting RA iterators from different ranges. I don't think I've seen an axiomatization of that restriction, and I'm not sure I know how to write it.
If we're looking for an example of an operation that is only well-defined over a restricted domain, I don't think we could go wrong with division. Everyone who will ever read the TS is familiar with division and comfortable with the idea that division is a valid operation on real numbers despite that fact that division by zero is not defined. If we start with the easy stuff, they're more likely to stick with us when we claim that double
is EqualityComparable
despite the existence of NaN
s, and that pointers are TotallyOrdered
despite the fact that <
is only defined over pointers into the same array.
Well, no... Division by 0.0 is defined, and the result depends on the dividend.
I think Sean's point (it's definitely mine) was that there are cases of difficult-to-specify sets of values (or states) for which the behavior of an operation is undefined, even though we generally want it to be.
Maybe it's sufficient to say that a concept's semantic requirements for an operation apply only to values defined to be in the domain of that operation by their type. That needs wordsmithing.
Andrew Sutton
On Thu, Sep 10, 2015 at 9:49 PM, Casey Carter [email protected] wrote:
If we're looking for an example of an operation that is only well-defined over a restricted domain, I don't think we could go wrong with division. Everyone who will ever read the TS is familiar with division and comfortable with the idea that division is a valid operation on real numbers despite that fact that division by zero is not defined. If we start with the easy stuff, they're more likely to stick with us when we claim that double is EqualityComparable despite the existence of NaNs, and that pointers are TotallyOrdered despite the fact that < is only defined over pointers into the same array.
— Reply to this email directly or view it on GitHub https://github.com/ericniebler/stl2/issues/88#issuecomment-139427958.
Well, no... Division by 0.0 is defined, and the result depends on the dividend.
IEEE-754 floating point numbers are not the real numbers. Your misinterpretation of my statement does, however, point out that division by zero could be a poor choice for "simple example."
IEEE-754 floating point numbers are not the real numbers. Your misinterpretation of my statement does, however, point out that division by zero could be a poor choice for "simple example."
I was making a point -- and its a good one to make. Many people will not distinguish between real numbers and IEEE 754 floats since there are few alternative representations available.
These kinds of issues are not limited to floating point numbers. If you want people to accept the semantic requirements, you will need to find a way to present them that will not result in counterexamples that "obviously" break the system.
Out of curiosity, if IEEE 754 floats are not (models of?) the real numbers then what are they?
I was making a point -- and its a good one to make. Many people will not distinguish between real numbers and IEEE 754 floats since there are few alternative representations available.
Yes, that was exactly my meaning when I withdrew my suggestion that division by zero is the most familiar example of an operation that is not well-defined over all values. It's too easily subject to misinterpretation.
Out of curiosity, if IEEE 754 floats are not (models of?) the real numbers then what are they?
I think we're speaking past each other. I simply meant that division by zero is defined for IEEE-754 floats, but not for the real numbers.
Where does this discussion leave us? Does double
satisfy TotallyOrdered
? If yes, what do we say about outlier values like NaN? If no ... I'm not sure there is a no. It then becomes an error to pass a range of double
s to sort
, and that can't be right.
Where does this discussion leave us? Does double satisfy EqualityComparable? If yes, what do we say about outlier values like NaN? If no ... I'm not sure there is a no. It then becomes an error to pass a range of doubles to sort, and that can't be right.
Just say this:
Required operations are not, in general, total functions. That is, some arguments to a required operation may result in undefined behavior. [Example - Calling std::min with NaN as an argument, subtracting random access iterators from different containers, and overflowing a signed integer accumulator in `std::accumulate all result in undefined behavior]. This does not affect whether a type models the concept.
Do we use the word "model" anywhere? If not: "whether a type satisfies the semantic requirements of a concept".
Andrew
This is perfect, thanks Andrew.
@asutton I've tweaked the wording.
Required operations of any concept defined in this standard need not be total functions; that is, some arguments to a required operation may result in the required semantics failing to be satisfied. [ Example: The required
<
operator of theTotallyOrdered
concept (19.3.4) does not meet the semantic requirements of that concept when operating on NaNs.—end example ] This does not affect whether a type satisfies the concept.
We need a way of systematically documenting which operands would result in undefined behavior. We can do this on a concept-by-concept basis for built-in types and other general sets of invalid arguments (subtracting RA-iterators).
In Equality_comparable and Total_order, we should add to each:
For floating point types (3.9), the use of the NaN as an argument to any required operation results in undefined behavior.
In RA_iterator -- if we don't already say this:
The subtraction of random access iterators pointing to elements in different ranges is undefined.
User-defined types with their own restrictions need to document those (e.g., my class X is Eqaulity_comp except for the value X(0)). I don't know, off the type of my head, what library types we would have to do this for.
Andrew Sutton
On Thu, Sep 24, 2015 at 11:30 AM, Eric Niebler [email protected] wrote:
Closed #88 https://github.com/ericniebler/stl2/issues/88 via 468e662 https://github.com/ericniebler/stl2/commit/468e662c8e94a69998fd0e011174d5ac67309018 .
— Reply to this email directly or view it on GitHub https://github.com/ericniebler/stl2/issues/88#event-418235383.
Wonderful!
Sent from my iPad
On Sep 23, 2015, at 1:45 PM, Andrew Sutton [email protected] wrote:
Where does this discussion leave us? Does double satisfy EqualityComparable? If yes, what do we say about outlier values like NaN? If no ... I'm not sure there is a no. It then becomes an error to pass a range of doubles to sort, and that can't be right.
Just say this:
Required operations are not, in general, total functions. That is, some arguments to a required operation may result in undefined behavior. [Example - Calling std::min with NaN as an argument, subtracting random access iterators from different containers, and overflowing a signed integer accumulator in `std::accumulate all result in undefined behavior]. This does not affect whether a type models the concept.
Do we use the word "model" anywhere? If not: "whether a type satisfies the semantic requirements of a concept".
Andrew — Reply to this email directly or view it on GitHub.
I think we want to talk about the domain of functions rather than saying "total function", which is not defined or used anywhere else in the standard. [17.6.4.9] of the current standard says this:
- Each of the following applies to all arguments to functions defined in the C++ standard library, unless explicitly stated otherwise. (1.1) — If an argument to a function has an invalid value (such as a value outside the domain of the function or a pointer invalid for its intended use), the behavior is undefined.
Also, care is taken already in the specification of InputIterator
and ForwardIterator
to describe the domain of ==
and !=
. [24.2.3/p2] says:
In Table 106, the term the domain of
==
is used in the ordinary mathematical sense to denote the set of values over which==
is (required to be) defined. This set can change over time. Each algorithm places additional requirements on the domain of==
for the iterator values it uses. These requirements can be inferred from the uses that algorithm makes of==
and!=
. [ Example: the callfind(a,b,x)
is defined only if the value ofa
has the propertyp
defined as follows:b
has propertyp
and a valuei
has propertyp
if(*i==x)
or if (*i!=x
and++i
has propertyp
). —end example ]
I'm not sure how practical it is to get this detailed about all specifying the domains of all the operations in all the concepts that aren't total.