const-eval icon indicating copy to clipboard operation
const-eval copied to clipboard

When exactly is a constant well formed?

Open lcnr opened this issue 3 years ago • 21 comments

So my current understanding is that all constants used in types must be well formed.

I wasn't able to find any discussion on what exactly that means though :sweat_smile:

I think that for concrete constants the following must hold:

Constants evaluate successfully

This means that all of 1usize - 2, panic!(), loop {} and [1, 2, 3][3] are not well formed as constants.

An evaluated constant must satisfy all lang invariants of its type

So something like unsafe { std::mem::transmute::<u8, bool>(2) } is not wf. I do think that the following would be well formed though, as it only breaks library invariants:

const V: &'static str = unsafe { std::mem::transmute::<&[u8], &str>(&[0xff, 0xff, 0xff, 0xff]) };

The more interesting case are polymorphic constants as we can't just evaluate them. For these I think the following rule is appropriate

A polymorphic constant is well formed, iff all possible concrete instances are well formed

Some examples:

use std::mem::size_of;

fn foo<T>() -> [u8; size_of::<T>()] {}
// well formed, as `size_of::<T>()` returns a wf constant no matter the type of `T`.

fn bar<T>() -> [u8; 64 / size_of::<T>()] {}
// not well formed, as `64 / size_of::<()>()` does not evaluate successfully and is therefore not wf.

struct Baz<const V: bool>;
fn baz<T>() -> Baz<{ unsafe { std::mem::transmute::<u8, bool>(size_of::<T>() as u8) }> {}
// not well formed, for any `T` with `size_of::<T>() > 3` the final value is not a valid bool. 

Does this definition make sense?

lcnr avatar Jul 29 '20 11:07 lcnr

cc @oli-obk @RalfJung @eddyb

lcnr avatar Jul 29 '20 11:07 lcnr

Nit:

fn baz<T>() -> [u8; unsafe { std::mem::transmute<u8, bool>(size_of::<T>()) }]
// not well formed, for any `T` with `size_of::<T>() > 3` the final value is not a valid bool. 

needs to be

fn baz<T>() -> [u8; unsafe { std::mem::transmute<u8, bool>(size_of::<T>() as u8) as usize }]
// not well formed, for any `T` with `size_of::<T>() > 3` the final value is not a valid bool. 

We currently do not detect this case because of performance reasons. It is trivial (a single flag) to turn on that locals are checked for validty after each modifiction

oli-obk avatar Jul 29 '20 12:07 oli-obk

We detect it in the final value of a constant afaict. So const N: bool = unsafe { std::mem::transmute::<u8, bool>(2) }; already causes a compile time error.

error[E0080]: it is undefined behavior to use this value
 --> src/lib.rs:1:1
  |
1 | const N: bool = unsafe { std::mem::transmute::<u8, bool>(2) };
  | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ type validation failed: encountered 0x02, but expected a boolean
  |
  = note: The rules on what exactly is undefined behavior aren't clear, so this check might be overzealous. Please open an issue on the rustc repository if you believe it should not be considered undefined behavior.

lcnr avatar Jul 29 '20 14:07 lcnr

Yes, but we don't detect it in intermediate computations like when it's stored in a local variable but never returned.

oli-obk avatar Jul 29 '20 14:07 oli-obk

This means that all of 1 - 2, panic!(), loop {} and [1, 2, 3][3] are not well formed as constants.

1 - 2 evaluates successfully for signed types. maybe you meant something like: 1 - 2usize?

programmerjake avatar Jul 29 '20 15:07 programmerjake

A polymorphic constant is well formed, iff all possible concrete instances are well formed

The problem with this definition is that it is uncheckable (except with major effort).

I thought the point of well-formedness is for the compiler to check it? Similar to how it checks that e.g. lifetimes appropriately outlive each other? If that is the case, we need a notion of well-formedness that is efficiently decidable -- which the one I quoted above is not.

RalfJung avatar Aug 05 '20 12:08 RalfJung

The problem with this definition is that it is uncheckable (except with major effort).

The way that it is proposed to check this is very cheap: If it is still generic, you need a 1:1 copy of the constant in a where bound (so exposing the fact that you need this constant to be evaluable to the caller), and then the caller either does the same, or (if it's monomorphic now), evaluates it. If it evaluates successfully, the bound is satisfied and the constant is well formed.

oli-obk avatar Aug 05 '20 13:08 oli-obk

So basically this system punts the question of polymorphic constants? We don't even define what well-formedness means for them, but we ensure that the context always assumes well-formedness?

I'm somewhat surprised this works, and it also seems very restrictive, but -- yeah, sounds plausible. I'd change the text then and not say things like "polymorphic constants are well-formed if all instances are well-formed". Such a statement says that e.g. this is well-formed:

fn bar<T>() -> [u8; 64 + align_of::<T>() * 0] {}

Or this:

fn bar<T>() -> [u8; 64 / std::min(size_of::<T>(), 1)] {}

But it's not well-formed, due to a lack of where clause. There is simply no rule one can use to prove a polymorphic const well-formed, which makes "forward assumption to caller" the only option.

RalfJung avatar Aug 05 '20 13:08 RalfJung

I don't see why "The problem with this definition is that it is uncheckable" is a problem in practice.

Just like the borrow checker prevents programs which are theoretically correct, so too can our const well formed check forbid theoretically correct problems.

So at least with how I understand this, both 64 + align_of::<T>() * 0 and 64 / std::min(size_of::<T>(), 1) are well formed, we just aren't currently able to prove this.

So we could theoretically add a potentially unsafe function fn std::constant::evaluatable_unchecked<T>(v: T) -> T { v } in the future, which allows the user to write const expressions we assume to be wf without checking it

lcnr avatar Aug 05 '20 13:08 lcnr

So basically this system punts the question of polymorphic constants?

yes

We don't even define what well-formedness means for them, but we ensure that the context always assumes well-formedness?

yes

I'm somewhat surprised this works, and it also seems very restrictive, but -- yeah, sounds plausible

it is very restrictive, but it's not actually worse than the status quo (typenum's trait bounds work exactly the same way). We can in the future add more general WF restrictions like X > 10 or sth, which will stop requiring WF bounds for X - 10, X - 9, ... but still require them for X - 11, but any such improvements can be added in a nonbreaking way later.

oli-obk avatar Aug 05 '20 13:08 oli-obk

So at least with how I understand this, both 64 + align_of::<T>() * 0 and 64 / std::min(size_of::<T>(), 1) are well formed, we just aren't currently able to prove this.

But then we need a different name for "the thing the compiler checks". So far, "well-formedness" was the thing that the compiler checked, not a more relaxed thing that we could theoretically also allow. This RFC does a good job of describing that thing (though it may be outdated). In the syntax of that RFC, there would simply be no rule that says when polymorphic consts are well-formed, which implicitly makes forwarding the requirement the only way.

If you suddenly use the term "well-formed" for something that is not what the compiler actually checks, that is going to be very confusing. Please don't change the meaning of established terms like that.


I thus propose that we say something like:

  • All constants are considered well-formed if they also appear in a where clause, thus forwarding the well-formedness requirements to the caller. (This is not specific to consts, it applies to all forms of well-formedness. We just repeat it here for clarity.)
  • Additionally, monorphic consts are well-formed if ...

Regarding the monomorphic case, shouldn't it also require that the constant must be convertible to a value tree?

RalfJung avatar Aug 05 '20 13:08 RalfJung

So we could theoretically add a potentially unsafe function fn std::constant::evaluatable_unchecked<T>(v: T) -> T { v } in the future, which allows the user to write const expressions we assume to be wf without checking it

I doubt we can do that, actually. Much of the compiler relies on well-formedness as an invariant, as far as I know. Such a method would likely trigger ICEs all over the place.

I don't think the theoretical possibility of re-architecting everything to allow such a method justifies what I consider to be a change in the meaning of the pre-existing term "well-formedness".

RalfJung avatar Aug 05 '20 13:08 RalfJung

Such a method would likely trigger ICEs all over the place.

Then it has to be unsafe, because it can trigger undefined behavior.

It seems like I also have to look at the currently used definitions for well-formed because it seems like we currently assign it a slightly different meaning.

lcnr avatar Aug 05 '20 13:08 lcnr

Then it has to be unsafe, because it can trigger undefined behavior.

ICEs have nothing to do with UB. In other words, even unsafe code must not be able to trigger ICEs during compilation.

RalfJung avatar Aug 05 '20 13:08 RalfJung

ICEs have nothing to do with UB. In other words, even unsafe code must not be able to trigger ICEs during compilation.

This seems like a weird restriction on what UB may do. I strongly disagree here.

lcnr avatar Aug 05 '20 13:08 lcnr

An ICE is a compiler bug, there is no input to the compiler that should ever trigger an ICE

oli-obk avatar Aug 05 '20 13:08 oli-obk

It's not a restriction of what UB can do. ICEs are entirely unrelated to UB. UB is about the output the compiler produces -- as in, the compiled program. It can behave arbitrarily when the source Rust program triggers UB. IOW, the compiler may produce an arbitrary binary when compiling an always-UB program.

But that is unrelated to crashing the compiler. Compiling a program that has UB is not itself UB! Only running it is.

Also this is off-topic.

RalfJung avatar Aug 05 '20 13:08 RalfJung

Summary: we can change what "well formed constant" means, but when talking about well formed constants, it must be whatever the compiler currently accepts as well formed. We don't talk about "stable" things when talking about things that "could be stable", either. Please let's get back to discussing what it means to have a "well formed constant" and explicitly mention when something is "we could change well formedness rules to accept this in the future".

oli-obk avatar Aug 05 '20 13:08 oli-obk

Ok, so from what I can tell, a given crate/type/whatever is well formed iff the language accepts it. IOW both things which are not well formed but are currently accepted and things which are well formed and not accepted are bugs of the compiler.

Is that correct?

At this point I am not actually thinking about "well formedness" here and need a better word for this :thinking:


So the "correct" definition for well formed contants would be:

Well formed constants are currently either concrete constants (with the above properties) or const params.

And after https://github.com/rust-lang/compiler-team/issues/340 is implemented, constants are also well formed if they are mentioned in the caller_bounds of a given type/function I guess?


What would [u8; 64 + align_of::<T>() * 0] be then? 64 + align_of::<T>() * 0 is evaluatable for all T :thinking: Would that mean that the thing I am trying to define is simply "generic const evaluatability"? i.e. 64 + align_of::<T>() * 0 would be const evaluatable, even if our current checks do not allow it?

lcnr avatar Aug 05 '20 13:08 lcnr

What would [u8; 64 + align_of::<T>() * 0] be then?

Not well formed, unless T is monomorphic. We could change well formedness rules to accept that, as it has a straight forward rule on the term tree (Note: not value tree! "term trees" are part of https://github.com/rust-lang/compiler-team/issues/340) for the constant. Performing such term tree simplifications should probably have its own RFC, as it's probably not too hard to come up with a corner case where it's not clear to users why one piece of code is accepted and another isn't.

oli-obk avatar Aug 05 '20 14:08 oli-obk

@lcnr

At this point I am not actually thinking about "well formedness" here and need a better word for this thinking

Regarding the terminology, this is very similar to things like "well-typed". A program is "well-typed" if it gets accepted by the compiler. The purpose of writing down a type system precisely and formally is to have a representation of that that's easier to read than 10k lines of code doing type-checking.

The following code is not well-typed, even though it could be (because nothing can ever go wrong when executing that function):

fn foo() -> i32 {
  if 0 == 1 { false } else { 13 }
}

So just like the above function is not well-typed even though a smarter compiler could accept it, we also consider types not well-formed based on what we actually can reasonably check and want to implement, not on what is theoretically possible.

For "well-typed", we say that functions like foo above that we could accept but do not are "sound" or "semantically (as opposed to syntactically) well-typed". I do not know of corresponding standard terminology for "well-formed", which is "one level meta" from "well-typed".

@oli-obk

What would [u8; 64 + align_of::() * 0] be then?

Not well formed, unless T is monomorphic.

Agreed.

We could change well formedness rules to accept that, as it has a straight forward rule on the term tree (Note: not value tree! "term trees" are part of rust-lang/compiler-team#340) for the constant. Performing such term tree simplifications should probably have its own RFC, as it's probably not too hard to come up with a corner case where it's not clear to users why one piece of code is accepted and another isn't.

But before we go down that rabbit hole, we should have some idea for where we are going. This is moving towards arbitrary powerful static analysis to determine if CTFE will succeed with every possible input. We could throw an SMT solver at this but we probably do not want to. I think it is better to accept no such code than to have a piecemeal solution that accepts some random things but not others, in an unpredictable way.

That's for way in the future anyways. I hope.^^

RalfJung avatar Aug 05 '20 16:08 RalfJung