creusot
creusot copied to clipboard
const generics and const expressions
Examples found in the wild:
impl<T> [T] {
// const generic
pub const fn as_chunks<const N: usize>(&self) -> (&[[T; N]], &[T])
}
pub trait SizedTypeProperties: Sized {
// associated const
const IS_ZST: bool;
}
const unsafe fn mod_inv(x: usize, m: usize) -> usize {
// const items
const INV_TABLE_MOD_16: [u8; 8] = [1, 11, 13, 7, 9, 3, 5, 15];
const INV_TABLE_MOD: usize = 16;
...
}
It would be awesome if not only we support them but that we can leverage them to capture the invariants of types like arrays where code like:
let x = [0; 10];
for i in 0..10 {
x[i] = 100
}
Should prove without additional assertions.
My current understanding is that const declarations (top-level and associated), const generics and const {...} expressions are all pretty straightforward in MIR so they are easy to translate into program functions.
The challenge is how to refer to const definitions in logic. We can make all goals that mention const into program functions which call the functions that compute those constants before asserting the desired goal. One tricky thing is that dependencies can also mention constants. I propose this plan:
- Every
const Ntranslates to (1) a program functionget_Nwhich contains the translated body of the const, and (2) aconstant cNwithout a body; - Dependencies can introduce lemmas that mention
cN; - In the goal, which is now always a program function, we call
get_N, assumecN = result_of_get_N, and assert the goal.
Because this encoding is a bit involved, we can switch to a simpler translation as just a constant N = ... in the simple and overhelmingly common case where the body is either a literal or a variable.
Complication: const definitions from the current crate are handled fine, but not those from dependencies, because our mir-to-fmir translation uses an BodyWithBorrowckFacts which comes from a LocalDefId.
- Why do we need borrow-checking information to produce FMIR?
- Can we still run the borrow-checker if we only have a
mir::Body(fromTyCtxt::instance_mir)?
Why do we need borrow-checking information to produce FMIR?
It's needed to calculate where borrow resolutions should be inserted (at least that was the original motivation).
In the goal, which is now always a program function, we call
get_N, assumecN = result_of_get_N, and assert the goal.
Why is the goal always a program function?
Can we still run the borrow-checker if we only have a
mir::Body(fromTyCtxt::instance_mir)?
I don't think this is possible with the current rustc API.
How arbitrary is MIR code generated from consts? Ideally, we would do like with promoteds (i.e., entirely skip resolve).
Why is the goal always a program function?
I am proposing to make it so in order to be able to call get_N.
In particular, when we generate a VC for a logic function, instead of making it a goal we make it a program function that calls any of those "constant getters" to assume the values of constants and then asserts the actual goal.
How arbitrary is MIR code generated from consts? Ideally, we would do like with promoteds (i.e., entirely skip resolve).
I believe it can be pretty arbitrary because loops and borrows are allowed, but the only examples I've ran into so far (and need this for) are basic arithmetic involving size_of.
I thought a bit about this, and I don't have a good solution.
I think the best option is to run borrow checking only once, and then store the result of the computation of Resolver in order to reuse it when needed. In order for this to work across crates, we need to serialize the result and store it in the Creusot metadata. Note that I think it is probably necessary to store/serialize the MIR code too, because we need to make sure that this is the exact same CFG that we get.
(Note that the information from the borrow checker is also used for performing some checks on the code, but this should also done only once, so we don't need to re-run the borrow checker for that.)
The follow-up question is "how should we store the result of the computation of Resolver"? I guess the simplest is to remember where do we decide to resolve which place.
This is quite a bit of a refactoring of the resolving machinery, but this is for good, because that would force us to do the translation of MIR to FMIR code in a different module as the one that comptes resolves. The current situation is not good, because the rather subtle computation of resolve placement is interleaved with the translation. If we pre-compute where to insert resolve statements before doing the translation, then they will be rather trivial to insert during translation.
So, I would suggest running a pre-computation pass before the translation, which computes placement of resolve statements (i.e., currently, these are calls to emit_resolve), and then use this information both in the main translation and in the translation used when "inlining" the body of a const. If I'm not mistaken, such resolve placement can always be described with 1- the description of a place (which should be substituted when inlining the body of a const) and 2- the location, which is either before a MIR statement or terminator or on one of the outgoing edges of a terminator (easy to represent with a dedicated type).