creusot icon indicating copy to clipboard operation
creusot copied to clipboard

const generics and const expressions

Open Lysxia opened this issue 6 months ago • 9 comments

Examples found in the wild:

impl<T> [T] {
    // const generic
    pub const fn as_chunks<const N: usize>(&self) -> (&[[T; N]], &[T])
}

pub trait SizedTypeProperties: Sized {
    // associated const
    const IS_ZST: bool;
}

const unsafe fn mod_inv(x: usize, m: usize) -> usize {
    // const items
    const INV_TABLE_MOD_16: [u8; 8] = [1, 11, 13, 7, 9, 3, 5, 15];
    const INV_TABLE_MOD: usize = 16;
    ...
}

Lysxia avatar May 15 '25 18:05 Lysxia

It would be awesome if not only we support them but that we can leverage them to capture the invariants of types like arrays where code like:

let x = [0; 10];
for i in 0..10 {
  x[i] = 100
}

Should prove without additional assertions.

xldenis avatar May 15 '25 18:05 xldenis

My current understanding is that const declarations (top-level and associated), const generics and const {...} expressions are all pretty straightforward in MIR so they are easy to translate into program functions.

The challenge is how to refer to const definitions in logic. We can make all goals that mention const into program functions which call the functions that compute those constants before asserting the desired goal. One tricky thing is that dependencies can also mention constants. I propose this plan:

  1. Every const N translates to (1) a program function get_N which contains the translated body of the const, and (2) a constant cN without a body;
  2. Dependencies can introduce lemmas that mention cN;
  3. In the goal, which is now always a program function, we call get_N, assume cN = result_of_get_N, and assert the goal.

Because this encoding is a bit involved, we can switch to a simpler translation as just a constant N = ... in the simple and overhelmingly common case where the body is either a literal or a variable.

Lysxia avatar May 23 '25 20:05 Lysxia

Complication: const definitions from the current crate are handled fine, but not those from dependencies, because our mir-to-fmir translation uses an BodyWithBorrowckFacts which comes from a LocalDefId.

  • Why do we need borrow-checking information to produce FMIR?
  • Can we still run the borrow-checker if we only have a mir::Body (from TyCtxt::instance_mir)?

Lysxia avatar Jun 04 '25 09:06 Lysxia

Why do we need borrow-checking information to produce FMIR?

It's needed to calculate where borrow resolutions should be inserted (at least that was the original motivation).

xldenis avatar Jun 04 '25 11:06 xldenis

In the goal, which is now always a program function, we call get_N, assume cN = result_of_get_N, and assert the goal.

Why is the goal always a program function?

jhjourdan avatar Jun 04 '25 14:06 jhjourdan

Can we still run the borrow-checker if we only have a mir::Body (from TyCtxt::instance_mir)?

I don't think this is possible with the current rustc API.

jhjourdan avatar Jun 04 '25 14:06 jhjourdan

How arbitrary is MIR code generated from consts? Ideally, we would do like with promoteds (i.e., entirely skip resolve).

jhjourdan avatar Jun 04 '25 14:06 jhjourdan

Why is the goal always a program function?

I am proposing to make it so in order to be able to call get_N.

In particular, when we generate a VC for a logic function, instead of making it a goal we make it a program function that calls any of those "constant getters" to assume the values of constants and then asserts the actual goal.

How arbitrary is MIR code generated from consts? Ideally, we would do like with promoteds (i.e., entirely skip resolve).

I believe it can be pretty arbitrary because loops and borrows are allowed, but the only examples I've ran into so far (and need this for) are basic arithmetic involving size_of.

Lysxia avatar Jun 04 '25 14:06 Lysxia

I thought a bit about this, and I don't have a good solution.

I think the best option is to run borrow checking only once, and then store the result of the computation of Resolver in order to reuse it when needed. In order for this to work across crates, we need to serialize the result and store it in the Creusot metadata. Note that I think it is probably necessary to store/serialize the MIR code too, because we need to make sure that this is the exact same CFG that we get.

(Note that the information from the borrow checker is also used for performing some checks on the code, but this should also done only once, so we don't need to re-run the borrow checker for that.)

The follow-up question is "how should we store the result of the computation of Resolver"? I guess the simplest is to remember where do we decide to resolve which place.

This is quite a bit of a refactoring of the resolving machinery, but this is for good, because that would force us to do the translation of MIR to FMIR code in a different module as the one that comptes resolves. The current situation is not good, because the rather subtle computation of resolve placement is interleaved with the translation. If we pre-compute where to insert resolve statements before doing the translation, then they will be rather trivial to insert during translation.

So, I would suggest running a pre-computation pass before the translation, which computes placement of resolve statements (i.e., currently, these are calls to emit_resolve), and then use this information both in the main translation and in the translation used when "inlining" the body of a const. If I'm not mistaken, such resolve placement can always be described with 1- the description of a place (which should be substituted when inlining the body of a const) and 2- the location, which is either before a MIR statement or terminator or on one of the outgoing edges of a terminator (easy to represent with a dedicated type).

jhjourdan avatar Jun 05 '25 13:06 jhjourdan