Use a later MIR
Today we use mir_built, but using mir_drops_elaborated_and_const_checked is required for https://github.com/AeneasVerif/charon/issues/152 and https://github.com/AeneasVerif/charon/issues/542. It is also likely required for https://github.com/AeneasVerif/charon/issues/306.
Here is a sketch of what needs to happen to solve this issue. External help is welcome here.
What I need to know is what happens if instead of mir_built we use mir_drops_elaborated_and_const_checked here: https://github.com/AeneasVerif/charon/blob/30cab88265206f4fa849736e704983e39a404d96/charon/src/bin/charon-driver/translate/get_mir.rs#L31
and then run the tests.
The first thing is that our tests are full of incorrect rust (because we don't run all the analyses so not all errors are reported); if that triggers rustc errors I'd welcome a PR that fixes them.
The second thing is that the translation may become incorrect (e.g. because we handle constants wrong) or panic. I'll need reproducers for these issues, or fixes if they're easy to fix.
The final step will be to identify changes that we don't want downstream tools to have to deal with. For example the elaborate_box_derefs pass transforms *b where b: Box<T> into something like *b.0.0.0 which accesses box internals; this is undesirable so we'll need to write a pass that undoes this. I don't know of others, also interested to get an overview of what the changes look like once the translation is correct.
For example the elaborate_box_derefs pass transforms *b where b: Box<T> into something like *b.0.0.0 which accesses box internals; this is undesirable so we'll need to write a pass that undoes this.
Why is this undesirable? Not having this means having to make Box a special case during analysis. I understand that in some cases one might want this to not happen, but I would make this pass optional/
Also, additional question: in the case of monomorphised MIR, am I correct to expect that most current limitations of Charon would be lifted? E.g. things like impl returns and generic associated types wouldn't be a problem since they would be erased before going to Charon
I would make this pass optional
Yes that pass would be optional. It is useful for tools targeting safe rust, for which there should be no raw pointers if there weren't any in their code.
in the case of monomorphised MIR, am I correct to expect that most current limitations of Charon would be lifted? E.g. things like impl returns and generic associated types wouldn't be a problem since they would be erased before going to Charon
That would solve a number limitations yes, that's what the majority of analysis tools do. Another big source of limitations is the proper handling of lifetimes, which isn't relevant if you don't need them. What would be your use-case?
Also note that I believe monomorphized MIR doesn't really exist, i.e. it's built on-the-fly alongside codegen and is not something we can easily retreive.
We have two projects, the one that requires Retag does not need lifetime information The second will need lifetime information though :( from my understanding there is now work done inside the rust compiler that does export this information, but not sure how easy to use
For example the elaborate_box_derefs pass transforms *b where b: Box into something like *b.0.0.0 which accesses box internals; this is undesirable so we'll need to write a pass that undoes this.
Why is this undesirable? Not having this means having to make Box a special case during analysis. I understand that in some cases one might want this to not happen, but I would make this pass optional/
Yes, the pass would definitely be optional. But at the same time tools like Aeneas really need not to see raw pointers in place of boxes, so we have to make sure we can reconstruct the boxes before validating a change of the MIR we target.
I don't think the target should change
My experience is probably that depending on the project you want some stage of Mir. For me the default should be mir_built and we should be able to ask for another one through a config flag.
I don't think the target should change My experience is probably that depending on the project you want some stage of Mir. For me the default should be
mir_builtand we should be able to ask for another one through a config flag.
Yes, it makes sense to allow the user to choose the stage of MIR they want and it is actually something we used to do at the beginning of the project; we stopped doing it because it revealed too much low level information we couldn't handle yet. As Charon has evolved a lot since then it is probably not an issue anymore, so maybe all we need to do to provide you with the features you need is to add a CLI option so that you can select the MIR stage you want?
On the side, all the tools which use Charon actually need some information such as the result of the drop elaboration which are only computed later in the compilation pipeline. This means we will have to target a later MIR stage at some point anyway, while pretending that we are actually targeting MIR built by reverting some passes like elaborate_box_derefs for the tools that do not want to see the low-level details. But we could do the easy step of adding the CLI option now and do the more complicated changes later. @Nadrieril what do you think?
from my understanding there is now work done inside the rust compiler that does export [lifetime] information, but not sure how easy to use
If there is such work, I am not aware of it. I've only heard of tool writers coming up with horrible hacks to get that info out of polonius.
@Nadrieril what do you think?
Agreed, I don't believe the stage of MIR is the right thing we want to tweak. Quite simply, any MIR before drop elaboration is lacking crucial information about the actual code that will be run. My plan is to only get the later MIR and provide options that reconstruct safe abstractions for tools that need them.
from my understanding there is now work done inside the rust compiler that does export [lifetime] information, but not sure how easy to use
If there is such work, I am not aware of it. I've only heard of tool writers coming up with horrible hacks to get that info out of polonius.
Are you thinking of https://doc.rust-lang.org/nightly/nightly-rustc/rustc_borrowck/consumers/fn.get_body_with_borrowck_facts.html maybe?
Hm, get_body_with_borrowck_facts has existed for a while, I remember hearing that this wasn't sufficient. For example, this works only on promoted MIR, which is lacking elaborated drops. And I was recently chatting with Ralf about how to get useful lifetime info in runnable code (for Miri) and it seemed like that would take some work.
The idea we had was to add a MIR statement that indicates where a given lifetime starts/ends, the benefit being that the statements can be kept across MIR phases without effort. (And apparently such a statement existed in early versions of rust). I'm not aware that anyone is working on implementing this at the moment.
If there is such work, I am not aware of it. I've only heard of tool writers coming up with horrible hacks to get that info out of polonius.
Pinging @xldenis who was the tool writer coming up with the horrible hack when I needed that for Gillian-Rust.
I know that it is, in general, not enough though. We're are somewhat limited by the proofs we can do because of that
https://github.com/creusot-rs/creusot/blob/6546583679218c1841a824fb8fd547543496bfc0/creusot/src/callbacks.rs#L90-L108
This is what we use in creusot (MirBodyWithBorrowckFacts) what has changed is that last summer we exported the RegionInferenceContext and other helper data which allows us to make meaningful usage of the data the borrowck calculates.
@giltho the problem you're referring to is that we translate some specs from THIR which is before borrowchecking and thus doesn't have this corresponding information.
The hard parts of supporting --mir promoted are done, we should be able to use it by default. Once that's done we should investigate the next two MIRs (elaborated and optimized) to see if anything is unsupported.