charon icon indicating copy to clipboard operation
charon copied to clipboard

Use a later MIR

Open Nadrieril opened this issue 11 months ago • 16 comments

Today we use mir_built, but using mir_drops_elaborated_and_const_checked is required for https://github.com/AeneasVerif/charon/issues/152 and https://github.com/AeneasVerif/charon/issues/542. It is also likely required for https://github.com/AeneasVerif/charon/issues/306.

Nadrieril avatar Jan 31 '25 17:01 Nadrieril

Here is a sketch of what needs to happen to solve this issue. External help is welcome here.

What I need to know is what happens if instead of mir_built we use mir_drops_elaborated_and_const_checked here: https://github.com/AeneasVerif/charon/blob/30cab88265206f4fa849736e704983e39a404d96/charon/src/bin/charon-driver/translate/get_mir.rs#L31 and then run the tests.

The first thing is that our tests are full of incorrect rust (because we don't run all the analyses so not all errors are reported); if that triggers rustc errors I'd welcome a PR that fixes them.

The second thing is that the translation may become incorrect (e.g. because we handle constants wrong) or panic. I'll need reproducers for these issues, or fixes if they're easy to fix.

The final step will be to identify changes that we don't want downstream tools to have to deal with. For example the elaborate_box_derefs pass transforms *b where b: Box<T> into something like *b.0.0.0 which accesses box internals; this is undesirable so we'll need to write a pass that undoes this. I don't know of others, also interested to get an overview of what the changes look like once the translation is correct.

Nadrieril avatar Jan 31 '25 17:01 Nadrieril

For example the elaborate_box_derefs pass transforms *b where b: Box<T> into something like *b.0.0.0 which accesses box internals; this is undesirable so we'll need to write a pass that undoes this.

Why is this undesirable? Not having this means having to make Box a special case during analysis. I understand that in some cases one might want this to not happen, but I would make this pass optional/

giltho avatar Jan 31 '25 17:01 giltho

Also, additional question: in the case of monomorphised MIR, am I correct to expect that most current limitations of Charon would be lifted? E.g. things like impl returns and generic associated types wouldn't be a problem since they would be erased before going to Charon

giltho avatar Jan 31 '25 18:01 giltho

I would make this pass optional

Yes that pass would be optional. It is useful for tools targeting safe rust, for which there should be no raw pointers if there weren't any in their code.

in the case of monomorphised MIR, am I correct to expect that most current limitations of Charon would be lifted? E.g. things like impl returns and generic associated types wouldn't be a problem since they would be erased before going to Charon

That would solve a number limitations yes, that's what the majority of analysis tools do. Another big source of limitations is the proper handling of lifetimes, which isn't relevant if you don't need them. What would be your use-case?

Also note that I believe monomorphized MIR doesn't really exist, i.e. it's built on-the-fly alongside codegen and is not something we can easily retreive.

Nadrieril avatar Jan 31 '25 18:01 Nadrieril

We have two projects, the one that requires Retag does not need lifetime information The second will need lifetime information though :( from my understanding there is now work done inside the rust compiler that does export this information, but not sure how easy to use

giltho avatar Jan 31 '25 18:01 giltho

For example the elaborate_box_derefs pass transforms *b where b: Box into something like *b.0.0.0 which accesses box internals; this is undesirable so we'll need to write a pass that undoes this.

Why is this undesirable? Not having this means having to make Box a special case during analysis. I understand that in some cases one might want this to not happen, but I would make this pass optional/

Yes, the pass would definitely be optional. But at the same time tools like Aeneas really need not to see raw pointers in place of boxes, so we have to make sure we can reconstruct the boxes before validating a change of the MIR we target.

sonmarcho avatar Jan 31 '25 22:01 sonmarcho

I don't think the target should change My experience is probably that depending on the project you want some stage of Mir. For me the default should be mir_built and we should be able to ask for another one through a config flag.

giltho avatar Feb 01 '25 12:02 giltho

I don't think the target should change My experience is probably that depending on the project you want some stage of Mir. For me the default should be mir_built and we should be able to ask for another one through a config flag.

Yes, it makes sense to allow the user to choose the stage of MIR they want and it is actually something we used to do at the beginning of the project; we stopped doing it because it revealed too much low level information we couldn't handle yet. As Charon has evolved a lot since then it is probably not an issue anymore, so maybe all we need to do to provide you with the features you need is to add a CLI option so that you can select the MIR stage you want?

On the side, all the tools which use Charon actually need some information such as the result of the drop elaboration which are only computed later in the compilation pipeline. This means we will have to target a later MIR stage at some point anyway, while pretending that we are actually targeting MIR built by reverting some passes like elaborate_box_derefs for the tools that do not want to see the low-level details. But we could do the easy step of adding the CLI option now and do the more complicated changes later. @Nadrieril what do you think?

sonmarcho avatar Feb 03 '25 09:02 sonmarcho

from my understanding there is now work done inside the rust compiler that does export [lifetime] information, but not sure how easy to use

If there is such work, I am not aware of it. I've only heard of tool writers coming up with horrible hacks to get that info out of polonius.

@Nadrieril what do you think?

Agreed, I don't believe the stage of MIR is the right thing we want to tweak. Quite simply, any MIR before drop elaboration is lacking crucial information about the actual code that will be run. My plan is to only get the later MIR and provide options that reconstruct safe abstractions for tools that need them.

Nadrieril avatar Feb 03 '25 12:02 Nadrieril

from my understanding there is now work done inside the rust compiler that does export [lifetime] information, but not sure how easy to use

If there is such work, I am not aware of it. I've only heard of tool writers coming up with horrible hacks to get that info out of polonius.

Are you thinking of https://doc.rust-lang.org/nightly/nightly-rustc/rustc_borrowck/consumers/fn.get_body_with_borrowck_facts.html maybe?

R1kM avatar Feb 03 '25 12:02 R1kM

Hm, get_body_with_borrowck_facts has existed for a while, I remember hearing that this wasn't sufficient. For example, this works only on promoted MIR, which is lacking elaborated drops. And I was recently chatting with Ralf about how to get useful lifetime info in runnable code (for Miri) and it seemed like that would take some work.

Nadrieril avatar Feb 03 '25 12:02 Nadrieril

The idea we had was to add a MIR statement that indicates where a given lifetime starts/ends, the benefit being that the statements can be kept across MIR phases without effort. (And apparently such a statement existed in early versions of rust). I'm not aware that anyone is working on implementing this at the moment.

Nadrieril avatar Feb 03 '25 13:02 Nadrieril

If there is such work, I am not aware of it. I've only heard of tool writers coming up with horrible hacks to get that info out of polonius.

Pinging @xldenis who was the tool writer coming up with the horrible hack when I needed that for Gillian-Rust.

giltho avatar Feb 03 '25 13:02 giltho

I know that it is, in general, not enough though. We're are somewhat limited by the proofs we can do because of that

giltho avatar Feb 03 '25 14:02 giltho

https://github.com/creusot-rs/creusot/blob/6546583679218c1841a824fb8fd547543496bfc0/creusot/src/callbacks.rs#L90-L108

This is what we use in creusot (MirBodyWithBorrowckFacts) what has changed is that last summer we exported the RegionInferenceContext and other helper data which allows us to make meaningful usage of the data the borrowck calculates.

@giltho the problem you're referring to is that we translate some specs from THIR which is before borrowchecking and thus doesn't have this corresponding information.

xldenis avatar Feb 03 '25 14:02 xldenis

The hard parts of supporting --mir promoted are done, we should be able to use it by default. Once that's done we should investigate the next two MIRs (elaborated and optimized) to see if anything is unsupported.

Nadrieril avatar Oct 22 '25 18:10 Nadrieril