rfcs MaybeDangling

MaybeDangling

Open RalfJung opened this issue 1 year ago • 40 comments

Declare that references and Box inside a new MaybeDangling type do not need to satisfy any memory-dependent validity properties (such as dereferenceable and noalias).

Rendered

Thanks to @rust-lang/wg-unsafe-code-guidelines for a lot of helpful feedback :)

Oct 29 '22 14:10 RalfJung

Does MaybeDangling have any impact on lifetimes? I assume not, but it's unfortunately close to #[may_dangle] (the dropck eyepatch) which does.

Some naming alternatives

Only slightly tongue-in-cheek:

Dormant, Inactive, Suspended, etc — the point is to turn off enforcement of the type's "special" memory related properties. Validity is still required, but I think this communicates the intent of only enforcing the "special" properties lazily when used.

These names might much more strongly want access to the wrapped type to be explicit (but still safe) rather than through Deref, though, as they're much more "active" in what they're saying.
Launder — like C++'s std::launder, it's about hiding properties from the compiler, and like std::launder, it's a very necessary tool that nonetheless is only needed in a small set of low level building blocks.

These names also have the advantage of not implying they only impact dangling/dereferenceable but not uniqueness/noalias.

I also find it somewhat amusing that this kind of ends up introducing a new pointer type. MaybeDangling<&'a [mut] T> offers some interesting opportunities to replace some NonNull<T> usage.

Oct 30 '22 06:10 CAD97

Does MaybeDangling have any impact on lifetimes? I assume not, but it's unfortunately close to #[may_dangle] (the dropck eyepatch) which does.

Oh... yeah that is not great. :/

I also find it somewhat amusing that this kind of ends up introducing a new pointer type. MaybeDangling<&'a [mut] T> offers some interesting opportunities to replace some NonNull<T> usage.

This is still quite different from NonNull. MaybeDangling<&mut T> is not Copy and each time you use it, that will create a mutable reference (via the DerefMut implementation) which comes with all its usual aliasing gurantees!

NonNull is suited for actually aliasing data. MaybeDangling<&mut T> is suited for cases where the reference is "on hold" and aliases exist but the reference will only be used again once the aliases are gone.

Dormant, Inactive, Suspended, etc — the point is to turn off enforcement of the type's "special" memory related properties. Validity is still required, but I think this communicates the intent of only enforcing the "special" properties lazily when used.

I like those, they match the previous sentence about being "on hold".

These names might much more strongly want access to the wrapped type to be explicit (but still safe) rather than through Deref, though, as they're much more "active" in what they're saying.

Hm... yes maybe? No strong opinion here; I assumed it'd be Deref because ManuallyDrop is.

Oct 31 '22 10:10 RalfJung

These names might much more strongly want access to the wrapped type to be explicit (but still safe) rather than through Deref, though, as they're much more "active" in what they're saying.

Hm... yes maybe? No strong opinion here; I assumed it'd be Deref because ManuallyDrop is.

This is a good observation, because the documentation of Deref says the following:

Implementing Deref for smart pointers makes accessing the data behind them convenient, which is why they implement Deref. On the other hand, the rules regarding Deref and DerefMut were designed specifically to accommodate smart pointers. Because of this, Deref should only be implemented for smart pointers to avoid confusion.

Note that ManuallyDrop is not always wrapped around a smart pointer! Looking back at its RFC this was not brought up then. MaybeDangling<..> does have a use case for wrapping Box<T> and &mut T, so here it less in conflict with the documentation of Deref, but still it could wrap a non-pointer type (this is not of any practical use, though).

Should MaybeDangling implement Deref and DerefMut like ManuallyDrop does, or should accessing the inner data be more explicit since that is when the aliasing and validity requirements do come back in full force?

As mentioned in the RFC, the difference between ManuallyDrop and MaybeDangling is that the latter is intended for maybe-aliasing capabilities, in addition to both supporting maybe-dangling properties. This maybe-aliasing property being a potential mild footgun for safe Rust code implicitly dereferencing an aliasing MaybeDangling, combined with the documentation of Deref, seem to suggest that MaybeDangling might be better off not implementing Deref and expose explicit methods for that instead. Or as an alternative, it could only implement Deref<Target=T::Target> when T also implements Deref to stay as close to the documentation of Deref as possible, but this could also hurt composability of MaybeDangling<T> and does not remove the footgun, so that is not ideal either. Finally, whether MaybeDangling implements Deref or not would also have consequences for whether its methods could be implemented as regular methods or as associated functions, so there is a UX aspect to this, as well.

I also don't have a strong opinion here, but it seemed worthwhile to at least consider this for the decision for whether to implement Deref or not :)

Nov 02 '22 07:11 Pointerbender

MaybeDangling<&mut T> is not Copy

To clarify, is MaybeDangling<T> Copy if T is?

Nov 02 '22 07:11 comex

To clarify, is MaybeDangling<T> Copy if T is?

Yes, it propagates all the auto traits and Copy. It should probably derive(Copy, Clone, Debug) and maybe more.

Nov 02 '22 08:11 RalfJung

(@Pointerbender FWIW the "for smart pointers" wording has been considered for removal a few times before, and still is. The general consensus is more along the lines of "when &Self should be implicitly usable as &Target" plus guidelines around avoiding method conflicts. In addition to ManuallyDrop, std also has the unstable LazyLock/LazyCell which aren't pointers persay but do implement Deref.)

Nov 02 '22 17:11 CAD97

A potential future possibility with this RFC is to use it to resolve the self-referential-generator-aliasing situation: these generators need to be pinned anyway, and if we define Pin<T> as a newtype around Pin<MaybeDangling<T>>, then Pin<&mut T> no longer makes any aliasing assumptions so we wouldn't need anything else (such as the current special treatment of Unpin) to make self-referential generators sound.

Nov 05 '22 14:11 RalfJung

That's not fully sufficient by itself; you still need some way to do pin projection. Currently that's done with the _unchecked methods which require using the actual &mut.

Nov 05 '22 18:11 CAD97

Hm... yeah you are right, those &mut are short-lived but they do add intermediate noalias assumptions.

Nov 05 '22 19:11 RalfJung

Do the effects of MaybeDangling also recurse into the pointee and/or pointee's fields? E.g. will MaybeDangling<Box<Box<&mut T>>> also strip the &mut T from its memory-dependent validity obligations, or would that require a MaybeDangling<Box<Box<MaybeDangling<&mut T>>>> to that effect?

Nov 06 '22 15:11 Pointerbender

A Box<Box<&mut T>> doesn't have aliasing requirements on the inner pointers to begin with, so there's nothing there for MaybeDangling to strip.

Nov 06 '22 15:11 RalfJung

Perhaps making ManuallyDrop more magical would suffice?

It makes sense to me that an object with dangling references is dropped manually, because user of that type already has to be careful about when it is dangling (e.g. drop order of fields in self-referential structs is important).

In the worst case it should be possible to make a DIY MaybeDangling with AutomaticallyDrop<T>(ManuallyDrop<T>) with a Drop.

Nov 07 '22 14:11 kornelski

The alternative is mentioned in the RFC. The RFC also has 2 use-cases for MaybeDangling that want automatic drop, so it seems better to have ManuallyDrop use a MaybeDangling type rather than awkwardly implementing a dropping MaybeDangling on top of ManuallyDrop.

Nov 07 '22 14:11 RalfJung

I mean it may be better if these two other cases deal with awkwardness of manual drop than the language as a whole get yet another wrapper type.

If dealing with unnecessary ManualDrop turns out to be too error-prone, another wrapper can always be added (even in user code). But if this feature starts as another wrapper, it can't be taken back.

Nov 07 '22 15:11 kornelski

I can't see how a wrapper type that needlessly combines two features is better than its orthogonal decomposition into independent types.

What I implied above was that the case of wanting MaybeDangling with drop is common enough that we probably want this type in the standard library (the stdlib already has a consumer for such a type), so I don't think anything is gained.

Nov 07 '22 15:11 RalfJung

MaybeDangling as currently specified by the RFC does not change safety invariants, as it implements safe Deref conversion like ManuallyDrop does. I argue that this is likely undesirable, and MaybeDangling should instead make it unsafe to reactivate the wrapped type and its invariants.

I think that would make MaybeDangling mostly useless -- one can already use MaybeUninit for something like that. I consider not having to use unsafe code to get the inner data back out of the no-dangling wrapper to be a feature. For example, this would have avoided adding unsafe code in https://github.com/rust-lang/rust/pull/102589.

Jan 02 '23 13:01 RalfJung

I think at some point we'll need a cheatsheet of which wrapper disables what.

Feb 18 '23 17:02 the8472

Huh, the similarities between this proposal and #3390 are... interesting, to say the least. Tho it doesn't seem like anything would cause a bad interaction between them. Neat!

(Tho on that note, while #3390 would require turning off dereferenceable for specific references, it wouldn't actually be able to do that for a specific Box, so maybe MaybeDangling is still useful?)

Feb 24 '23 19:02 SoniEx2

From the discussion at https://github.com/rust-lang/rfcs/pull/3467#issuecomment-1675843159:

The relationship with MaybeDangling, yoke and stable-deref has me a bit more confounding. It seems wrong neither MaybeDangling nor UnsafeAliased are strictly more powerful than the other. MaybeDangling removes both dereferenceable and noalias, but only to 1 level of pointers, whereas UnsafeAliased removed noalias to arbitrary depth. It seems to me that stablederef actually only wants to remove noalias to a depth of 1, and doesn't actually need the value to dangle? I may be confused, but it seems like there's something wrong in the taxonomy here.

It is true that we could have a type that keeps (some form of) dereferenceable but drops noalias. In yoke specifically, the cart can keep its usual dereferencability requirements. However, the point of dereferenceable is to let the compiler introduce spurious reads. But taking yoke as the example, do we really want spurious reads from the cart? I don't think so. They should only occur after into_backing_cart, at which point the cart type is back with all its usual dereferencability requirements.

What else could there be? I guess for the yoke it could make sense to say that this is dereferenceable-on-function-entry and it is noalias for as long as it is live. In other words, it would be totally fine to do the usual retagging on the yoke, we just don't want the protector. Basically the problem is that the 'static lifetime of the reference in the yoke is a lie -- the reference might end up living much shorter than that. We could have a type MaybeDoesntOutliveFunction<T> that means "still retag all references in T as usual but don't add any protectors -- the references must be live right now but we can't be sure they will be live for the entire function call". Alternatively maybe we could have a special magic lifetime that yoke could use instead of 'static to indicate this.

However, for LLVM we'd still have to remove noalias and dereferencable for such a MaybeDoesntOutliveFunction type. So maybe we should get back to that idea once we have some optimizations that could actually make use of this.

What remains is that the name MaybeDangling is inadequate for a bunch of the uses of this type, e.g. the uses in yoke. I totally agree. If someone has a better name I'll happily take it. :) @CAD97 suggested something along the lines of Inert, since the data in a MaybeDangling is basically treated as raw bits but its "effects in memory" are disabled. However the bits must still be valid at the given type, otherwise this would be MaybeUninit...

Aug 15 '23 12:08 RalfJung

(NOT A CONTRIBUTION)

I'm not convinced that making yoke sound is the best approach to solving the problem of "self-borrowing." I think yoke and rental and so on are unsatisfactory APIs, and if the Rust project actually wants to support this pattern, it should think about how to solve the problem directly without wonky libraries (e.g. exploring existential lifetimes more seriously).

If the Rust project isn't that bothered about supporting this pattern (which I think is also a reasonable approach: I personally never wish for self-borrowing and have the sense this is something people want when they are not used to working with ownership), I think requiring them to define their own owning types that use raw pointers internally and don't generate the same aliasing and dereferencing guarantees as Box is also a fine answer. Honestly, I think just defining their own box-like type would probably produce a more comprehendible API than yoke has now.

That is to say, I think making yoke sound is an unsatisfactory middle road between endorsement and dis-endorsement of the self-borrowing pattern, and the Rust project should ultimately choose a lane.

EDIT: To be clear, I'm not saying I hold strongly that yoke should not be endorsed, more that I think both full language support and saying they can't use Box are also viable alternatives and possibly better ones, and we shouldn't proceed from the point of view that because yoke exists the project has to make it allowable.

In contrast, I see the proliferation of wrapper types which disable LLVM attributes in specific, nuanced ways as a real downside to this RFC. It does not seem ideal to me to require writers of unsafe to understand the difference between this and UnsafeAliased and MaybeUninit and UnsafeCell and ManuallyDrop so on, and as few of these types as Rust can require as possible is an important goal.

There are the other two motivations listed in this RFC, and so to me that most pertinent question whether these motivations require the introduction of the new type. Of course, the first motivation does not - you can just add this behavior to ManuallyDrop without adding a new type. We're left with the second motivation. Basically, as I understand it, without this API the closure needs to be wrapped in ManuallyDrop and then manually dropped (which is a bit of unsafe code). Isn't that it?

So if we were to ignore the yoke motivation, the advantage of introducing MaybeDangling over just changing ManuallyDrop is to lose a bit of unsafe code in a very particular and rather rare code pattern. I don't think this outweighs the confusion and likely unsound code that will result from people misusing this type when they actually need something else.

Aug 17 '23 12:08 withoutboats

I personally never wish for self-borrowing and have the sense this is something people want when they are not used to working with ownership

@withoutboats It's extremely dismissive to say "anybody who wants this just isn't used to dealing with lifetimes". That doesn't meaningfully add to the conversation and is demonstrably untrue given the experience of some of the folks involved here

Aug 17 '23 15:08 sgrif

(NOT A CONTRIBUTION)

@sgrif What you put in quotes is not what I wrote. I don't understand why you reframe what I wrote as something more dismissive and then criticize me for being dismissive? This feels disingenuous of you and I don't appreciate it.

What I said that self-borrowing is something people want when they are not used to working with ownership; I didn't say everyone who ever wants self-borrowing is not used to dealing with ownership, this is demonstrably untrue. But my experience is that the request for this kind of functionality is very often encountered when people haven't internalized the ownership/borrowing rules of Rust, after which there is a point at which people begin to preemptively structure their code to avoid this. Of course, there are cases where this is still not possible, in which case some sort of unsafe code is needed, hence the existence of these crates.

But this may lead the project to conclude that its not a pattern that needs a strong level of support from the language and standard library, and so accepting that libraries that want to provide this can't use Box could be a viable choice.

Aug 17 '23 15:08 withoutboats

personally never wish for self-borrowing and have the sense this is something people want when they are not used to working with ownership

As the author of yoke I feel it is motivated pretty well by real world use cases that don't actually have much to do with "lifetimes hard". It also solves that problem, but the primary motivation is not just cleaning up lifetimes, it's stuff that is legitimately really hard to do well in a generic way that is reusable across a large codebase.

What I said that self-borrowing is something people want when they are not used to working with ownership; I didn't say everyone who ever wants self-borrowing is not used to dealing with ownership, this is demonstrably untrue

I think this statement is perfectly valid but I think it makes your previous one irrelevant to the discussion when said in the same breath as questioning whether the project ought to support these use cases. Either you think the use case is valid and "people might want to use this for the wrong reason" is at best an argument for being careful about this, or you think the use case is invalid and you're saying that everyone wants to use this for the wrong reason. Which you aren't.

So ... I think it is quite reasonable for others to interpret your statement as being universal. I was certainly quite surprised to see you saying this.

I think yoke and rental and so on are unsatisfactory APIs

I will mention: yoke is slightly different from the rest in that it does not allow arbitrary self-borrowing, but it's more narrowly a lifetime erasure tool. It doesn't have exactly the same problems.

Honestly, I think just defining their own box-like type would probably produce a more comprehendible API than yoke has now

This would not work since the goal is for it to be used with a lot of different kinds of collections, as well as enums containing different collections (though you cannot safely construct those directly with the yoking methods).

So if we were to ignore the yoke motivation, the advantage of introducing MaybeDangling over just changing ManuallyDrop is to lose a bit of unsafe code in a very particular and rather rare code pattern

I mean, my understanding is that this would be fine for yoke as well[^1]. Personally not strongly opposed to it, though I think that composable safety abstractions are both easier to use and to review than omnibus ones where people will want to fiddle with the exact invariants.

Like honestly the reason I'm excited about this is more because it'll make unsafe code easier to review.

[^1]: Hell, we've already made it mostly work with MaybeUninit, strictly speaking we don't need this RFC. We haven't fixed the cart issue yet because there are a couple different ways to do it and some impact APIs and I have to make a call on that.

Aug 17 '23 15:08 Manishearth

(NOT A CONTRIBUTION)

I think this statement is perfectly valid but I think it makes your previous one irrelevant to the discussion when said in the same breath as questioning whether the project ought to support these use cases. Either you think the use case is valid and "people might want to use this for the wrong reason" is at best an argument for being careful about this, or you think the use case is invalid and you're saying that everyone wants to use this for the wrong reason. Which you aren't.

This would not work since the goal is for it to be used with a lot of different kinds of collections, as well as enums containing different collections (though you cannot safely construct those directly with the yoking methods).

There are all kinds of pointer usage patterns that Rust doesn't support well: graph data structures, garbage collection, etc. It is actually reasonable to decide that it's not worth the additional API complexity to support certain patterns well without saying that those use cases are "invalid." This a trade off between pros and cons in a hyperspace, it's not a binary question.

My side comment was just that I see a lot of people ask for this when they just haven't "gotten" that you can't have a 'self reference and that when they "get" that they stop asking for this so much, and this is a signal about how much weight should be put on supporting this pattern. My opinion is just that I think both the less and more alternative are viable: the Rust project could take this seriously enough to give it the API people actually want (i.e. investigate syntax and type checking for existential lifetime variables) or it could just say that its not supported.

But if you can get away with MaybeUninit to make yoke sound this discussion is not so central and it becomes more of this last question from your comment:

Personally not strongly opposed to it, though I think that composable safety abstractions are both easier to use and to review than omnibus ones where people will want to fiddle with the exact invariants.

I've expressed a concern about the proliferation of different wrappers with nuanced differences in which invariants they relax. I think the difference between this description and yours is a question of whether we think the taxonomy of these types and how they work is clear and easy to explain. I think this requires also a slightly more holistic approach than just examining this RFC and I would want to think more before writing more.

Aug 17 '23 16:08 withoutboats

There are all kinds of pointer usage patterns that Rust doesn't support well: graph data structures, garbage collection, etc. It is actually reasonable to decide that it's not worth the additional API complexity to support certain patterns well without saying that those use cases are "invalid." This a trade off between pros and cons in a hyperspace, it's not a binary question.

side-note: we're confused. we thought rust supports those quite well? maybe we're holding it wrong tho.

Aug 17 '23 16:08 SoniEx2

I think this requires also a slightly more holistic approach than just examining this RFC and I would want to think more before writing more.

I'm not opposed to that. Though I think yoke isn't the only motivator here and I think this RFC is a long time coming.

Especially as the current strategy is that miri is both the gold standard for checking UB and also implements checks that currently cannot be worked around for certain kinds of code that's doing stuff that ought to be valid: it makes a lot of sense to do this incrementally just because you want your unsafe code ecosystem to be able to use the unsafe checking tool as much as possible.

Aug 17 '23 16:08 Manishearth

I've expressed a concern about the proliferation of different wrappers with nuanced differences in which invariants they relax. I think the difference between this description and yours is a question of whether we think the taxonomy of these types and how they work is clear and easy to explain. I think this requires also a slightly more holistic approach than just examining this RFC and I would want to think more before writing more.

In my opinion:

When invariants can't be checked, then a proliferation of wrappers is really bad because users are likely to use the wrong wrapper.

When invariants can be checked, then a proliferation of wrappers is good, because it helps users understand which invariant exactly they need to relax, and therefore why a wrapper is needed in the first place. If multiple opt-outs are combined in the same feature, then that feature can end up perceived as a voodoo "sprinkle this to remove the UB" sort of thing… and then misapplied in situations where it actually doesn't remove the UB.

Aug 19 '23 00:08 comex

FWIW I don't have a strong preference between the RFC as-written, and just giving ManuallyDrop this special behavior. I think we certainly want ManuallyDrop to have this special behavior, so that a MaybeUninit<Box<_>> is allowed to dangle. (Or maybe we allow all Box to dangle.)

The pattern of having to move some data into a closure in an opaque way without knowing the exact type of the data and without making any promises about references outliving the entire closure does arise consistently, and MaybeDangling seems like a much clearer signal for such code than a ManuallyDrop + remembering to (unsafely) call drop.

If we want to for now just make ManuallyDrop special, adjust Miri accordingly, and start spreading that as the intended way of dealing with these situations -- I could live with that. Adding a stand-alone MaybeDangling without drop semantics in the future always remains a possibility.

Aug 19 '23 15:08 RalfJung

In a t-opsem meeting today, @digama0 made the interesting proposal of MaybeDangling being an attribute instead of a type. That could help fight the plethora of (non-commuting) wrapper types, and it also avoids losing all of the box magic on MaybeDangling<Box<T>> (that magic only works when having direct access to the field, but the MaybeDangling would be private, so we couldn't e.g. partially move out of it or disjointly borrow distinct fields).

I think I quite like that proposal, though it is a departure from how we have designed such features so far. UnsafeCell and UnsafeAliased/UnsafePinned were mentioned as other potential candidates for being an attribute.

Sep 12 '23 17:09 RalfJung

Nominating for t-lang discussion; I have incorporated the feedback from the design meeting.

EDIT: Oh, it already was nominated.

Oct 20 '23 06:10 RalfJung

rfcs rfcs copied to clipboard

MaybeDangling

rfcs
rfcs copied to clipboard