RFC: Unsafe Lifetime

Open maboesanman opened this issue 2 years ago • 40 comments

Introduce a new special lifetime 'unsafe which is outlived by all other lifetimes. Using a type through an 'unsafe reference, or a type instantiated with an 'unsafe lifetime parameter, is rarely possible without unsafe.

RENDERED

Nov 25 '21 01:11 maboesanman

Bikeshedding: it might require a new edition to make this work fully, but why not just call it 'unsafe if this is supposed to be an unsafe lifetime? Before a new edition we could just say that an explicitly defined 'unsafe lifetime would shadow this.

Nov 25 '21 03:11 clarfonthey

Bikeshedding: it might require a new edition to make this work fully, but why not just call it 'unsafe if this is supposed to be an unsafe lifetime? Before a new edition we could just say that an explicitly defined 'unsafe lifetime would shadow this.

I considered 'unsafe, but it isn't actually unsafe to have one of these references and store it. Using it is what breaks down to unsafe.

I picked '? because it wouldn't require an edition boundary (I think), especially since the next one is 2024. Are there any reserved lifetime names that this feature could claim?

Nov 25 '21 03:11 maboesanman

As I said, I think we can get away with it working without an edition bump if we just require that it not be defined in the lifetime parameters; i.e. pub struct Struct<'unsafe> would mean that 'unsafe refers to the user-defined lifetime, and pub struct Struct<> would mean that 'unsafe refers to the RFC-defined lifetime.

The main benefit of an edition bump is that it would become a compile error on the future edition to shadow the lifetime, just like how you can't define your own lifetime named 'static right now. Prior to an edition bump we could probably have a deny-by-default lint for it.

Personally, I think that using a keyword would be a bit more clear, as the '? syntax seems weird to me and too similar to '_. I personally think that calling it unsafe means that it's unsafe to dereference, not unsafe to have, but I guess that I'll defer to what everyone else thinks.

Nov 25 '21 03:11 clarfonthey

The problem is you could have a function with an unchecked lifetime in the signature, which would collide with an existing function that names its lifetime 'unchecked.

Nov 25 '21 03:11 maboesanman

see also #1918 (postponed) for the previous attempt at 'unsafe

Nov 25 '21 04:11 kennytm

reading through this I don't see why it isn't the same as '_ in all the explanations

Nov 25 '21 08:11 fbstj

reading through this I don't see why it isn't the same as '_ in all the explanations

You can't use '_ in a field of a struct, which is where most of the value of this comes from. In fact '_ outlives '? because '_ is always resolved to a lifetime.

Nov 25 '21 12:11 maboesanman

Bikeshedding: it might require a new edition to make this work fully, but why not just call it 'unsafe if this is supposed to be an unsafe lifetime? Before a new edition we could just say that an explicitly defined 'unsafe lifetime would shadow this.

Turns out keyword names aren't allowed in lifetimes anyway, so 'unsafe is just usable. I think I will edit the rfc to use 'unsafe instead, as it's reserved and you've convinced me it's clearer.

Nov 25 '21 12:11 maboesanman

see also #1918 (postponed) for the previous attempt at 'unsafe

I hadn't seen this rfc. I will go into some detail on this in prior art because it shares a lot with what I'm trying to do but I think this rfc is a little more precise in the approach.

Nov 25 '21 12:11 maboesanman

I'd assume niko's comment https://github.com/rust-lang/rfcs/pull/1918 still applies today, so anything like this should be many years away.

It breaks abstraction boundaries if people instantiate foreign types' lifetime bounds with 'unsafe, which becomes problematic.

Instead, we'd need a default bound 'a : 'safe or 'a : !'unsafe on every lifetime 'a, like the T: Sized default now, but then specific uses could opt out by writing explicitly larger bounds like where 'a : ?'safe or where 'a : 'unsafe or where 'a : ?!'unsafe, ala T: ?Sized now. And lifetime elision always forbids 'unsafe in particular.

We'd need the libs team to figure out which core/std components should be changed from the default bound of 'a : 'safe to the weaker 'a : ?'safe.

We could plausibly treat unsafe more like an adjective, ala 'unsafe a, yielding some local lifetimes that still obey some rules and never escape, not sure if this solves the underlying problem, but maybe it helps while avoiding the default bounds. I suppose #1918 helps answer this question.

Nov 25 '21 12:11 burdges

It breaks abstraction boundaries if people instantiate foreign types' lifetime bounds with 'unsafe, which becomes problematic.

This is the crucial difference between these two RFCs. 'unsafe cannot be used in place of another lifetime in function signatures because it is shorter than any lifetime, so no function that is expecting a normal lifetime can be called with 'unsafe instead. If you want to call that function you must transmute into a real lifetime, which is unsafe.
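To make the "convert to a real lifetime first" step concrete, here is a minimal sketch in today's Rust. It assumes a raw pointer as the closest existing stand-in for &'unsafe T; the names Value, foo, and assume_alive are hypothetical and not part of the RFC.

```rust
// Hypothetical stand-in: the closest thing to `&'unsafe T` today is a raw pointer.
struct Value(i32);

// An ordinary lifetime-generic function; it expects a reference with a real lifetime.
fn foo<'a>(value: &'a Value) -> i32 {
    value.0
}

/// Turns the stand-in into a reference with a caller-chosen lifetime.
///
/// # Safety
/// The pointee must be alive and not mutably aliased for all of `'a`.
unsafe fn assume_alive<'a, T>(ptr: *const T) -> &'a T {
    unsafe { &*ptr }
}

fn main() {
    let v = Value(7);
    let erased: *const Value = &v; // creating and storing the pointer is safe
    // foo(erased); // would not compile: `foo` wants `&'a Value`, not `*const Value`
    let r: &Value = unsafe { assume_alive(erased) }; // the unsafe step
    assert_eq!(foo(r), 7);
}
```

The only unsafe operation is the conversion itself; foo stays an ordinary safe function that never sees the erased form.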

Nov 25 '21 13:11 maboesanman

Interesting, I'd missed this aspect, thanks. If I understand, &'unsafe T could equally be called *aligned const T, no? What is the *mut T analog of &'unsafe mut T?

Nov 25 '21 14:11 burdges

It breaks abstraction boundaries if people instantiate foreign types' lifetime bounds with 'unsafe, which becomes problematic.

This is the crucial difference between these two RFCs. 'unsafe cannot be used in place of another lifetime in function signatures because it is shorter than any lifetime, so no function that is expecting a normal lifetime can be called with 'unsafe instead. If you want to call that function you must transmute into a real lifetime, which is unsafe.

I want to understand this. So let's say we have

fn foo<'a>(value: &'a Value) { ... }

By definition of generic lifetimes, this function should accept all lifetimes including 'unsafe? Correct? Then it should be able to be passed a value with 'unsafe lifetime.

But then, the normal form of functions

fn foo(value: &Value) { ... }

is just de-sugaring to the former.
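As a small illustration of the desugaring point in current Rust (the body and the Value type are made up for the example), the elided and explicit signatures coerce to the same function-pointer type:

```rust
struct Value(u32);

// Elided form: the compiler introduces a fresh lifetime parameter for `value`.
fn foo_elided(value: &Value) -> u32 {
    value.0
}

// Explicit form of the same signature.
fn foo_explicit<'a>(value: &'a Value) -> u32 {
    value.0
}

fn main() {
    let v = Value(3);
    // Both coerce to the same higher-ranked function pointer type,
    // `for<'a> fn(&'a Value) -> u32`, written here with elision.
    let a: fn(&Value) -> u32 = foo_elided;
    let b: fn(&Value) -> u32 = foo_explicit;
    assert_eq!(a(&v), 3);
    assert_eq!(b(&v), 3);
}
```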

Dec 02 '21 21:12 earthengine

By definition of generic lifetimes, this function should accept all lifetimes including 'unsafe? Correct? Then it should be able to be passed a value with 'unsafe lifetime.

because 'unsafe is shorter than any lifetime, a function which is generic over some lifetime parameter expects something which could be assigned a lifetime parameter.

'static >= 'a > 'unsafe

for all values of 'a. if instead we had 'a >= 'unsafe, you would be correct.
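For comparison, this is how the upper end of that ordering ('static >= 'a) behaves in today's Rust. The function name longer is only illustrative. Under the proposal as described here, 'unsafe would sit strictly below every such 'a, so it could never be chosen as the common lifetime in a call like this.

```rust
// Both inputs are unified at one common lifetime `'a`.
fn longer<'a>(x: &'a str, y: &'a str) -> &'a str {
    if x.len() >= y.len() { x } else { y }
}

fn main() {
    let local = String::from("short-lived");
    // The literal is `&'static str`; because `'static` outlives every `'a`,
    // it can be shortened to the lifetime of the borrow of `local`.
    let picked = longer("static text literal", &local);
    assert_eq!(picked, "static text literal");
}
```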

Dec 02 '21 21:12 maboesanman

@maboesanman I think it is becoming obvious that if the proposed unsafe lifetime landed, you'd spend a lot of effort explaining it to the community.

Just as you commented above that unsafe lifetime is shorter than any lifetime, then does it mean it is never "alive"?

Dec 02 '21 22:12 lebensterben

Just as you commented above that unsafe lifetime is shorter than any lifetime, then does it mean it is never "alive"?

It is never alive in the same way that *const T is never alive. It is up to the user to guarantee it is alive at the time of use. The purpose of this lifetime is to let you store types whose lifetimes you, the programmer, can reason about but the borrow checker cannot. It is intended as an advanced feature.

Dec 02 '21 23:12 maboesanman

This seems self-contradictory. The RFC has this example:

struct A<T> {
    item: T,
    borrower: B<'?, T> // we want the ref inside this to refer to item
}

struct B<'a, T> {
    actual_ref: &'a T
}

But you've said that 'unsafe is shorter than any arbitrary lifetime 'a, so even if 'unsafe were introduced, it wouldn't be usable in this position.

Dec 02 '21 23:12 Diggsey

it wouldn't be usable in this position.

It wouldn't be safely useable in that position. The idea is that you are providing a way to instantiate a lifetime-generic type for use with unsafe code.

It is just as usable as a *const T because neither do much without additional unsafe code.
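For reference, here is a rough sketch of how the A/B example tends to be written today without 'unsafe. It assumes the item is heap-allocated so its address is stable, and it replaces B<'a, T> with a hand-erased ErasedB<T> (a hypothetical name), which is exactly the duplication the RFC would like to avoid.

```rust
use std::ptr::NonNull;

// Mirrors `B<'a, T>` from the RFC example, with the lifetime erased by hand.
struct ErasedB<T> {
    actual_ref: NonNull<T>,
}

struct A<T> {
    item: NonNull<T>,     // owning pointer to a heap allocation, freed in `Drop`
    borrower: ErasedB<T>, // points at the same allocation
}

impl<T> A<T> {
    fn new(item: T) -> Self {
        // Leak a box so the value has a stable address that survives moves of `A`.
        let item = NonNull::from(Box::leak(Box::new(item)));
        A { item, borrower: ErasedB { actual_ref: item } }
    }

    fn item(&self) -> &T {
        // SAFETY: the allocation lives until `drop` runs.
        unsafe { self.item.as_ref() }
    }

    fn borrowed(&self) -> &T {
        // SAFETY (assumed invariant): `actual_ref` always points at `item`.
        unsafe { self.borrower.actual_ref.as_ref() }
    }
}

impl<T> Drop for A<T> {
    fn drop(&mut self) {
        // SAFETY: `item` came from `Box::leak` in `new` and is freed only here.
        unsafe { drop(Box::from_raw(self.item.as_ptr())) }
    }
}

fn main() {
    let a = A::new(String::from("hello"));
    let moved = a; // moving `A` does not move the heap allocation
    assert_eq!(moved.borrowed().as_str(), "hello");
    assert!(std::ptr::eq(moved.item(), moved.borrowed()));
}
```

With the proposed lifetime, borrower could presumably keep the original type B<'unsafe, T>, and the unsafe code would be confined to the accessor that hands out a properly scoped reference.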

Dec 03 '21 00:12 maboesanman

A potentially useful example where I wanted something like this: docs.rs/yoke technically uses 'static as a stand-in for 'unsafe: it needs to be able to talk about such lifetimes generically
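A hand-rolled sketch of the "'static as a stand-in" trick follows. It is not yoke's API; View, Owned, and get are hypothetical names, and the soundness argument in the comments is an assumption that holds only because the String's heap buffer is never mutated or replaced.

```rust
// Same shape as `B<'a, T>` in the RFC example, specialised to `str`.
struct View<'a> {
    first_word: &'a str,
}

// Hypothetical owner type illustrating the general trick only.
struct Owned {
    // Stand-in lifetime: this `'static` is a lie that must never leak out as-is.
    view: View<'static>,
    text: String,
}

impl Owned {
    fn new(text: String) -> Self {
        let view = View { first_word: text.split(' ').next().unwrap_or("") };
        // SAFETY (assumed): the view points into `text`'s heap buffer, which stays
        // in place when the `String` is moved and is never mutated afterwards.
        let view = unsafe { std::mem::transmute::<View<'_>, View<'static>>(view) };
        Owned { view, text }
    }

    // Re-attach a real lifetime: the view is only visible while `self` is borrowed.
    fn get<'a>(&'a self) -> &'a View<'a> {
        &self.view
    }

    fn text(&self) -> &str {
        &self.text
    }
}

fn main() {
    let owned = Owned::new(String::from("unsafe lifetime"));
    let moved = owned; // the heap buffer does not move
    assert_eq!(moved.get().first_word, "unsafe");
    assert_eq!(moved.text(), "unsafe lifetime");
}
```

The field type View<'static> overstates what is true, which is the wart a dedicated 'unsafe lifetime is meant to remove.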

Dec 03 '21 03:12 Manishearth

because 'unsafe is shorter than any lifetime, a function which is generic over some lifetime parameter expects something which could be assigned a lifetime parameter.

'static >= 'a > 'unsafe

for all values of 'a. if instead we had 'a >= 'unsafe, you would be correct.

So we would have to update the documentation to say that a for<'a> generic actually means "for all SAFE lifetimes", not "for ALL lifetimes". That is likely to confuse people learning the language.

Dec 03 '21 08:12 earthengine

I'm not convinced that this is a good idea, for the following reasons:

1. The documentation for &T states that “a reference is just a pointer that is assumed to be aligned, not null, and pointing to memory containing a valid value of T”. This will no longer be true if an 'unsafe lifetime is added.
2. It causes mental overhead, because it adds lots of edge cases:
   - References can always be safely dereferenced, except for &'unsafe T
   - for<'a> works with any lifetime except 'unsafe
   - A lifetime parameter of a function/impl/trait can be instantiated with any lifetime except 'unsafe
     (but instantiating a struct's lifetime parameter with 'unsafe is fine, apparently)
3. It makes Rust harder to learn and to teach.
4. I'm not convinced that it's the best solution for the problem.

To elaborate on my last point: the only use case mentioned in the RFC is self-referential structs. If these are the main focus, then a 'self lifetime could be considered as an alternative. Another alternative that the RFC should talk about is to "do nothing".

Dec 04 '21 03:12 Aloso

'unsafe is shorter than any lifetime

The RFC says that 'unsafe is shorter than any other lifetime. Is 'unsafe shorter than 'unsafe or not?

This shouldn't have an effect on behavior because of rule 2, but I think it matters in how we justify the behavior of unsafe lifetimes.

'unsafe cannot be used in place of another lifetime in function signatures because it is shorter than any lifetime, so no function that is expecting a normal lifetime can be called with 'unsafe instead. If you want to call that function you must transmute into a real lifetime, which is unsafe.

This looks to me like it's making the transmute function a special case or you wouldn't be able to call it on something with an unsafe lifetime. If that is the case, then the RFC should mention this. If not, then I wonder what you mean by "function that is expecting a normal lifetime".

Dec 05 '21 01:12 oskgo

I believe to make this work and be useful, it must be possible to opt into 'unsafe when declaring lifetime parameters:

// any lifetime except 'unsafe:
fn foo<'a>(x: &'a i32) {}

// any lifetime, including 'unsafe:
fn bar<'a: 'unsafe>(x: &'a i32) {
    // unsafe is needed here to dereference x!
}

foo::<'unsafe>(&42); // forbidden
bar::<'unsafe>(&42); // allowed

'a: 'unsafe, as in "'a outlives 'unsafe", is trivially true if 'unsafe: 'unsafe, so this bound would have to have a special meaning.

Dec 05 '21 02:12 Aloso

I believe to make this work and be useful, it must be possible to opt into 'unsafe when declaring lifetime parameters:

This is pretty similar to Sized. A normal generic type parameter is considered Sized unless you explicitly say T: ?Sized.
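For readers who have not used the opt-out, this is what the Sized precedent looks like in current Rust (function names are illustrative). The suggestion above would apply the same "implicit bound unless relaxed" shape to lifetimes.

```rust
use std::mem::size_of_val;

// Implicit `T: Sized` bound: unsized types such as `str` are rejected here.
fn sized_only<T>(x: &T) -> usize {
    size_of_val(x)
}

// Opting out with `?Sized` widens the set of accepted types.
fn maybe_unsized<T: ?Sized>(x: &T) -> usize {
    size_of_val(x)
}

fn main() {
    assert_eq!(sized_only(&0u32), 4);
    assert_eq!(maybe_unsized(&0u32), 4);
    assert_eq!(maybe_unsized("abc"), 3); // T = str
    assert_eq!(maybe_unsized(&[1u8, 2][..]), 2); // T = [u8]
    // sized_only("abc"); // would not compile: `str` is not `Sized`
}
```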

Dec 05 '21 04:12 earthengine

Going back to @burdges's comment here:

Interesting, I'd missed this aspect, thanks. If I understand, &'unsafe T could equally be called *aligned const T, no? What is the *mut T analog of &'unsafe mut T?

In terms of the expressive power that this brings to the type system, what is missing from the taxonomy of pointers is a guaranteed-aligned pointer type without a lifetime. As the last sentence of this RFC mentions, that would be quite useful for any data structure implementation or FFI code to declare at the type level that a pointer is "just" aligned (and maybe also non-null).

I can see the benefit of the restricted form 'self for references, but perhaps we should have *aligned const T/*aligned mut T or core::ptr::Aligned<T> for the general case. Is there a possibility of safe use of &'self T if it is more restricted than the current 'unsafe proposal?
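A library-level approximation of such a pointer can be sketched as a newtype. The Aligned<T> below is hypothetical (it is not an existing core or std type) and only shows the invariant being discussed: non-null and aligned, with no lifetime and no claim that the pointee is valid.

```rust
use std::mem::align_of;
use std::ptr::NonNull;

/// Hypothetical type: non-null and aligned for `T`, with no lifetime attached
/// and no guarantee that the pointee is alive or initialised.
struct Aligned<T> {
    ptr: NonNull<T>,
}

impl<T> Aligned<T> {
    /// Returns `None` if `ptr` is null or misaligned for `T`.
    fn new(ptr: *mut T) -> Option<Self> {
        let ptr = NonNull::new(ptr)?;
        if (ptr.as_ptr() as usize) % align_of::<T>() == 0 {
            Some(Aligned { ptr })
        } else {
            None
        }
    }

    /// # Safety
    /// The pointee must be alive, initialised, and not mutably aliased for `'a`.
    unsafe fn as_ref<'a>(&self) -> &'a T {
        unsafe { &*self.ptr.as_ptr() }
    }
}

fn main() {
    let mut x = 5u32;
    let p = Aligned::new(&mut x as *mut u32).expect("a `u32` local is aligned");
    assert_eq!(unsafe { *p.as_ref() }, 5);

    // One byte past an aligned `u32` address is misaligned for `u32`.
    let words = [0u32; 2];
    let odd = (words.as_ptr() as *const u8).wrapping_add(1) as *mut u32;
    assert!(Aligned::new(odd).is_none());
    assert!(Aligned::<u32>::new(std::ptr::null_mut()).is_none());
}
```

Constructing the wrapper is safe because it only inspects the address; dereferencing remains unsafe, which matches the split between storing and using discussed earlier in the thread.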

Dec 07 '21 16:12 ghost

for core::ptr::Aligned<T>, see also #3204

Dec 07 '21 18:12 programmerjake

I had originally posted these concerns on the Zulip, but that conversation has died down so I'll repost here

In my opinion, there's an aspect of this that is way under-specified and probably a massive issue, and that is the implications of this for type checking. Thinking in terms of the type system for a second, it's clear that 'unsafe can't actually mean "the shortest lifetime" because that would be incredibly unsound for contravariant lifetimes. Instead, it has to be some kind of non-lifetime that can be used in place of a lifetime, but isn't a lifetime at all. What does this mean for type checking though? Consider, for example:

trait A {
    type Assoc;
    
    fn get(&self) -> Self::Assoc;
}

impl<'a> A for &'a i32 {
    type Assoc = i32;

    fn get(&self) -> i32 {
        **self
    }
} 

struct S<'a> {
    v: <&'a i32 as A>::Assoc,
}

fn f(s: S<'unsafe>) {
    // what happens here?
}

How is the behavior of the type checker meant to change in the body of f? In the past, it would have been allowed to use S<'unsafe> being well-formed to conclude that <&'unsafe i32 as A>::Assoc is well-formed, and hence &'unsafe i32: A. But that's not the case! In other words, getting this kind of change through requires fundamentally changing the rules for type checking, at least around this 'unsafe lifetime, and exactly how that is to work needs to be 1) a part of the RFC, and 2) designed with extreme care to ensure safety guarantees are upheld.

To be clear: enforcing that lifetime generics in scope for functions cannot be 'unsafe is enough, as far as I can tell, to ensure the continued soundness of any existing code. What is not clear at all is how this should work in a way that doesn't lead the trait solver to make incorrect deductions.

Maybe the particular example above can be fixed by deciding that either the "S<'unsafe> well-formed implies <&'unsafe i32 as A>::Assoc is well formed" or the "<&'unsafe i32 as A>::Assoc is well formed implies &'unsafe i32: A" deductions are incorrect, but which one, and why? Furthermore, can you prove that this is enough in general? What are the side-effects?
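For contrast, here is the same example with an ordinary lifetime parameter, which compiles today. It shows the chain of deductions in question: for any real 'a the compiler finds the impl for &'a i32 and normalises the field type to i32. The return value and main are additions for the sake of a runnable sketch.

```rust
trait A {
    type Assoc;

    fn get(&self) -> Self::Assoc;
}

impl<'a> A for &'a i32 {
    type Assoc = i32;

    fn get(&self) -> i32 {
        **self
    }
}

struct S<'a> {
    v: <&'a i32 as A>::Assoc,
}

// With an ordinary `'a`, `S<'a>` being well-formed lets the body treat
// `<&'a i32 as A>::Assoc` as `i32` without any further proof obligations.
fn f<'a>(s: S<'a>) -> i32 {
    s.v
}

fn main() {
    let x: i32 = 10;
    assert_eq!((&x).get(), 10);
    assert_eq!(f(S { v: 32 }), 32);
}
```

Which of these steps would have to be rejected for S<'unsafe> is the open question raised above; this sketch does not answer it.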

I do think there's genuinely a good idea here, and that this kind of type would be useful even outside of unsafe code, but the right process would probably be to think more about the motivation and use cases, and then file a lang MCP so that the work to design the resulting type system correctly can be put in.

Dec 07 '21 19:12 JakobDegen

I believe to make this work and be useful, it must be possible to opt into 'unsafe when declaring lifetime parameters:

This is pretty similar to Sized. A normal generic type parameter is considered Sized unless you explicitly say T: ?Sized.

I think this can be used to address @JakobDegen's concerns, as well as clarify the discrepancy between lifetime parameters' bounds in impls/fns vs structs/enums. This also gets around the backwards compatibility requirement that the lifetime uses a keyword name.

add one new lifetime and two new lifetime bounds:

'unsafe

'a: '!unsafe and 'a: '?unsafe

'a: '!unsafe would be implicit on any lifetime parameter introduced by an impl block or by a function. A notable implication of this is that T<'unsafe> doesn't impl the traits that T<'a> does, unless:

'a: '?unsafe would be usable on lifetime parameters introduced on functions or impl blocks, opting out of the implicit bound above, allowing traits to be implemented for types instantiated with the unsafe lifetime.

the reverse is the case for types:

'a: '?unsafe would be implicit for all lifetime parameters defined in a struct or enum. If you want your struct to opt out of this behavior, you can use 'a: '!unsafe (it's not clear to me why this would be required, so possibly the '!unsafe bound could be avoided completely).

To avoid naming collisions, 'unsafe (or whatever it is actually called) could be allowed to be shadowed by lifetime parameters, with a warning.

A notable name suggestion from Zulip is 'erased ('a: '?erased, 'a: '!erased)

Dec 09 '21 17:12 maboesanman

@maboesanman what you are describing there is, to me at least, not new, and this is how I had been interpreting things already. (We may decide later that we don't actually want to allow people to specify non-default constraints, but that's a separate issue). The example I posted still has issues, since it shows one (but not all) ways to turn a lifetime generic on a type into a lifetime generic on an impl.

Dec 09 '21 17:12 JakobDegen

@JakobDegen the type of v is invalid because the lifetime 'a is '?unsafe but it is required to be '!unsafe in order for the <&'a i32 as A> coercion to work.

But your example proves that both struct and impl lifetime parameters need to be '!unsafe, which means only types which explicitly allow instantiating with the unsafe lifetime can be used, which is still useful.

Maybe the struct/enum explicit bound can be removed on an edition bump?

Dec 09 '21 18:12 maboesanman
