rust icon indicating copy to clipboard operation
rust copied to clipboard

Tracking Issue for `lazy_cell`

Open tgross35 opened this issue 1 year ago • 33 comments

This supercedes #74465 after a portion of once_cell was stabilized with #105587

Feature gate: #![feature(lazy_cell)]

This is a tracking issue for the LazyCell and LazyLock types, which are primatives for one-time initialization. One of the main goals is to be able to replace the lazy_static crate.

Public API

// core::cell (in core/src/cell/lazy.rs)

pub struct LazyCell<T, F = fn() -> T> { /* ... */ }

impl<T, F: FnOnce() -> T> LazyCell<T, F> {
    pub const fn new(init: F) -> LazyCell<T, F>;
    pub fn force(this: &LazyCell<T, F>) -> &T;
}

impl<T, F: FnOnce() -> T> Deref for LazyCell<T, F> {
    type Target = T;
}

impl<T: Default> Default for LazyCell<T>;
impl<T: fmt::Debug, F> fmt::Debug for LazyCell<T, F>;
// std::sync (in std/sync/lazy_lock.rs)

pub struct LazyLock<T, F = fn() -> T> { /* ... */ }

impl<T, F: FnOnce() -> T> LazyLock<T, F> {
    pub const fn new(f: F) -> LazyLock<T, F>;
    pub fn force(this: &LazyLock<T, F>) -> &T;
}

impl<T, F> Drop for LazyLock<T, F>;
impl<T, F: FnOnce() -> T> Deref for LazyLock<T, F> {
    type Target = T;
}
impl<T: Default> Default for LazyLock<T>;
impl<T: fmt::Debug, F> fmt::Debug for LazyLock<T, F>;

// We never create a `&F` from a `&LazyLock<T, F>` so it is fine
// to not impl `Sync` for `F`
unsafe impl<T: Sync + Send, F: Send> Sync for LazyLock<T, F>;
// auto-derived `Send` impl is OK.
impl<T: RefUnwindSafe + UnwindSafe, F: UnwindSafe> RefUnwindSafe for LazyLock<T, F>;
impl<T: UnwindSafe, F: UnwindSafe> UnwindSafe for LazyLock<T, F>;

Steps / History

  • [ ] Implementation: #72414
  • [ ] Final comment period (FCP)^1
  • [ ] Stabilization PR

Unresolved Questions

  • [ ] Is variance of Lazy correct? (See https://github.com/matklad/once_cell/issues/167)
  • [ ] Default F = fn() -> T in type signature (See https://github.com/rust-lang/rust/issues/109736#issuecomment-1489417892)

tgross35 avatar Mar 29 '23 19:03 tgross35

cc original authors @matklad and @KodrAus, just for reference

tgross35 avatar Mar 29 '23 21:03 tgross35

The biggest design question here is the default parameter: F = fn() -> T

  • It is a hack to make static MY_DATA: Lazy<MyType> = ... syntax work.
  • One can imagine static MY_DATA: Lazy<MyType, _> working one day, but at this point it seems more likely than not that we won't ever implement this kind of inference, and, even if we end up implementing something like that, it would be years in the future.
  • The hack works out nicely in 99% of the cases, but it can create confusion when using Lazy for a local variable:
let env = "hello".to_string();

let ok1 = Lazy::new(|| env);
let ok2: Lazy<String, _> = Lazy::new(|| env);

let err: Lazy<String> = Lazy::new(|| env);
// ^ The confusing case. The problem here is that type of `F` isn't inferred, 
/// but is taken from the default

matklad avatar Mar 29 '23 22:03 matklad

It is kind of weird to have the F be a part of the type definition at all. It makes sense of course as far as needing to store the function type in the struct, but it's a bit clunky for type signatures.

Is there a workaround using with raw function pointers maybe? Since the signature is always known, but I'm not sure how things like closure coercions & lifetimes would work here.

Or maybe dynamics could work. I haven't thought it totally through (and there might be lifetime trickiness) but at least the std LazyLock could store a Box<dyn FnOnce() -> T>

tgross35 avatar Mar 29 '23 23:03 tgross35

Boxing the contents would make it unusable for static values which would render it useless for most use cases.

Sp00ph avatar Mar 30 '23 01:03 Sp00ph

Yeah, that is a good point. And &dyn isn't very elegant.

Just spitballing here... taking a fn() -> T would eliminate the second parameter. This would unfortunately mean that it can't take environment-capturing closures, blocking the examples matklad pointed out. At least for LazyLock though, I can't really envision any use cases where this would be desired anyway. And it would allow for a nonbreaking upgrade to FnOnce in the future, if there is ever a better way.

pub struct TestLazyLock<T> {
    cell: OnceLock<T>,
    init: Cell<Option<fn() -> T>>,
}

impl<T> TestLazyLock<T> {
    const fn new(init: fn() -> T) -> Self {
        Self { cell: OnceLock::new(), init: Cell::new(Some(init)) }
    }
}

playground

tgross35 avatar Mar 30 '23 01:03 tgross35

I don't see how that would be preferable over just having the second type parameter with the fn() -> T default. Granted, the case mentioned by @matklad currently doesn't provide a very good diagnostic, but if the compiler just suggested to add the _ as the second type parameter then that point of confusion would probably also largely disappear.

Sp00ph avatar Mar 30 '23 01:03 Sp00ph

Regarding https://github.com/rust-lang/rust/pull/106152, does anyone know if the initialization state was explicitly excluded from the API? That was @Amanieu's concern.

SUPERCILEX avatar Mar 30 '23 02:03 SUPERCILEX

Yeah, it's not preferable. Just trying to see if there's any way where we could either

  1. not have that generic parameter, or
  2. make it so users can never write LazyCell<T, _> or LazyCell<T, fn() -> T> - so we could eventually drop the second parameter in a background-compatible way. I don't think this is possible via sealing or anything, but maybe there's a tricky way.

To quote @m-ou-se in https://github.com/rust-lang/rust/issues/74465#issuecomment-725462718

It's a bit of a shame that Lazy uses a fn() -> T by default. With that type, it needlessly stores a function pointer even if it is constant. Would it require big language changes to make it work without storing a function pointer (so, a closure as ZST), while still being as easy to use? Maybe if captureless closures would implement some kind of const Default? And some way to not have to name the full type in statics. That's probably not going to happen very soon, but it'd be a shame if this becomes possible and we can't improve Lazy because the fn() -> T version was already stabilized. Is there another way to do this?

I think that the form I suggested above with TestLazyLock<T> would be forward-compatible with either something like what Mara is suggesting, or with the current form (could use a sanity check here). It's not as useful as the current full featured version, but it does directly replace the purpose of lazy_static, which is kind of the biggest target of this feature. So in theory, that could be stabilized while a more full featured version is being contemplated.

tgross35 avatar Mar 30 '23 03:03 tgross35

Regarding https://github.com/rust-lang/rust/pull/106152, does anyone know if the initialization state was explicitly excluded from the API? That was @Amanieu's concern.

I am not super in the know for this, but I don't think there's any particular reason the state couldn't be exposed somehow. The state is known by the OnceCell, and would have to be tracked somehow even with a different underlying implementation.

tgross35 avatar Mar 30 '23 03:03 tgross35

make it so users can never write LazyCell<T, _> or LazyCell<T, fn() -> T> - so we could eventually drop the second parameter in a background-compatible way

I somewhat doubt it would ever be necessary to make such a change (the only edge case is rather contrived), but even then the way to drop the second parameter in a backward-compatible way would be to introduce a new API, deprecate this one, and upgrade everyone with cargo fix. I wouldn't stress about the existence of this parameter.

bstrie avatar Mar 30 '23 20:03 bstrie

If "impl trait" is allowed in let/static/const https://github.com/rust-lang/rust/issues/63065 then the additional parameter is not an issue anymore.

If the Lazy is initialized with a closure, then using "impl trait" in static would actually reduce size of the global variable by one function pointer.

NobodyXu avatar Mar 31 '23 00:03 NobodyXu

@NobodyXu I'm not too familar with that feature... would it also allow these impl definitions in structs? Or what would this look like? I'm imagining something like this, which would be quite cool

type LazyInitFn<T> =  impl FnOnce() -> T + Send + ?Sized;

pub struct LazyLock<T> {
    cell: OnceLock<T>,
    init: Cell<Option<LazyInitFn<T>>>,
}

impl<T> LazyLock<T> {
    const fn new(init: LazyInitFn<T>) -> Self { /* ... */ }
}

But I haven't seen any examples in the RFC that do this

tgross35 avatar Mar 31 '23 01:03 tgross35

@NobodyXu I'm not too familar with that feature... would it also allow these impl definitions in structs? Or what would this look like? I'm imagining something like this, which would be quite cool

Oh you are right, I missed that.

According to my understanding, it enables something like this:

static F: impl FnOnce() -> u32 = || 1;

I was thinking about:

static F: Lazy<impl FnOnce() -> u32> = Lazy::new(|| 1);

Which might not be covered by the tracking issue I linked.

NobodyXu avatar Mar 31 '23 01:03 NobodyXu

Added https://github.com/matklad/once_cell/issues/167 as an unresolved quesrion

matklad avatar Apr 02 '23 16:04 matklad

Is there a reason force is an associated function rather than taking &self? No specific comment, just curious.

tgross35 avatar Apr 03 '23 06:04 tgross35

I presume to avoid shadowing a .force() method on the inner value.

bjorn3 avatar Apr 03 '23 07:04 bjorn3

Makes sense. It does sort of remind me of the discussion on Mutex::unlock, but I do think it makes more sense here.

tgross35 avatar Apr 03 '23 07:04 tgross35

Unlike Mutex, LazyCell is newly added to std. Why will it still suffer from the possible shadowing?

slanterns avatar Apr 03 '23 07:04 slanterns

Bjorn meant that if whatever the LazyCell derefs to (T) has a .force method, then that would overlap with the LazyCells's own .force.

I just linked Mutex::unlock because it was recently decided against for just being a synonym for drop, and it has a similar signature to LazyCell::force - which is a synonym for deref. Just something to consider whether any arguments against unlock might apply here: I don't think they do, since force isn't trying to encourage a way around any usage patterns (unlock was sort of an escape hatch for RAII).

tgross35 avatar Apr 03 '23 08:04 tgross35

This works on nightly, if TAIT is enabled:

type F = impl FnOnce() -> ();
static L: LazyLock<(), F> = LazyLock::new(|| ());

nbdd0121 avatar Apr 06 '23 09:04 nbdd0121

The lazy_cell_consume feature was added as part of this tracking issue: https://github.com/rust-lang/rust/pull/106152. It is part of the Lazy* API, but does not have to be stabilized with the base feature set.

SUPERCILEX avatar Apr 24 '23 00:04 SUPERCILEX

On the topic of variance: Is it sound for T to be covariant in LazyCell<T, F>? I feel like it should be, since LazyCell<T, F> seems equivalent to &T wrt variance and &T is covariant.

Edit: This consideration is probably more relevant to OnceCell, since the variance of T in LazyCell<T, F> depends (as far as I can tell) on the variance of T in OnceCell<T>.

FeldrinH avatar May 22 '23 11:05 FeldrinH

Not sure where else to put this, so here goes: is there any reason std::cell::LazyCell does not have mutability-focused functions like force_mut that are in once_cell? What would it take to get those added in?

GregoryConrad avatar Jun 25 '23 00:06 GregoryConrad

Any blockers for the stabilization of the lazy_cell in std?

photino avatar Jul 09 '23 06:07 photino

The biggest design question here is the default parameter: F = fn() -> T

Though it's still an experiment, using dyn_star feature there's still potential for removing F from generics parameter list completely, replacing it with dyn* FnOnce() -> T

crlf0710 avatar Jul 15 '23 08:07 crlf0710

Same question as https://github.com/rust-lang/rust/issues/109736#issuecomment-1605787094 which didn't seem to get an answer. How come LazyCell doesn't implement DerefMut?

ia0 avatar Nov 24 '23 10:11 ia0

@ia0 adding stuff to std is a costly, high stakes process. For this reason:

  • additions usually start maximally minimal --- the smallest possible API surface is stabilized
  • subsequently, API additions are introduced, but they are done in piecemeal fashion, one thing at a time, and follow the FCP process.

This is in contrast to crates.io crates, which can generally add APIs comparatively willy-nilly, as there's always an option to release a new major version (or even to switch to a new crate altogether).

In this specific case, force_mut is a niche API, which is perhaps useful, but, given that the base type isn't stable yet, I don't think it's a priority for anyone at this point.

That being said, if someone feels strongly that the nightly type should have this API today, they could familiarize themselves with the feature lifecycle and start doing the work necessary to get that API in. I don't think there's a fundamental reason why std types should not have these APIs

https://std-dev-guide.rust-lang.org/development/feature-lifecycle.html

But keep in mind that the bottleneck resource here is the libs team decision making capacity: there's a lot of in-progress niche APIs, and the team has largish backlog.

matklad avatar Nov 24 '23 11:11 matklad

Thanks @matklad ! Good to know the feature makes sense and it's only a question of time and resource. I was also afraid it would be an incorrect feature (either unsound or not appropriate for the purpose of LazyCell).

Just to explain why I was surprised such an (obvious to me) feature is not present which may help any decision makers. I have a problem (which I'll describe below) and search for a solution in the standard library before implementing my own. I find LazyCell which name and documentation ("A value which is initialized on the first access.") seems to address my problem, only to realize it actually doesn't solve it.

Now to my problem. I have a function that looks like this:

let mut session = Session::connect("address")?; // costly
if should_foo {
    session.foo()?; // takes &mut self
}
if should_bar {
    session.bar()?; // takes &mut self
}

I would like to avoid creating the session if neither foo nor bar are necessary. Said otherwise, I'd like to make the session connection lazy.

Note that for my problem, it's very easy to implement something safe and tailored, that wouldn't be of any use in the standard library due to its lack of generality:

pub enum Lazy<T, F: FnOnce() -> Result<T>> {
    Uninit(F),
    Init(T),
    Empty,
}

impl<T, F: FnOnce() -> Result<T>> Lazy<T, F> {
    pub fn new(init: F) -> Self {
        Lazy::Uninit(init)
    }

    pub fn get(&mut self) -> Result<&mut T> {
        if let Lazy::Init(x) = self {
            return Ok(x);
        }
        match std::mem::replace(self, Lazy::Empty) {
            Lazy::Uninit(f) => *self = Lazy::Init(f()?),
            _ => unreachable!(),
        };
        self.get()
    }
}

I can now write:

let mut session = Lazy::new(|| Ok(Session::connect("address")?));
if should_foo {
    session.get()?.foo()?;
}
if should_bar {
    session.get()?.bar()?;
}

If LazyCell would implement DerefMut, a similar solution would work using LazyCell<Result<Session>> and .as_deref_mut()? on usage.

All that said, I understand the initial goal of LazyCell is different and those use-cases would benefit from a faster stabilization. This is completely fine because it should be possible to extend the current API to support the mutable use-case without breaking anything. In the meantime I'll just use my own solution, there's no rush.

Thanks again for your quick feedback and clarification!

ia0 avatar Nov 24 '23 13:11 ia0

I'd say this particular use-case should be served with Option, rather than a Lazy --- the entire shtick of Lazy/OnceCell is providing lazy semantics with just a &. If you already have &mut, or need a &mut access to the internals, the right primitive to use, in terms of appropriate internal moving parts, is an Option. Option also has lazy-orientied APIs, like

https://doc.rust-lang.org/stable/std/option/enum.Option.html#method.get_or_insert_with

though it doesn't work with ? yet.

Which might be the strongest argument for not including deref_mut --- while there are cases where it is genuinely useful (when you have both & and &mut access to the lazy at different points in time), most of the time the Option is the right conceptual answer.

matklad avatar Nov 24 '23 13:11 matklad

I see. So in the end it was actually a purpose mismatch (which is probably why it's called cell::LazyCell instead of lazy::Lazy).

But I would argue that Option is not the right conceptual answer either. You don't want to mention the initialization closure at each call site, only at the definition site. (In particular, you don't want it to be Clone defeating the FnOnce.) Right now you can't use the standard library for this problem. You need a custom enum with 3 variants (you could get rid of the sentinel variant by using replace_with or a similar crate until that ends up in the standard library too). But eventually I would expect something like that to end up in the standard library.

That said, I start to be less and less convinced that LazyCell should be under cell. The fact that there's interior mutability is an implementation detail, not a feature and thus not part of the API. The correct primitive should be lazy::Lazy which happens to have some &self methods actually mutating the object, but in a way transparent to the user. This type is not supposed to provide mutability through a shared reference, which would be what I would expect from a type under cell. In particular, Cell, RefCell, UnsafeCell, OnceCell, all provide mutability under shared reference at the user level. LazyCell does not. At the user level, the object is not mutable unless you have exclusive access. I'm in no position to recommend anything, but I can give my opinion which is that this type should move out of cell, be it a new lazy module or something else conveying the intent of deferring work until actually needed, which is a different concept than mutability with shared access only. I think this is actually quite important and should prevent stabilization of this feature as currently designed.

ia0 avatar Nov 24 '23 14:11 ia0